Syllabus

Prerequisite

R or Python programing, Linear algebra

Basic knowledge of C language (or interfaces to it, such as Rcpp and Cython) is useful, but easy enough to pick up as needed.

Grading

Class participation, Homework, and (potentially) Project

Course topics (tentative)

Numerical stability & finite precision arithmetic through examples

  • Finite difference approximation of gradient
    • Example: unit-test your log-likelihood and gradient derivation — how close the numerical gradient can be to the exact one?
  • Dealing with ill-conditioned matrices
    • Example: Gaussian process regression, sparse regression, etc — what to do when Cholesky fails (SVD, preconditioning, or what?).
  • Intro to iterative methods & numerical issues to watch out for
    • Example: principal component analysis using inverse covariance matrix — re-orthogonalized Lanczos algorithm for finding the smallest eigenvalue / eigenvector.

Algorithm performance on modern hardware

  • Cost of arithmetic operations vs data access: why a count of floating point operations may be irrelevant
  • A bit of assembly code
  • Multi-tiered memory caches
  • Vectorization & data locality
  • Examples:
    • Performance comparison between different linear algebra libraries
    • Sparse vs dense matrix operations
    • Masking to avoid branching in for-loop
  • Parallel computing:
    • Shared memory vs. embarrassing parallelism
    • Instruction-level parallelism
    • Example: tale of laptop beating a cluster with 128 CPUs
  • Other issues: low precision computing, branch prediction, overheating, etc

Data structures in statistical computing

  • Sparse matrices: storage formats and their computational properties
    • Example: coordinate descent
  • Preprocessing design matrices
    • Example: Cox model, propensity score matched / stratified regression

Algorithms for large scale data (if time allows)

  • Stochastic gradient descent
    • Robbins-Monro algorithm
    • Momentum acceleration
  • Hamiltonian Monte Carlo

Statistical software designs (if enough interest from audience)

  • General purpose vs focused
    • Example: Stan vs glmnet (or BayesBridge)
  • Start small and keep it extensible / maintainable
    • Example: building Bayesian hierarchical model and Gibbs sampler one step at a time
  • Tuning algorithm parameters
    • Example: stochastic optimization of Metropolis acceptance rate
  • Cross-platform portability
  • User-friendly features

Essentials for statistical software developer / contributor (if enough interest from audience)

  • Software licensing
    • Permissive vs copy left / viral
  • Tools of the trade:
    • Visual diff
    • Navigating codebase: find and grep (in addition to good IDE)
    • Git with its full capability unleashed: rebase, log –follow, blame, tag, etc