This course focuses on probability in high dimensions with a view toward applications to data sciences, that is, to ‘big data.’ Topics include: Concentration of measure as applied to random vectors and matrices in high dimensions, random graphs, community detection, covariance estimation and clustering, randomized dimensionality reduction, symmetrization and contraction, stochastic processes, theory of empirical processes, chaining, statistical learning and sparse recovery problems.

Prerequisite: A rigorous course in probability theory at the Masters level or higher, some knowledge of stochastic processes, a good command of undergraduate linear algebra, and general familiarity with basic notions of metric, normed and Hilbert spaces and linear operators. Knowledge of measure theory is not essential. Students should be read and work the problems in the Preface and Chapter 1 of the textbook in preparation.

Instructors: Larry Goldstein and Yosi Rinott

Structure and Evaluation

  • Lectures Session July 2nd-13th 2018, Monday through Friday mornings, 9:45-10:45, 11:00-12:00
  • Lecture and Problem sessions Monday through Friday afternoons 3-5:15 PM
  • Midterm, 20% Saturday July 7th, afternoon session.
  • Final Exam, 30% Thursday July 12th, to be discussed Friday July 13th
  • Homework and course participation, 50%

Main Materials

Main Text: High Dimensional Probability for Mathematicians and Data Scientists, by Roman Vershynin [to download the book, select `Your Copy’ on the left side of the page]

The course will cover as many aspects of the following topics as time, and the student’s level of preparation, permit:

Chapter 2: Concentration of Sums of Independent Random Variables: Hoeffding, Chernoff and Bernstein inequalities, sub-Gaussian and sub-exponential distributions.

Chapter 3: Random Vectors in High Dimension: concentration of the norm, covariance matrices and principal component analysis, high dimensional distributions, sub-gaussian distributions in higher dimensions.

Chapter 4: Random Matrices: nets, covering and packing numbers, bounds on sub-Gaussian matrices, covariance estimation and clustering.

Chapter 5: Concentration of Lipschitz functions on the sphere, Johnson- Lindenstrauss theorem, matrix Bernstein inequality.

Chapter 6: Quadratic forms, symmetrization and contraction: decoupling, symmetrization, matrix completion, contraction principle.

Chapter 7: Random Processes: Basic concepts, Slepian’s inequality, bounds on Gaussian matrices, Sudakov’s minoration inequality, Gaussian width, random projections of sets.

Chapter 8: Chaining: Dudley’s inequality, empirical processes, VC dimension, statistical learning theory, chaining, Talagrand’s majorizing measure and comparison theorem.

Chapter 9: Deviations of random matrices and geometric consequences: Sub-gaussian increments of the random matrix process, matrix deviation inequality, bounds on random matrices and sizes of random projections, Johnson-Lindenstrauss Lemma for infinite sets, random sections: M∗ bound and escape theorem.

Chapter 10: Sparse Recovery: Recovery of sparse signals, low rank matrix recovery, the lasso.

Additional References

Some supplementary references:

Concentration Inequalities: A Nonasymptotic Theory of Independence” by Boucheron, Lugosi and Massart

The Concentration of measure phenomenon” by Michel Ledoux

“Mathematical Foundations of Infinite-Dimensional Statistical Models” by E. Giné and R. Nickl

Estimation in High Dimensions: a Geometric Perspective” by R. Vershynin

“Weak Convergence and Empirical Processes” by A. van der Vaart and Jon Wellner

“Probability in Banach Spaces” by M. Ledoux and M. Talagrand

“Statistics for High-Dimensional Data: Methods, Theory and Applications” by P. Bühlmann and S. van de Geer

The Elements of Statistical Learning” by T. Hastie. R. Tibshirani and Jerome Friedman

Understanding Machine Learning: From Theory to Algorithms’‘ by Shai Shalev-Shwartz and Shai Ben-David