This course focuses on probability in high dimensions with a view toward applications in the data sciences. Topics include concentration of measure as applied to random vectors and matrices in high dimensions, random graphs, community detection, covariance estimation and clustering, randomized dimensionality reduction, quadratic forms, symmetrization and contraction, stochastic processes, the theory of empirical processes, chaining, and statistical learning.

Applications to machine learning, statistics and signal processing will be presented.

Time and Location: Monday and Wednesday, 10:30-11:50, VKC 256

Instructor: Larry Goldstein
Office Hours: Monday 12-1:30, Wednesday 9-10, KAP 406D

Main Text: High Dimensional Probability for Mathematicians and Data Scientists, by Roman Vershynin
Course content will be based on Chapters 2, 3, 4, 5, 7, 8 and 10.

Some supplementary references:

Concentration Inequalities: A Nonasymptotic Theory of Independence
Stephane Boucheron, Gabor Lugosi and Pascal Massart, Oxford University Press.

The Concentration of Measure Phenomenon
Michel Ledoux

Motivation

Course grades will be determined by:

10% Course participation (full attendance is expected)
90% Homework

## Homework Assignment

Please place homework under Xioahan’s office door, EEB-514, before the due date.

Chapter 2, problems 2.5.1, 2.5.4 (property 5, not 4, was intended), 2.5.5, 2.5.7 (first show that you may choose any other form of the norm here, due to the equivalence given by Proposition 2.5.2), 2.6.9 and 2.7.2, due 9/12
Chapter 3, problems 3.3.3, 3.3.5, 3.3.6, 3.5.3, 3.6.7, 3.7.5 and 3.7.6, due 9/26
Chapter 4, problems 4.2.10, 4.4.6, 4.5.2, 4.5.4, 4.7.3 and 4.7.6, due 10/10
Chapters 5 and 6, problems 5.4.12, 5.4.15, 5.6.6, 6.1.6, 6.3.4, 6.3.5, 6.5.4 and 6.6.5, due 10/31
Chapters 7 and 8, problems 7.1.8, 7.2.4 (consider using Fubini’s Theorem, and see how the required assumptions on f differ between that approach and integration by parts), 7.6.6, 8.2.7, 8.4.6 and 8.4.8 (you can take E(Y|X)=T(X)), due 11/28