With recent advances in computing power and the availability of massive data sets, the tools of statistics, data science, and analytics have become indispensable in the applied sciences and in industry. This course focuses on the mathematical foundations of modern data analysis and statistics.

The content of the course may depend in part on the background, preparation, and interests of the students, so the list of topics below is somewhat flexible.

  • Parametric models, linear models, variable selection, the lasso and matrix completion
  • Estimation, criteria and construction of estimators, maximum likelihood, asymptotics
  • Nonparametric models, empirical distribution function, bootstrap
  • Hypothesis testing, multiple hypothesis testing, family-wise error rate, false discovery rate
  • Density and regression estimation, regularization and smoothing
  • Classification, discriminant analysis, Vapnik-Chervonenkis (VC) dimension

Course Prerequisites: Students should have taken at least one solid course in probability and have some background in basic statistics. It will be assumed that students are familiar with the first five chapters of the course text, All of Statistics: A Concise Course in Statistical Inference, by Larry Wasserman. Students should review these chapters and study any material that is new to them before the course begins.

It is also strongly recommended that students read Chapter 6 of the textbook, which consists mostly of material covered in first-year statistics courses (e.g., confidence intervals, hypothesis testing).

Instructor: Larry Goldstein


Structure and Evaluation

Course participation: 35%; final exam: 65% (Tuesday, August 13, 13:00-15:00)

Course Text

All of Statistics: A Concise Course in Statistical Inference, by Larry Wasserman.
