Lecture Course (250108 VO): January 7 - 22, 2020
Time: Tuesday and Wednesday, 13:15 - 14:45
Start: Tuesday, January 7, 2020
End: Wednesday, January 22, 2020
Statistical learning theory plays a central role in modern data science, and the question we focus on in this course has been the key question in the area since the late 60s. To describe the problem, let F be a class of functions defined on a probability space (Ω, μ), and consider a random variable Y. The goal is to use random data to produce a function that is almost as close to Y as the best approximation to Y in F is.
Question: Given a class F, a distribution (X, Y), and a sample size N, what is the optimal tradeoff between the desired accuracy ε and the confidence 1 − δ? And what is the right choice of f̂ that attains the optimal tradeoff?
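One standard way to make the question precise is via the squared loss; this formalization (including the notation R, f* and f̂ below) is illustrative and not taken from the syllabus:

\[
  R(f) = \mathbb{E}\,\bigl(f(X) - Y\bigr)^2, \qquad f^* \in \operatorname*{argmin}_{f \in F} R(f),
\]

and a learning procedure, given the sample, returns \(\hat{f} = \hat{f}\bigl((X_1,Y_1),\dots,(X_N,Y_N)\bigr)\) satisfying

\[
  \mathbb{P}\Bigl( R(\hat{f}) \le R(f^*) + \varepsilon \Bigr) \ge 1 - \delta .
\]

The tradeoff in question is then between the smallest ε and the smallest δ attainable for the given F, (X, Y), and N.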
(1) Why is learning possible? The definition of a learning problem; what can we hope for; the quadratic and multiplier processes; complexity measures of classes of functions.
(2) The small-ball method and (some of) its applications.
(3) Median-of-means tournaments and the solution for convex classes.
(4) Complexity measures of classes revisited: chaining methods for Bernoulli and Gaussian processes; combinatorial dimension and metric entropy.
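To give a flavour of item (3): the tournaments build on the classical median-of-means estimator, which splits the data into blocks, averages each block, and takes the median of the block means. The sketch below shows only this basic estimator, not the tournament procedure itself (which compares functions pairwise and is more involved); the function name and block count are illustrative choices.

```python
import numpy as np

def median_of_means(samples, n_blocks):
    """Median-of-means estimate of a mean: partition the sample into
    n_blocks blocks of equal size, average each block, and return the
    median of the block means. Robust to heavy tails, unlike the
    empirical mean."""
    samples = np.asarray(samples, dtype=float)
    # Trim so every block has equal size (a simplifying choice).
    block_size = len(samples) // n_blocks
    trimmed = samples[: block_size * n_blocks]
    block_means = trimmed.reshape(n_blocks, block_size).mean(axis=1)
    return float(np.median(block_means))

rng = np.random.default_rng(0)
# Heavy-tailed data with mean 0: a Student-t distribution with 2.5
# degrees of freedom has finite variance but heavy tails, so the
# empirical mean fluctuates much more than the median-of-means estimate.
data = rng.standard_t(df=2.5, size=10_000)
print(median_of_means(data, n_blocks=50))
```

The point of the estimator is that it achieves sub-Gaussian-type confidence intervals under only a second-moment assumption, which is the kind of minimal-assumption guarantee the course is after.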
Prerequisites: The course requires graduate-level probability/measure theory and functional analysis, as well as some mathematical maturity. Most of the material I will cover can be found in the course’s lecture notes. Because of the nature of the course, some of the details will be left to the students.
The aim of this course is to show that this question has a strong geometric flavour, and to highlight some of the ideas from empirical process theory and asymptotic geometric analysis that have led to its solution under minimal assumptions on the class F and on (X, Y).