Adaptivity of deep learning: Efficiency of function estimation and optimization guarantee from nonconvexity view-point

Taiji Suzuki (U Tokyo)

Jun 03. 2022, 10:00 — 10:45

In this talk, I discuss how deep learning can statistically outperform shallow methods such as kernel ridge regression. First, I will discuss excess risk bounds for deep learning over function classes such as Besov spaces and show that sparsity and the non-convex geometry of the target function class play an essential role in characterizing the superiority of deep learning. In particular, it is shown that deep learning can attain better performance for high-dimensional (or infinite-dimensional) inputs. In the latter half, I discuss the optimization of neural networks and its impact on statistical performance. I will consider some optimization methods in a mean-field regime based on gradient Langevin dynamics and show that they can reach globally optimal solutions with convergence rate guarantees. It is shown that optimization in the mean-field regime yields adaptivity and statistical superiority of deep learning compared with linear estimators.

Further Information

Venue: ESI Boltzmann Lecture Hall
Recordings: Recording
Associated Event: Computational Uncertainty Quantification: Mathematical Foundations, Methodology & Data (Thematic Programme)
Organizer(s): Clemens Heitzinger (TU Vienna)
Fabio Nobile (EPFL, Lausanne)
Robert Scheichl (U Heidelberg)
Christoph Schwab (ETH Zurich)
Sara van de Geer (ETH Zurich)
Karen Willcox (U of Texas, Austin)