Hands-On Learning Theory

Semester: 2025 Semester 2; also offered in 2025 Semester 1, 2024 Semester 2 and 2024 Semester 1
Time and place: The University of Melbourne, Mondays 2.30-5pm, Melbourne Connect 5210
(except Monday, October 13: 10.30am-1pm, Melbourne Connect 3101)
Organizer: Jean Honorio

Imagine you run your favorite machine learning algorithm and obtain impressive results. Some weeks later, you collect more data, and your algorithm no longer performs well. What happened? Is your algorithm wrong? Do you need to collect more data for learning? Should you use another algorithm?

In these meetings, we will analyze when and why an algorithm works, with respect to several aspects: how much training data it needs, how much computation it requires, and so on. We will learn the main tools for proving theorems and proposing new algorithms: proving their convergence, establishing the necessary and sufficient number of samples, and more.

Technically speaking, these meetings will focus mainly on the non-asymptotic analysis of the convergence and statistical efficiency of algorithms. We will introduce several concepts and proof techniques from statistical learning theory, information theory and optimization, including concentration bounds, empirical risk minimization, PAC-Bayes, Rademacher/Gaussian complexity, Karush-Kuhn-Tucker conditions, the primal-dual witness method, convergence rates, restricted strong convexity, Fano's inequality, and more.
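To give a flavour of the kind of non-asymptotic statement involved, the sketch below is an informal summary for illustration only (constants and conditions are stated precisely in the lectures): Hoeffding's inequality, and the guarantee it yields via a union bound for empirical risk minimization over a finite hypothesis class with losses in [0, 1].

% Notation: R(h) is the population risk, \hat{R}(h) the empirical risk over n i.i.d. samples,
% \hat{h} the empirical risk minimizer, H a finite hypothesis class, and \delta \in (0, 1) the failure probability.
% Hoeffding's inequality: for independent $X_1, \dots, X_n$ with $X_i \in [0, 1]$ and any $t > 0$,
\[
  \Pr\!\left( \left| \frac{1}{n}\sum_{i=1}^{n} X_i - \mathbb{E}\!\left[ \frac{1}{n}\sum_{i=1}^{n} X_i \right] \right| \ge t \right)
  \;\le\; 2\, e^{-2 n t^2} .
\]
% Applied to the empirical risk of each $h \in H$ and combined with a union bound over the $|H|$ hypotheses,
% this gives, with probability at least $1 - \delta$,
\[
  \sup_{h \in H} \left| R(h) - \hat{R}(h) \right| \;\le\; \sqrt{\frac{\log(2|H|/\delta)}{2n}},
  \qquad\text{hence}\qquad
  R(\hat{h}) \;\le\; \min_{h \in H} R(h) + 2\sqrt{\frac{\log(2|H|/\delta)}{2n}} .
\]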

Basic knowledge of calculus and linear algebra is required. Some knowledge of or experience with machine learning or data mining is welcome. The necessary concepts from statistics and optimization will be introduced during the meetings.
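In keeping with the hands-on spirit of the meetings, the short Python sketch below is a minimal illustration (not course material; it assumes only NumPy): it simulates repeated experiments of n fair coin flips and compares the observed deviation probability of the sample mean with the Hoeffding bound 2 exp(-2 n t^2) stated above.

# Minimal illustrative sketch (assumes NumPy): empirically check Hoeffding's
# inequality for the mean of n independent {0, 1}-valued (fair coin) variables.
import numpy as np

rng = np.random.default_rng(0)

n = 100           # sample size per experiment
t = 0.1           # deviation threshold
trials = 100_000  # number of repeated experiments

# Each row is one experiment of n fair coin flips, with values in {0, 1}.
flips = rng.integers(0, 2, size=(trials, n))
deviations = np.abs(flips.mean(axis=1) - 0.5)

empirical = np.mean(deviations >= t)  # observed P(|sample mean - 0.5| >= t)
bound = 2 * np.exp(-2 * n * t ** 2)   # Hoeffding's upper bound 2 exp(-2 n t^2)

print(f"observed deviation probability: {empirical:.4f}")
print(f"Hoeffding bound:                {bound:.4f}")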

Schedule

Date          Topic (Tentative)
Mon, Sep 15   Lecture 1: Markov's inequality, Chebyshev's inequality
              Lecture 2: Hoeffding's inequality, empirical risk minimization with a finite hypothesis class
Mon, Sep 22   Lecture 3: Fano's inequality, empirical risk minimization with a finite hypothesis class
              (lecture continues)
Mon, Sep 29   Lecture 4: probably approximately correct (PAC) Bayes, structured prediction
              (lecture continues)
Mon, Oct 6    Lecture 5: McDiarmid's inequality, sub-Gaussian random variables
              Lecture 6: Rademacher complexity, linear prediction
Mon, Oct 13   (lecture continues)
              Lecture 7: deterministic and stochastic optimization, convergence rates
Mon, Oct 20   (lecture continues)
              Lecture 8: restricted strong convexity, compressed sensing
Mon, Oct 27   (lecture continues)
              Lecture 9: primal-dual witness method, support recovery
Mon, Nov 3    (lecture continues)
              Lecture 10: growth function, Vapnik-Chervonenkis (VC) dimension, Sauer-Shelah lemma, Massart lemma
Mon, Nov 10   (lecture continues)
              Lecture 11: Le Cam's lemma