Hands-On Learning Theory

Semester: 2025 Semester 2; also offered in 2025 Semester 1, 2024 Semester 2, and 2024 Semester 1
Time and place: The University of Melbourne, Mondays 2.30-5pm, Melbourne Connect 5210
(except Monday October 13, 10.30am-1pm, Melbourne Connect 3101)
Organizer: Jean Honorio

Imagine you run your favorite machine learning algorithm and obtain impressive results. Some weeks later, you collect more data and the algorithm no longer performs well. What happened? Is your algorithm wrong? Do you need to collect more data for learning? Should you use another algorithm?

In these meetings, we will analyze when an algorithm works and when it does not, with respect to aspects such as how much training data and how much computation it needs. We will learn the main tools for proving theorems and for proposing new algorithms: establishing their convergence, the necessary/sufficient number of samples, and so on.

Technically speaking, these meetings will mainly focus on the non-asymptotic analysis of the convergence and statistical efficiency of algorithms. We will introduce several concepts and proof techniques from statistical learning theory, information theory and optimization. The meetings will cover topics such as concentration bounds, empirical risk minimization, PAC-Bayes, Rademacher/Gaussian complexity, Karush-Kuhn-Tucker conditions, the primal-dual witness method, convergence rates, restricted strong convexity, and Fano's inequality.
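
To give a flavour of the statements we will prove, here is a standard textbook example, written only as an illustration and not necessarily in the exact form used in the lectures: Hoeffding's inequality, combined with a union bound over a finite hypothesis class, already yields a non-asymptotic guarantee for empirical risk minimization.

    % Hoeffding's inequality for i.i.d. X_1, ..., X_n with values in [0, 1]:
    \Pr\big( \bar{X}_n - \mathbb{E}[\bar{X}_n] \ge t \big) \le \exp(-2 n t^2)

    % With a union bound over a finite class H (losses in [0, 1]), with
    % probability at least 1 - \delta, simultaneously for all h in H:
    \big| \hat{R}_n(h) - R(h) \big| \le \sqrt{ \frac{\log(2 |\mathcal{H}| / \delta)}{2 n} }

    % Hence the empirical risk minimizer \hat{h} satisfies
    R(\hat{h}) \le \min_{h \in \mathcal{H}} R(h) + 2 \sqrt{ \frac{\log(2 |\mathcal{H}| / \delta)}{2 n} }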

Basic knowledge of calculus and linear algebra is required. Some knowledge of or experience with machine learning or data mining is welcome. The necessary concepts from statistics and optimization will be provided during the meetings.
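
Since the meetings are hands-on, it often helps to check a bound numerically before proving it. The following is a minimal, self-contained Python sketch (not part of the course material) that compares the empirical tail probability of a Bernoulli sample mean with the two-sided Hoeffding bound 2 exp(-2 n t^2); the sample size n, accuracy t and number of trials are arbitrary choices for illustration.

    # Illustrative sketch: empirical tail probability of a Bernoulli sample mean
    # versus the two-sided Hoeffding bound 2 * exp(-2 * n * t^2).
    import numpy as np

    rng = np.random.default_rng(0)
    n, p, t, trials = 100, 0.5, 0.1, 20000   # arbitrary illustrative values

    # Each trial draws n Bernoulli(p) samples; the binomial count divided by n
    # is the sample mean for that trial.
    means = rng.binomial(n, p, size=trials) / n
    empirical_tail = np.mean(np.abs(means - p) >= t)
    hoeffding_bound = 2 * np.exp(-2 * n * t ** 2)

    print(f"empirical P(|mean - p| >= {t}): {empirical_tail:.4f}")
    print(f"two-sided Hoeffding bound:      {hoeffding_bound:.4f}")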

Schedule

Date          Topic (Tentative)                                                                                     Notes
Mon, Sep 15   Lecture 1: Markov's inequality, Chebyshev's inequality
              Lecture 2: Hoeffding's inequality, empirical risk minimization with a finite hypothesis class
Mon, Sep 22   Lecture 3: Fano's inequality, empirical risk minimization with a finite hypothesis class
              (lecture continues)
Mon, Sep 29   Lecture 4: probably approximately correct (PAC) Bayes, structured prediction
              (lecture continues)
Mon, Oct 6    Lecture 5: McDiarmid's inequality, sub-Gaussian random variables
              Lecture 11: Le Cam's lemma
Mon, Oct 13   Lecture 6: Rademacher complexity, linear prediction
              (lecture continues)
Mon, Oct 20   Lecture 7: deterministic and stochastic optimization, convergence rates
              (lecture continues)
Mon, Oct 27   Lecture 8: restricted strong convexity, compressed sensing
              (lecture continues)
Mon, Nov 3    Lecture 9: primal-dual witness method, support recovery
              (lecture continues)
Mon, Nov 10   Lecture 10: growth function, Vapnik-Chervonenkis (VC) dimension, Sauer-Shelah lemma, Massart lemma
              (lecture continues)