STAT 59800-HLT: Hands-On Learning Theory

Semester: Fall 2020 (also offered in Fall 2021, Fall 2019, Fall 2018, Fall 2017, Fall 2016, and Fall 2015)
Time and place: Tuesday and Thursday, 4:30pm-5:45pm EST
Instructor: Jean Honorio
Office hours: (Please send an e-mail for appointments)

Imagine you run your favorite machine learning algorithm and obtain impressive results. A few weeks later, you collect more data, and your algorithm no longer performs well. What happened? Is your algorithm wrong? Do you need to collect more data for learning? Should you use another algorithm?

In this course, we will analyze when an algorithm works and when it does not, with respect to several aspects: how much training data is needed, how much computation is required, and so on. We will learn the main tools for proving theorems and proposing new algorithms, establishing their convergence, and deriving the necessary/sufficient number of samples.

Technically speaking, this course will mainly focus on non-asymptotic analysis of the convergence and statistical efficiency of algorithms. We will introduce several concepts and proof techniques from statistical learning theory, information theory, and optimization. Topics include concentration bounds, empirical risk minimization, PAC-Bayes, Rademacher/Gaussian complexity, Karush-Kuhn-Tucker conditions, the primal-dual witness method, convergence rates, restricted strong convexity, and Fano's inequality, among others.
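
As a small, illustrative taste of this style of result (a standard textbook statement, not specific to this course's notation), Hoeffding's inequality says that for independent random variables X_1, ..., X_n taking values in [a, b], the empirical mean concentrates sharply around its expectation:

\[ \Pr\left( \left| \frac{1}{n}\sum_{i=1}^{n} X_i - \mathbb{E}\left[\frac{1}{n}\sum_{i=1}^{n} X_i\right] \right| \ge t \right) \le 2\exp\left( -\frac{2 n t^2}{(b-a)^2} \right). \]

Combined with a union bound over a finite hypothesis class \(\mathcal{H}\) (with losses in [0, 1]), this yields a typical sample complexity statement: with probability at least \(1 - \delta\), empirical risk minimization returns a hypothesis whose risk is within \(\epsilon\) of the best in \(\mathcal{H}\), whenever

\[ n \ge \frac{2}{\epsilon^2} \log\frac{2\,|\mathcal{H}|}{\delta}. \]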

In a typical course, students work on projects for which the answer is known. Students in Hands-On Learning Theory work on projects for which there is no known answer. Hands-On Learning Theory requires careful involvement in project selection, and several meetings (outside lecture times) in order to bring the projects to a level that is ready for conference submission (with some extra work after the semester ends). Hands-On Learning Theory has already produced 10 papers accepted at top conferences, and 9 papers currently under submission.

Learning Objectives

Upon completing the course, students should be able to use different techniques for proving the convergence of algorithms, deriving the necessary/sufficient number of samples for learning, and analyzing the statistical and computational efficiency of machine learning methods.

Prerequisites

Basic knowledge of calculus and linear algebra is required. Some knowledge of or experience with machine learning or data mining is welcome. Necessary concepts from statistics and optimization will be provided during the course.

Registration is by instructor's permission.

Assignments

There will be four homeworks, one paper review and one project (dates posted on the schedule). All assignments (homeworks, paper review and project) are to be done individually.

The homeworks will be on problems for which the answer is known. The project will be on a problem for which the answer is not known or only partially known.

For the paper review and the project, please pick a relevant topic according to your study goals. I encourage you to pick a paper and project on the same/similar topic. In case you do not have a topic, I can potentially assign you one. I encourage you to pick your topic as soon as possible.

For the paper review, you will write a 2-4 page review of a paper (around 1-2 weeks before the midterms). The goal is to learn to read technically demanding papers critically, and hopefully, in the process, generate novel research ideas. Your review should not only summarize the main result of the paper, but also critique it, instantiate it on examples, discuss its overall significance, and suggest possible future directions.

For the project, you will write a half-page project plan (at the beginning of the semester), a 2-4 page preliminary results report (around 1-2 weeks after the midterms), and a 4-8 page final results report (around the final exam date). The goal of the project is to apply some of the techniques learned in class to a particular machine learning problem. Remember, the goal of the course is to learn how to prove theorems regarding sample complexity. Proposing a small modification to a current algorithm that allows proving aspects of the algorithm's behavior is also welcome.

Grading

Homeworks: 30%
Paper review: 20%
Preliminary project report: 20%
Final project report: 30%

Late policy

Assignments are to be submitted by the due date listed. Each person will be allowed seven days of extensions, which can be applied to any combination of assignments during the semester. Use of a partial day will be counted as a full day. Extensions cannot be used after the final day of classes. Please use the extension days wisely!

Assignments will NOT be accepted if they are more than five days late.

Academic Honesty

Please read the departmental academic integrity policy here. This will be followed unless we provide written documentation of exceptions. We encourage you to interact amongst yourselves: you may discuss and obtain help with the basic concepts covered in lectures and with the homework specifications (but not solutions). However, unless otherwise noted, work turned in should reflect your own efforts and knowledge. Sharing or copying solutions is unacceptable and could result in failure. You are expected to take reasonable precautions to prevent others from using your work.

Additional course policies

Please read the general course policies here.

Schedule

Date Topic (Tentative) Notes
Tue, Aug 25 Lecture 1: Markov's inequality, Chebyshev's inequality Homework 0: due on Aug 27, 11:59pm EST - NO EXTENSION DAYS ALLOWED
Thu, Aug 27 Lecture 2: Hoeffding's inequality, empirical risk minimization with a finite hypothesis class Homework 0 due - NO EXTENSION DAYS ALLOWED
Tue, Sep 1 Lecture 3: Fano's inequality, empirical risk minimization with a finite hypothesis class Homework 0 solution
Thu, Sep 3     (lecture continues) Homework 1: due on Sep 10, 11:59pm EST
Tue, Sep 8 Lecture 4: probably approximately correct (PAC) Bayes, structured prediction
Thu, Sep 10     (lecture continues) Homework 1 due
Tue, Sep 15 Lecture 5: McDiarmid's inequality, sub-Gaussian random variables Homework 1 solution; Homework 2: due on Sep 22, 11:59pm EST
Thu, Sep 17     (lecture continues)
Tue, Sep 22 Lecture 9: primal-dual witness method, support recovery Homework 2 due
Thu, Sep 24     (lecture continues)
Tue, Sep 29 Lecture 6: Rademacher complexity, linear prediction Homework 2 solution
Thu, Oct 1     (lecture continues) Send me an e-mail with a link to the paper you plan to review
Tue, Oct 6 Homework 3: due on Oct 13, 11:59pm EST
Thu, Oct 8 Lecture 8: restricted strong convexity, compressed sensing
Tue, Oct 13     (lecture continues) Homework 3 due
Thu, Oct 15 Project plan due (see Assignments for details)
Tue, Oct 20 Lecture 7: deterministic and stochastic optimization, convergence rates Homework 3 solution
Thu, Oct 22     (lecture continues) Paper review due (see Assignments for details)
Tue, Oct 27 Paper presentations and project goals (20-25 minutes per person)
Thu, Oct 29     (presentations continue)
Tue, Nov 3 Lecture 11: Le Cam's lemma
Thu, Nov 5     (lecture continues) Preliminary project report due (see Assignments for details)
Tue, Nov 10 Lecture 10: growth function, Vapnik-Chervonenkis (VC) dimension, Sauer-Shelah lemma, Massart lemma
Thu, Nov 12     (lecture continues)
Tue, Nov 17 Project presentations (30-35 minutes per person)
Thu, Nov 19     (presentations continue)
Tue, Nov 24     (presentations continue)
Thu, Nov 26 THANKSGIVING VACATION
Thu, Dec 3 Final project report due (see Assignments for details)