CS 57800: Statistical Machine Learning

Semester: Fall 2016, also offered on Fall 2020, Spring 2020, Spring 2018 and Fall 2017
Time and place: Tuesday and Thursday, 3.00pm-4.15pm, Seng-Liang Wang Hall 2599
Instructor: Jean Honorio, Lawson Building 2142-J (Please send an e-mail for appointments)
TAs: Chang Li, e-mail: li1873 at purdue.edu, Office hours: Monday, 11am-1pm, HAAS G50
Rohit Rangan, e-mail: rrangan at purdue.edu, Office hours: Friday, 3pm-5pm, HAAS G50

Machine learning offers a new paradigm of computing — computer systems that can learn to perform tasks by finding patterns in data, rather than by running code specifically written to accomplish the task by a human programmer. The most common machine-learning scenario requires a human teacher to annotate data (identify relevant phenomenon that occurs in the data), and use a machine-learning algorithm to generalize from these examples. Generalization is at the heart of machine learning — how can the machine go beyond the provided set of examples and make predictions about new data. In this class we will look into different machine learning scenarios, look into several algorithms analyze their performance and learn the theory behind them.

A tentative list of topics in supervised learning include: linear and non-linear classifiers, kernels, rating, ranking, collaborative filtering, model selection, complexity, generalization, structured prediction. A tentative list of topics in unsupervised learning and modeling include: mixture models, Bayesian networks, Markov random fields, factor graphs.

Learning Objectives

During the course, students will:

Prerequisites

This class requires some mathematical background. It's not a math class, however you should be comfortable with linear algebra, calculus, statistics and probability. Programming knowledge is also required.

Textbooks

There is no official text book for this class. I will post slides and pointers to reading materials. Recommended books for further reading include (* freely available online):

* The Elements of Statistical Learning: Data Mining, Inference, and Prediction by Trevor Hastie, Robert Tibshirani and Jerome Friedman.
* Understanding Machine Learning: From Theory to Algorithms by Shai Shalev-Shwartz and Shai Ben-David.
* A Course in Machine Learning by Hal Daumé III.
Pattern Classification, 2nd Edition by Richard O. Duda, Peter E. Hart, David G. Stork.
Pattern Recognition and Machine Learning by Christopher M. Bishop.
Machine Learning by Tom Mitchell.
Probabilistic Graphical Models by Daphne Koller and Nir Friedman.

Assignments

There will be up to five homeworks, one midterm exam, one final exam and one project (dates posted on the schedule). The homeworks are to be done individually and in MATLAB. The project can be done either individually or in groups of up to 3 students. (More people implies a higher expectation.)

For the project, you will write a half-page project plan (around 1-2 weeks before the midterm), a 2-4 page preliminary results report (around 1-2 weeks after the midterm) and a 4-8 page final results report (around 1-2 weeks before the final exam). The project should include: Neither I nor the TAs will provide any help regarding programming-related issues.

Grading

Homeworks: 25%
Midterm exam: 25%
Final exam: 25%
Project: 25%

Late policy

Assignments are to be submitted by the due date listed. Each person will be allowed seven days of extensions which can be applied to any combination of assignments during the semester. Use of a partial day will be counted as a full day. Extensions cannot be used after the final day of classes. Please, use the extension days wisely!

Assignments will NOT BE accepted if they are more than five days late.

Academic Honesty

Please read the departmental academic integrity policy here. This will be followed unless we provide written documentation of exceptions. We encourage you to interact amongst yourselves: you may discuss and obtain help with basic concepts covered in lectures and homework specification (but not solution). However, unless otherwise noted, work turned in should reflect your own efforts and knowledge. Sharing or copying solutions is unacceptable and could result in failure. You are expected to take reasonable precautions to prevent others from using your work.

Additional course policies

Please read the general course policies here.

Schedule

Date Topic (Tentative) Notes
Tue, Aug 23 Lecture 1: perceptron (introduction) Homework 0: due on Aug 25 at beginning of class - NO EXTENSION DAYS ALLOWED
Thu, Aug 25 Lecture 2: perceptron (convergence), max-margin classifiers, support vector machines (introduction) Homework 0 due - NO EXTENSION DAYS ALLOWED
Tue, Aug 30 Lecture 3: nonlinear feature mappings, kernels (introduction), kernel perceptron Homework 0 solution
Thu, Sep 1 Lecture 4: SVM with kernels, dual solution Homework 1: due on Sep 8, 11.59pm EST
Tue, Sep 6 Lecture 5: one-class problems (anomaly detection), one-class SVM, multi-way classification, direct multi-class SVM
Refs: [1] [2] [3] [4] (not mandatory to be read)
Thu, Sep 8 Lecture 6: rating (ordinal regression), PRank, ranking, rank SVM
Refs: [1] (not mandatory to be read)
Homework 1 due
Tue, Sep 13 Lecture 7: linear and kernel regression, feature selection (information ranking, regularization, subset selection)
Thu, Sep 15 Lecture 8: ensembles and boosting Homework 2: due on Sep 27, 11.59pm EST
Tue, Sep 20
Thu, Sep 22
Tue, Sep 27 Lecture 9: model selection (finite hypothesis class)
Refs: [1] (not mandatory to be read)
Homework 2 due
Thu, Sep 29 Lecture 10: model selection (growth function, VC dimension, PAC Bayesian bounds)
Notes: [1]
Project plan due (see Assignments for details)
Tue, Oct 4 Lecture 11: generative probabilistic modeling, maximum likelihood estimation, mixture models, EM algorithm (introduction)
Notes: [1]
Thu, Oct 6 Lecture 12: mixture models, EM algorithm, convergence, model selection
Notes: [1]
Tue, Oct 11 OCTOBER BREAK
Thu, Oct 13 Lecture 13: active learning, kernel regression, Gaussian processes
Refs: [1] (not mandatory to be read)
Tue, Oct 18 MIDTERM 3.00pm-4.15pm at Seng-Liang Wang Hall 2599
Thu, Oct 20     (midterm solution)
Tue, Oct 25 Lecture 14: collaborative filtering (matrix factorization), structured prediction (max-margin approach)
Refs: [1] (not mandatory to be read)
Thu, Oct 27     (lecture continues)
Tue, Nov 1 Lecture 15: performance measures, cross-validation, bias-variance tradeoff, statistical hypothesis testing Preliminary project report due (see Assignments for details)
Thu, Nov 3
Tue, Nov 8 Lecture 16: dimensionality reduction, principal component analysis (PCA), kernel PCA
Thu, Nov 10
Tue, Nov 15 Lecture 17: Bayesian networks (motivation, examples, graph, independence)
Refs: [1] [2] (not mandatory to be read)
Thu, Nov 17 Lecture 18: Bayesian networks (independence, equivalence, learning)
Refs: [1] [2] [3, chapters 16-20] (not mandatory to be read)
Homework 3: due on Nov 22, 11.59pm EST
Tue, Nov 22 Homework 3 due
Thu, Nov 24 THANKSGIVING VACATION
Tue, Nov 29 Lecture 19: Bayesian networks (introduction to inference), Markov random fields, factor graphs
Refs: [1] [2] (not mandatory to be read)
Thu, Dec 1 Lecture 20: Markov random fields (inference, learning)
Refs: [1] [2] [3, chapters 16-20] (not mandatory to be read)
Final project report due (see Assignments for details)
Tue, Dec 6 Lecture 21: Markov random fields (inference in general graphs, junction trees) Not mandatory, extra Homework 4 posted on Kaggle
Thu, Dec 8
Wed, Dec 14 FINAL EXAM 8.00am-9.30am at PHYS 223