Semester:  Spring 2021, also offered in Fall 2019 and Fall 2018 
Time and place:  Monday, Wednesday and Friday, 10.30am-11.20am EST 
Instructor:  Jean Honorio, Lawson Building 2142J (Please send an email for appointments) 
TAs: 
Kevin Bello, email: kbellome at purdue.edu, Office hours: Monday 1pm-3pm EST
Prerit Gupta, email: gupta596 at purdue.edu, Office hours: Friday 2pm-4pm EST
Chuyang Ke, email: cke at purdue.edu, Office hours: Tuesday 2pm-4pm EST
Jin Son, email: son74 at purdue.edu, Office hours: Thursday 3pm-5pm EST
Anxhelo Xhebraj, email: axhebraj at purdue.edu, Office hours: Wednesday noon-2pm EST
Kaiyuan Zhang, email: zhan4057 at purdue.edu, Office hours: Tuesday 10am-noon EST
Date  Topic (Tentative)  Notes 
Wed, Jan 20  Lecture 1: introduction  Python 
Fri, Jan 22  Lecture 2: probability review (joint, marginal and conditional probabilities)  
Mon, Jan 25  (lecture continues) Lecture 3: statistics review (independence, maximum likelihood estimation)  
Wed, Jan 27  (lecture continues)  
Fri, Jan 29  Lecture 4: linear algebra review  Linear algebra in Python; Homework 1: due on Feb 5, 11.59pm EST
Mon, Feb 1  Lecture 5: elements of data mining and machine learning algorithms  
Wed, Feb 3  (lecture continues) Lecture 6: linear classification, perceptron  
Fri, Feb 5  —  Homework 1 due 
Mon, Feb 8  (lecture continues)  
Wed, Feb 10  (lecture continues) Lecture 7: perceptron (convergence), support vector machines (introduction)  
Fri, Feb 12  (lecture continues)  Homework 2: due on Feb 19, 11.59pm EST 
Mon, Feb 15  (lecture continues)  
Wed, Feb 17  READING DAY  
Fri, Feb 19  Lecture 8: generative probabilistic modeling, maximum likelihood estimation, classification  Homework 2 due 
Mon, Feb 22  (lecture continues) Lecture 9: generative probabilistic classification (naive Bayes), nonparametric methods (nearest neighbors)  Homework 3: due on Mar 1, 11.59pm EST
Wed, Feb 24  (lecture continues) Lecture 10: nonparametric methods (classification trees)  
Fri, Feb 26  (lecture continues)  
Mon, Mar 1  Case Study 1  Homework 3 due 
Wed, Mar 3  (lecture continues) Lecture 11: performance measures, cross-validation, statistical hypothesis testing  
Fri, Mar 5  (lecture continues) Lecture 12: model selection and generalization (VC dimension)  Homework 4: due on Mar 12, 11.59pm EST
Mon, Mar 8  (lecture continues)  
Wed, Mar 10  Case Study 2  
Fri, Mar 12  (lecture continues) Lecture 13: dimensionality reduction, principal component analysis (PCA)  Homework 4 due
Mon, Mar 15  (lecture continues)  
Wed, Mar 17  MIDTERM (lectures 1 to 12, all case studies)  Start: Wednesday March 17, 10.30am EST; End: Thursday March 18, 10.30am EST
Fri, Mar 19  (lecture continues) (midterm solution)  Homework 5: due on Mar 26, 11.59pm EST
Mon, Mar 22  Lecture 14: nonlinear feature mappings, kernels, kernel perceptron, kernel support vector machines  
Wed, Mar 24  (lecture continues)  
Fri, Mar 26  Lecture 15: ensemble methods: bagging, boosting, bias/variance tradeoff  Homework 5 due; Homework 6: due on Apr 2, 11.59pm EST
Mon, Mar 29  (lecture continues) Case Study 3  Project plan due ([Word] or [Latex] format)
Wed, Mar 31  (lecture continues)  
Fri, Apr 2  —  Homework 6 due 
Mon, Apr 5  Lecture 16: clustering, k-means, hierarchical clustering  Homework 7: due on Apr 12, 11.59pm EST 
Wed, Apr 7  (lecture continues) Lecture 17: clustering, mixture models, expectation-maximization (EM) algorithm  
Fri, Apr 9  (lecture continues) Lecture 18: anomaly detection, one-class support vector machines  
Mon, Apr 12  (lecture continues)  Homework 7 due 
Wed, Apr 14  Lecture 19: Bayesian networks (independence)  
Fri, Apr 16  (lecture continues) Lecture 20: pattern discovery, association rules, frequent itemsets  Preliminary project report, due on Apr 16, 11.59pm EST
Mon, Apr 19  (lecture continues) Lecture 21: feature selection (univariate/multivariate, filter/wrapper/embedded methods, L1-norm regularization)  
Wed, Apr 21  (lecture continues)  
Fri, Apr 23  (lecture continues) Lecture 22: data quality, preprocessing, visualization, distances  
Mon, Apr 26  FINAL EXAM (lectures 13 to 21, all case studies)  Start: Monday April 26, 10.30am EST; End: Tuesday April 27, 10.30am EST
Wed, Apr 28  (lecture continues)  
Fri, Apr 30  (final exam solution)  Final project report, due on Apr 30, 11.59pm EST 