Purdue University - Department of Computer Science - CS 37300 Data Mining and Machine Learning

CS 37300 Data Mining and Machine Learning

Course Description

This course introduces students to the field of data mining and machine learning, which sits at the interface between statistics and computer science. The field focuses on developing algorithms that automatically discover patterns in large datasets and learn models from them. The course covers the overall process and the main techniques, including exploratory data analysis, predictive modeling, descriptive modeling, and evaluation.

Course Outline

Introduction (1 week)

What is data mining? What is machine learning? Overview of the process and associated tasks. Example applications.

Background and Basics (1 week)

Types of data: attributes, instances. Populations and samples. Random variables and distributions. R and Python.

Exploratory Data Analysis (2 weeks)

Data cleaning and preprocessing. Sampling. Feature construction and discovery. Visualization methods. Hypothesis testing.

Predictive Modeling (3 weeks)

Classification problem formulation. Algorithmic elements: representation, scoring functions, search, inference. Overview of basic algorithms (e.g., naive Bayes, decision trees, nearest neighbor). Evaluation: metrics, cross-validation, learning curves.
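
As an illustration of the evaluation topic above, k-fold cross-validation of a nearest neighbor classifier might be sketched in Python as follows. The function names and the toy data are assumptions made for this example, not course materials, and the sketch uses 1-nearest-neighbor with Euclidean distance for simplicity.

```python
import math

def nearest_neighbor_predict(train, query):
    """Return the label of the training point closest to query (1-NN)."""
    closest = min(train, key=lambda pair: math.dist(pair[0], query))
    return closest[1]

def cross_validate(data, k):
    """Estimate 1-NN accuracy with k-fold cross-validation."""
    # Interleaved folds: item i goes to fold i mod k.
    folds = [data[i::k] for i in range(k)]
    correct = 0
    for i, held_out in enumerate(folds):
        # Train on every fold except the held-out one.
        train = [pair for j, fold in enumerate(folds) if j != i for pair in fold]
        correct += sum(nearest_neighbor_predict(train, x) == y for x, y in held_out)
    return correct / len(data)

# Hypothetical toy data: two well-separated groups in the plane.
toy_data = [
    ((0.0, 0.0), "a"), ((5.0, 5.0), "b"),
    ((0.0, 1.0), "a"), ((5.0, 6.0), "b"),
    ((1.0, 0.0), "a"), ((6.0, 5.0), "b"),
]
accuracy = cross_validate(toy_data, k=3)
print(accuracy)
```

Each instance is used for testing exactly once and for training in the remaining folds, so the reported accuracy is an estimate of performance on unseen data rather than on the training set.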

Understanding and Extending Model Performance (1 week)

Error analysis. Feature selection. Ensemble techniques.

Descriptive Modeling (3 weeks)

Clustering problem formulation. Algorithmic elements: representation, scoring functions, search, inference. Overview of basic algorithms (e.g., k-means, hierarchical clustering, spectral clustering). Evaluation: metrics, subjective assessment.
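
To make the k-means item above concrete, a minimal Python sketch of the standard alternating procedure (often called Lloyd's algorithm) could look like this. The function name, the fixed iteration count, and the toy points are assumptions for illustration. Initial centroids are passed in explicitly so the run is deterministic.

```python
import math

def k_means(points, centroids, iterations=10):
    """Alternate between assigning points and updating centroids."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical toy data with two obvious groups.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
final_centroids, final_clusters = k_means(points, [(0.0, 0.0), (10.0, 10.0)])
print(final_centroids)
```

In terms of the algorithmic elements listed above, the centroids are the representation, the sum of squared distances to the nearest centroid is the scoring function, and the two alternating steps are the search.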

Pattern Mining (2 weeks)

Pattern detection formulation. Algorithmic elements: representation, scoring functions, search, inference. Overview of basic algorithms (e.g., association rules, anomaly detection). Evaluation: metrics, interestingness, understandability.
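
For association rules, two commonly used metrics are support and confidence. The following Python sketch computes both for a single rule. The function names and the basket data are hypothetical and chosen only for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Support of the combined itemset divided by support of the antecedent."""
    combined = set(antecedent) | set(consequent)
    return support(transactions, combined) / support(transactions, antecedent)

# Hypothetical market-basket data: each transaction is a set of items.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
rule_support = support(baskets, {"bread", "milk"})
rule_confidence = confidence(baskets, {"bread"}, {"milk"})
print(rule_support, rule_confidence)
```

Here the rule "bread implies milk" holds in two of the four baskets (support) and in two of the three baskets that contain bread (confidence).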

Last Updated: Feb 15, 2019 4:20 PM

Department of Computer Science, 305 N. University Street, West Lafayette, IN 47907

Phone: (765) 494-6010 • Fax: (765) 494-0739

Copyright © 2020 Purdue University | An equal access/equal opportunity university
