CS 57700: Natural Language Processing - Department of Computer Science - Purdue University

CS 57700: Natural Language Processing

Course Description:

This course covers the key concepts and methods used in modern Natural Language Processing (NLP). Throughout the course, several core NLP tasks will be discussed, such as sentiment analysis, information extraction, and syntactic and semantic analysis. The course emphasizes machine-learning and data-driven algorithms and techniques, and compares several approaches to these problems in terms of their performance, supervision effort, and computational complexity.

Course Outline:

Introduction (1 week)

What is natural language processing? Overview of natural language processing applications, computational linguistics and machine learning.

Language modeling (1 week)

Probability review. Language modeling, smoothing, evaluation. Applications of language models.

Text classification (2 weeks)

Generative and discriminative classification models. Naïve Bayes, perceptron, log-linear models, large margin classification, multiclass classification, ranking, hierarchical classification. Applications: sentiment analysis, text categorization.

Introduction to Sequence prediction (2 weeks)

Hidden Markov models, the Viterbi algorithm. Discriminative models for sequence prediction. Local vs. global training protocols (MEMM vs. CRF). Applications: part-of-speech tagging, chunking.

Beyond Sequence Prediction (2 weeks)

Unified view of all the models as loss minimization. Global inference using Integer Linear Programming. Overview of inference in graphical models. Latent variable models. Applications: semantic role labeling, textual entailment, coreference resolution.

Syntax and Semantics (3 weeks)

In-depth review of algorithms for syntactic and semantic analysis. Constituency parsing using the CYK algorithm, local and global models for dependency parsing, Abstract Meaning Representation (AMR), relation extraction, and semantic parsing.

Deep Learning for NLP (3 weeks)

Distributed representations (word embeddings), classification using feed-forward and convolutional networks. Using deep learning for structured data: recurrent and recursive networks.

Current Research Topics (2 weeks)

Last Updated: Feb 15, 2019 5:11 PM
