
Computational Methods in Optimization

David Gleich

Purdue University

Spring 2023

Course number CS-52000

Monday, Wednesday, Friday, 1:30-2:20pm

Location Lawson 1106


Readings and topics

References

Nonlinear optimization
Algorithms for Optimization by Mykel J. Kochenderfer and Tim A. Wheeler. MIT Press, 2019.
Numerical Optimization by Jorge Nocedal and Stephen J. Wright. Springer, 2006.
Primal-Dual Interior-Point Methods by Stephen J. Wright. SIAM, 1997.
Linear and Nonlinear Optimization by Igor Griva, Stephen G. Nash, and Ariela Sofer. SIAM, 2009.
Least squares
Numerical Methods for Least Squares Problems by Åke Björck. SIAM, 1996.
Convex optimization
Convex Optimization by Stephen Boyd and Lieven Vandenberghe. Cambridge University Press, 2004.

Week 16 (April 24-April 30)

Global optimization

Video
Sampling and Global Optimization
BFGS Convergence
Notes
Quasi-Newton Convergence
Slides
Global Optimization

Quadratic Programming

Readings
Nocedal and Wright Chapter 16
Notes
Lecture 42
Julia
Lecture 42 (Quadratic programming) (html) (ipynb)
Video
Quadratic Programming
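
To go with the quadratic programming unit, here is a minimal sketch, not the lecture code, of solving an equality-constrained QP by forming the KKT system directly, in the spirit of Nocedal and Wright Chapter 16; the function name eqqp is my own.

    using LinearAlgebra
    # Solve  min (1/2) x'G x + c'x  subject to  A x = b
    # via the KKT system  [G A'; A 0][x; y] = [-c; b],
    # where y collects the multipliers (sign conventions vary).
    function eqqp(G, c, A, b)
        n = size(G, 1); m = size(A, 1)
        K = [G A'; A zeros(m, m)]
        z = K \ [-c; b]
        return z[1:n], z[n+1:end]
    end

    G = [2.0 0.0; 0.0 2.0]; c = [-2.0, -5.0]
    A = [1.0 1.0]; b = [1.0]       # constrain x1 + x2 = 1
    x, y = eqqp(G, c, A, b)        # x = [-1/4, 5/4]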

Week 15 (April 17-April 23)

Large-scale optimization

Also, alternating algorithms and stochastic gradient descent.

Reading
Large scale optimization handout
Nocedal and Wright Chapter 7
Notes
Lecture 40
Slides
Stochastic separable slides
Julia
Lecture 39 (Stochastic) (html) (ipynb) (See also stochastic notes below)
Video
Trust Region Methods

Finish up trust region methods

After finishing trust region methods, we dove into large-scale optimization.

Large-scale optimization: conjugate gradient, simple algorithms, and limited-memory quasi-Newton (we didn't get to scalable linear algebra). A conjugate gradient sketch follows this entry.

Readings
Nocedal and Wright Chapter 5
Nocedal and Wright Chapter 7
Griva Sofer Nash, Chapter 13
Conjugate gradient handout
Large scale optimization handout
Notes
Lecture 39
Video
Trust Region Methods
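
As mentioned above, a minimal linear conjugate gradient sketch for Ax = b with A symmetric positive definite, loosely following Nocedal and Wright Algorithm 5.2; an illustration, not the lecture code.

    using LinearAlgebra
    # Conjugate gradient for A x = b, A symmetric positive definite.
    function cg(A, b; maxit=length(b), tol=1e-8)
        x = zeros(length(b))
        r = b - A*x                  # residual
        p = copy(r)                  # first search direction
        rr = dot(r, r)
        for _ in 1:maxit
            Ap = A*p
            α = rr / dot(p, Ap)      # exact minimizer along p
            x += α*p
            r -= α*Ap
            rrnew = dot(r, r)
            sqrt(rrnew) < tol && break
            p = r + (rrnew/rr)*p     # A-conjugate to the previous directions
            rr = rrnew
        end
        return x
    end

    A = [4.0 1.0; 1.0 3.0]; b = [1.0, 2.0]
    cg(A, b)                         # matches A \ b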

Week 14 (April 10-April 16)

Trust region methods

Readings
Nocedal and Wright Chapter 4
Trust region methods
Notes
Lecture 37
Julia
Lecture 37 (Trust region) (html) (ipynb)
Video
Trust Region Methods
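
To accompany the trust-region readings, a minimal sketch that uses the Cauchy point as the approximate subproblem solution (Nocedal and Wright Section 4.1); the names f, g, H and all constants are my choices, not the lecture code.

    using LinearAlgebra
    # Trust-region loop with Cauchy-point steps; f, g, H return the
    # objective, gradient, and Hessian at a point.
    function trustregion(f, g, H, x; Δ=1.0, Δmax=10.0, η=0.1, maxit=200)
        for _ in 1:maxit
            gk, Bk = g(x), H(x)
            norm(gk) < 1e-8 && break
            # Cauchy point: minimize the quadratic model along -gk in the ball
            gBg = dot(gk, Bk*gk)
            τ = gBg <= 0 ? 1.0 : min(norm(gk)^3/(Δ*gBg), 1.0)
            p = -τ*(Δ/norm(gk))*gk
            pred = -(dot(gk, p) + 0.5*dot(p, Bk*p))   # model decrease
            ρ = (f(x) - f(x + p))/pred                # actual over predicted
            if ρ < 0.25
                Δ /= 4                                # shrink the region
            elseif ρ > 0.75 && τ == 1.0               # step hit the boundary
                Δ = min(2Δ, Δmax)
            end
            ρ > η && (x += p)                         # accept or reject the step
        end
        return x
    end

    f(x) = (x[1] - 1)^2 + 5*(x[2] + 2)^2
    g(x) = [2*(x[1] - 1), 10*(x[2] + 2)]
    H(x) = [2.0 0.0; 0.0 10.0]
    trustregion(f, g, H, [5.0, 5.0])                  # converges to [1, -2]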

Nonlinear equations

Readings
Nocedal and Wright Chapter 11
Solving Nonlinear Equations with Newton's Method
by C. T. Kelley
Codes for Solving Nonlinear Equations with Newton's Method
by C. T. Kelley
Nocedal and Wright Chapter 5
Videos
Checking gradients
Sherman Morrison Woodbury Formula
Nonlinear equations
Homotopy Methods
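
A minimal Newton sketch for F(x) = 0, in the spirit of Nocedal and Wright Chapter 11 and Kelley's book; the names F, J, and newton_nleq are hypothetical.

    using LinearAlgebra
    # Newton's method for F(x) = 0: solve J(x) s = -F(x) and step.
    function newton_nleq(F, J, x; tol=1e-10, maxit=50)
        for _ in 1:maxit
            Fx = F(x)
            norm(Fx) < tol && break
            x -= J(x) \ Fx
        end
        return x
    end

    # Example: intersect the unit circle with the line x1 = x2.
    F(x) = [x[1]^2 + x[2]^2 - 1, x[1] - x[2]]
    J(x) = [2x[1] 2x[2]; 1.0 -1.0]
    newton_nleq(F, J, [2.0, 0.5])    # converges to [√2/2, √2/2]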

Week 13 (April 3-April 9)

Derivative-Free Optimization

An overview of derivative-free optimization and nonlinear programming.

Derivative-free optimization: gradient methods, pattern search, and Nelder-Mead, plus penalty and augmented Lagrangian methods. A small pattern-search sketch follows this entry.

Reading
Nocedal and Wright Chapters 17 and 18
Griva, Sofer, Nash, Chapter 16.
Nonlinear programming slides
Nocedal and Wright Chapter 9
Derivative-free optimization
Julia
Lecture 34 (Derivative free) (html) (ipynb)
Videos
Derivative-free optimization
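
As mentioned above, a tiny compass (pattern) search sketch, one of the simplest derivative-free methods; my illustration, not the Lecture 34 notebook.

    # Compass search: poll the 2n coordinate directions; if none improves,
    # halve the step size. No gradients are ever evaluated.
    function compass_search(f, x; Δ=1.0, tol=1e-6, maxit=100_000)
        n = length(x)
        for _ in 1:maxit
            Δ < tol && break
            improved = false
            for i in 1:n, s in (Δ, -Δ)
                y = copy(x); y[i] += s
                if f(y) < f(x)
                    x, improved = y, true
                    break
                end
            end
            improved || (Δ /= 2)
        end
        return x
    end

    f(x) = (x[1] - 1)^2 + (x[2] + 2)^2
    compass_search(f, [0.0, 0.0])    # finds [1, -2] without any derivatives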

Quasi-Newton methods

Readings
Nocedal and Wright Chapter 6
Quasi-Newton methods
Notes
Lecture 33
Videos
Quasi-Newton Methods
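
A minimal BFGS sketch that maintains the inverse Hessian approximation (cf. Nocedal and Wright Algorithm 6.1), with a crude Armijo backtracking step in place of a full Wolfe line search; the names and constants are mine.

    using LinearAlgebra
    function bfgs(f, g, x; tol=1e-8, maxit=200)
        n = length(x)
        Hinv = Matrix{Float64}(I, n, n)     # inverse Hessian approximation
        gx = g(x)
        for _ in 1:maxit
            norm(gx) < tol && break
            p = -Hinv*gx                    # quasi-Newton direction
            α = 1.0
            while f(x + α*p) > f(x) + 1e-4*α*dot(gx, p) && α > 1e-12
                α /= 2                      # Armijo backtracking
            end
            s = α*p
            xnew = x + s; gnew = g(xnew); y = gnew - gx
            sy = dot(s, y)
            if sy > 1e-12                   # skip update if curvature fails
                ρ = 1/sy
                V = I - ρ*s*y'
                Hinv = V*Hinv*V' + ρ*s*s'   # BFGS inverse update
            end
            x, gx = xnew, gnew
        end
        return x
    end

    f(x) = (1 - x[1])^2 + 100*(x[2] - x[1]^2)^2   # Rosenbrock
    g(x) = [-2*(1 - x[1]) - 400*x[1]*(x[2] - x[1]^2), 200*(x[2] - x[1]^2)]
    bfgs(f, g, [-1.2, 1.0])                        # converges toward [1, 1]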

Lecture 32

A practical treatment of Newton's method, Cholesky and modified Cholesky, and Zoutendijk's condition. A small modified-Cholesky sketch follows this entry.

Readings
Nocedal and Wright Chapter 3
Moré and Sorensen, around page 22, for a proof that the modified Cholesky factorization gives a matrix with a bounded condition number
Julia
Lecture 32 (Modified Cholesky) (html) (ipynb)
Notes
Lecture 32
Video
Lecture 32
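
As mentioned above, a sketch of the simplest Hessian modification, Cholesky with an added multiple of the identity (Nocedal and Wright Algorithm 3.3); the Moré and Sorensen approach in the reading is more refined, and the function name here is mine.

    using LinearAlgebra
    # Add τI to A, doubling τ until the Cholesky factorization succeeds;
    # the result is a positive definite surrogate for an indefinite Hessian.
    function cholesky_added_identity(A; β=1e-3)
        dmin = minimum(diag(A))
        τ = dmin > 0 ? 0.0 : -dmin + β
        while true
            F = cholesky(Symmetric(A + τ*I); check=false)
            issuccess(F) && return F, τ
            τ = max(2τ, β)
        end
    end

    A = [1.0 2.0; 2.0 1.0]            # indefinite: eigenvalues 3 and -1
    F, τ = cholesky_added_identity(A)  # succeeds once τ exceeds 1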

Lecture 31

Matrix calculus and finite differences. A gradient-check sketch follows this entry.

Finite difference reading
Nocedal and Wright, Section 8.1
Matrix calculus reading
Matrix Calculus for Deep Learning
http://justindomke.wordpress.com/2011/03/24/matrix-calculus/ (which leads to)
https://tminka.github.io/papers/matrix/ (which leads to)
https://tminka.github.io/papers/matrix/minka-matrix.pdf
https://people.cs.umass.edu/~domke/courses/sml/01background.pdf
Older references, many of whose links have broken:
http://www.psi.toronto.edu/~andrew/papers/matrix_calculus_for_learning.pdf (Lots of examples, focus on Machine Learning) now https://web.archive.org/web/20180127130543/http://www.psi.toronto.edu/~andrew/papers/matrix_calculus_for_learning.pdf
Peder Olsen's talk
http://www.ee.ic.ac.uk/hp/staff/dmb/matrix/calculus.html
http://mathoverflow.net/questions/5209/notions-of-matrix-differentiation
http://web.archive.org/web/20080830042052/http://www.cs.berkeley.edu/~pliang/cs281a/recitation-0924.pdf
http://ccrma.stanford.edu/~dattorro/matrixcalc.pdf
http://www.atmos.washington.edu/~dennis/MatrixCalculus.pdf
http://www.colorado.edu/engineering/CAS/courses.d/IFEM.d/IFEM.AppD.d/IFEM.AppD.pdf
http://people.kyb.tuebingen.mpg.de/suvrit/work/mcal.pdf.gz now https://web-beta.archive.org/web/20111222131039/http://people.kyb.tuebingen.mpg.de/suvrit/work/mcal.pdf.gz
Video
Lecture 31
Notes
Lecture 31
Julia
Lecture 31 Gradient Checking (html) (ipynb)
Lecture 31 Automatic Differentiation (html) (ipynb)
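
As mentioned above, a minimal central-difference gradient check in the spirit of Nocedal and Wright Section 8.1; the real notebooks are linked above, and these names are hypothetical.

    # Compare an analytic gradient g against central differences of f;
    # the discrepancy should be O(h^2) when g is correct.
    function check_gradient(f, g, x; h=1e-6)
        n = length(x)
        gfd = zeros(n)
        for i in 1:n
            e = zeros(n); e[i] = 1.0
            gfd[i] = (f(x + h*e) - f(x - h*e))/(2h)
        end
        return maximum(abs.(g(x) - gfd))
    end

    f(x) = x[1]^2 + sin(x[2])
    g(x) = [2x[1], cos(x[2])]
    check_gradient(f, g, [0.3, 1.2])   # tiny, so the gradient checks out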

Lecture 30

Continued interior-point methods (see the centering step in the code below) and started matrix calculus (see the next lecture for references).

Video
Lecture 30
Julia
Lecture 30 (Centering steps) (html) (ipynb)
Notes
Lecture 30

Lecture 29

Continued interior-point methods.

Readings
Nocedal and Wright Chapter 14
Notes
Lecture 29
Video
Lecture 29

Lecture 28

Review of the project and an intro to interior-point methods, including the central path. A bare-bones path-following sketch follows this entry.

Readings
Nocedal and Wright Chapter 14
Julia
Lecture 28 (Interior point) (html) (ipynb)
Notes
Lecture 28
Video
Lecture 28
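
As mentioned above, a bare-bones primal-dual path-following sketch for the standard-form LP min c'x subject to Ax = b, x >= 0 (cf. Nocedal and Wright Chapter 14). It takes damped Newton steps on the perturbed KKT conditions A'λ + s = c, Ax = b, x_i s_i = σμ, with no safeguards; an illustration only, not the lecture code.

    using LinearAlgebra
    function ip_lp(A, b, c; σ=0.1, maxit=50)
        m, n = size(A)
        x = ones(n); λ = zeros(m); s = ones(n)   # strictly positive start
        for _ in 1:maxit
            μ = dot(x, s)/n                      # duality measure
            μ < 1e-9 && break
            # Newton system for the perturbed KKT residuals
            K = [zeros(n, n) A' Matrix(1.0I, n, n);
                 A zeros(m, m) zeros(m, n);
                 Diagonal(s) zeros(n, m) Diagonal(x)]
            r = -[A'*λ + s - c; A*x - b; x .* s .- σ*μ]
            d = K \ r
            dx, dλ, ds = d[1:n], d[n+1:n+m], d[n+m+1:end]
            # damped step keeping x and s strictly positive
            α = 0.99*minimum([1.0; [-x[i]/dx[i] for i in 1:n if dx[i] < 0];
                                   [-s[i]/ds[i] for i in 1:n if ds[i] < 0]])
            x += α*dx; λ += α*dλ; s += α*ds
        end
        return x, λ, s
    end

    A = [1.0 1.0 1.0]; b = [1.0]; c = [-1.0, -2.0, 0.0]
    x, λ, s = ip_lp(A, b, c)     # x approaches [0, 1, 0], the optimal vertex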

Lecture 27

An abbreviated lecture where we reviewed the midterm solutions. This isn't posted since it discusses specific solutions to questions. I'm happy to discuss these in office hours if you have further questions.

Lecture 26

The midterm.

Lecture 25

Review for the Midterm

Video
Lecture 25

Lecture 24

Review for the Midterm

Notes
Lecture 24
Slides
Lecture 24
Video
Lecture 24

Lecture 23

We covered some modern tools for stochastic Newton methods and natural gradient methods. Then we went into review.

Notes
Lecture 23
Video
Lecture 23

Lecture 22

We covered stochastic gradient descent and a little bit on stochastic Newton. A minimal SGD sketch follows this entry.

Notes
Lecture 22
Video
Lecture 22
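
As mentioned above, a minimal stochastic gradient descent sketch for a least-squares loss, sampling one term per step; the data here are noiseless, so a small constant step converges, while noisy data would call for decreasing steps. All names are mine.

    using Random, LinearAlgebra
    # SGD for f(w) = (1/2N) Σ_i (a_i'w - b_i)^2, one sampled term per step.
    function sgd(A, b; η=0.05, epochs=50)
        N, n = size(A)
        w = zeros(n)
        for _ in 1:epochs, i in shuffle(1:N)
            a = A[i, :]
            w -= η * (dot(a, w) - b[i]) * a   # gradient of the i-th term
        end
        return w
    end

    Random.seed!(1)
    A = randn(100, 3); wtrue = [1.0, -2.0, 0.5]; b = A*wtrue
    sgd(A, b)   # approaches wtrue on this noiseless problem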

Lecture 21

We briefly covered the rest of the proof from last class on backtracking line search and argued why Newton's method ought to drive the step length to 1.

Notes
Lecture 21
Reading
Nocedal and Wright, Chapter 3
Video
Lecture 21

Lecture 20

We (almost) proved that backtracking line search converges.

Reading
Convergence of line search
Notes
Lecture 20
Video
Lecture 20

Lecture 19

We saw the Wolfe conditions for line search and proved that a step length satisfying them always exists. Then we introduced backtracking line search. A minimal backtracking sketch follows this entry.

Reading
Wolfe Conditions
Nocedal and Wright, Lemma 3.1.
Nocedal and Wright, Chapter 2, 3
Griva Sofer Nash, Chapter 11
AlgOpt Section 4.2
Julia
Lecture 19 (Line search) (html) (ipynb)
Notes
Lecture 19
Video
Lecture 19
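
As mentioned above, a minimal backtracking line search that enforces the sufficient decrease (Armijo) condition, the first of the Wolfe conditions; the constants c = 1e-4 and ρ = 1/2 are typical defaults, not mandated.

    using LinearAlgebra
    # Shrink α until f(x + αp) ≤ f(x) + c α ∇f(x)'p (sufficient decrease);
    # p must be a descent direction, i.e. dot(g, p) < 0.
    function backtracking(f, x, g, p; α=1.0, ρ=0.5, c=1e-4)
        while f(x + α*p) > f(x) + c*α*dot(g, p)
            α *= ρ
        end
        return α
    end

    f(x) = x[1]^2 + 4x[2]^2
    x = [1.0, 1.0]; g = [2.0, 8.0]; p = -g   # steepest descent direction
    backtracking(f, x, g, p)                  # an acceptable step length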

Lecture 18

We saw an intro to unconstrained optimization and line search methods.

Reading (next few lectures)
Unconstrained algorithms
Line search algorithms
Line search and the Wolfe conditions
Convergence of line search
Nocedal and Wright, Lemma 3.1.
Nocedal and Wright, Chapter 2, 3
Griva Sofer Nash, Chapter 11
AlgOpt Chapter 4
AlgOpt Chapter 5
AlgOpt Chapter 6
Reading (this lecture)
Line search algorithms
Unconstrained algorithms
Nocedal and Wright, Chapter 2, 3
Griva Sofer Nash, Chapter 11
Notes
Lecture 18
Video
Lecture 18

Lecture 17

We finished up the proof that the simplex method decreases the objective and then saw Phase 1 / Crash.

This was it for the simplex method! A bare-bones simplex iteration sketch follows this entry.

Reading
Griva, Sofer, and Nash, chapter 4.
Nocedal and Wright, Chapter 13.
Simple Simplex Code
AlgOpt - Chapter 11 Simplex Method
Julia
Lecture 17 (Simplex) (html) (ipynb)
Notes
Lecture 17
Video
Lecture 17
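
As mentioned above, a bare-bones sketch of the simplex iteration for min c'x subject to Ax = b, x >= 0, starting from a basic feasible index set that Phase 1 / Crash would supply; no anti-cycling safeguards, and not the Simple Simplex Code handout.

    using LinearAlgebra
    # Repeated simplex pivots; `basis` holds the basic indices and is mutated.
    function simplex_iterations(A, b, c, basis; maxit=100)
        m, n = size(A)
        for _ in 1:maxit
            B = A[:, basis]
            xB = B \ b                     # values of the basic variables
            λ = B' \ c[basis]              # simplex multipliers
            sN = c - A'*λ                  # reduced costs (zero on the basis)
            q = findfirst(j -> sN[j] < -1e-10, 1:n)
            if q === nothing               # no improving direction: optimal
                x = zeros(n); x[basis] = xB
                return x, basis
            end
            d = B \ A[:, q]                # change in xB as x_q increases
            ratios = [d[i] > 1e-10 ? xB[i]/d[i] : Inf for i in 1:m]
            p = argmin(ratios)             # ratio test picks the leaving index
            isinf(ratios[p]) && error("LP is unbounded")
            basis[p] = q                   # x_q enters, basis[p] leaves
        end
        error("too many iterations")
    end

    # min -x1 - 2x2  s.t.  x1 + x2 + x3 = 1, x >= 0  (x3 is a slack)
    A = [1.0 1.0 1.0]; b = [1.0]; c = [-1.0, -2.0, 0.0]
    x, basis = simplex_iterations(A, b, c, [3])   # x = [0, 1, 0]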

Lecture 16

We finished up the simplex method and went over how it works by moving from vertex to vertex.

Reading
Griva, Sofer, and Nash, chapter 4.
Nocedal and Wright, Chapter 13.
Simple Simplex Code
AlgOpt - Chapter 11 Simplex Method
Julia
Lecture 16 (Simplex) (html) (ipynb)
Notes
Lecture 16
Video
Lecture 16

Lecture 15

We finished the fundamental theorem of linear programming and got into the characterization of a vertex and a basic feasible point.

Reading
Griva, Sofer, and Nash, chapter 4.
Nocedal and Wright, Chapter 13.
AlgOpt - Chapter 11 Simplex Method
Julia
Lecture 15 (Enumeration) (html) (ipynb)
Notes
Lecture 15
Video
Lecture 15

Lecture 14

We saw optimality conditions for linear programs and the dual. We explained the feasible polytope and defined a vertex. We stated the fundamental theorem of linear programming, that if a solution exists, then there is a solution at a vertex of the feasible polytope.

Reading
Griva, Sofer, and Nash, chapter 4.
Nocedal and Wright, Chapter 13.
Polytope Geometry
LP optimality
AlgOpt - 11.2.1
Julia
Lecture 14 (LP Feasible Polytopes) (html) (ipynb)
Notes
Lecture 14
Video
Lecture 14

Lecture 13

We saw code for steepest descent and how it behaves. Then we started into LPs and saw how to convert an LP into standard form.

We saw optimality conditions for linear programs and the dual.

Reading
Griva, Sofer, and Nash, chapter 4.
Nocedal and Wright, Chapter 13.
LP optimality
AlgOpt - Chapter 11 - what we call standard form is their equality form from 11.1.3
AlgOpt - 11.2.2
Julia
Lecture 13 (Steepest Descent) (html) (ipynb)
Notes
Lecture 13
Video
Lecture 13

Lecture 12

We introduced the steepest descent method and showed that it converges at a linear rate on a quadratic objective. A short code sketch follows this entry.

Reading
Griva, Sofer, Nash Section 14.4
Steepest Descent
Two Remarks on the Kantorovich Inequality, by P. Henrici.
Notes
Lecture 12
Video
Lecture 12
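
As mentioned above, a minimal steepest descent sketch on a quadratic with the exact step size; the Kantorovich inequality from the Henrici reading gives the linear contraction rate ((κ-1)/(κ+1))² per step. The names are mine, not the lecture notes.

    using LinearAlgebra
    # Steepest descent on f(x) = (1/2) x'A x - b'x with the exact line
    # search step α = r'r / r'Ar, where r = b - Ax is the negative gradient.
    function steepest_descent_quadratic(A, b; tol=1e-10, maxit=1000)
        x = zeros(length(b))
        for _ in 1:maxit
            r = b - A*x
            norm(r) < tol && break
            α = dot(r, r)/dot(r, A*r)
            x += α*r
        end
        return x
    end

    A = Diagonal([1.0, 10.0])   # κ(A) = 10: error contracts by about (9/11)² per step
    b = [1.0, 1.0]
    steepest_descent_quadratic(A, b)   # matches A \ b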

Lecture 11

We saw the necessary conditions for linear inequality-constrained optimization. We also saw the definition of a descent direction and an active set.

Reading
Griva, Sofer, Nash Section 14.4
Nocedal and Wright Chapter 2 (for descent directions)
Chapter 10, 11 in AlgOpt
Notes
Lecture 11
Video
Lecture 11

Lecture 10

We saw the necessary and sufficient conditions for linearly constrained optimization and how they give a system of augmented normal equations for constrained least squares. We also saw how to get an initial point for a null-space method, and we saw examples of using constrained least squares to rank sports teams. A small augmented-normal-equations sketch follows this entry.

Reading
Griva Sofer Nash, Chapter 14.
Notes
Lecture 10
Julia
Lecture 10 (Constrained Least Squares) (html) (ipynb)
Video
Lecture 10
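
As mentioned above, a sketch of the augmented normal equations for equality-constrained least squares; the function name is mine, and the sign convention on the multipliers varies.

    using LinearAlgebra
    # Equality-constrained least squares,  min ||Ax - b||^2  s.t.  Cx = d,
    # via the augmented normal equations  [A'A  C'; C  0][x; λ] = [A'b; d].
    function constrained_lsq(A, b, C, d)
        n = size(A, 2); m = size(C, 1)
        K = [A'*A C'; C zeros(m, m)]
        z = K \ [A'*b; d]
        return z[1:n], z[n+1:end]
    end

    A = randn(10, 3); b = randn(10)
    C = [1.0 1.0 1.0]; d = [1.0]      # the entries of x must sum to 1
    x, λ = constrained_lsq(A, b, C, d)
    sum(x)                             # ≈ 1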

Lecture 9

We finished up constrained least squares and got into constrained optimization. This included necessary and sufficient conditions based on the reduced Hessian and reduced gradient. We also saw how to get an initial point for a null-space method.

Reading
Griva Sofer Nash, Chapter 14.
Notes
Lecture 9
Video
Lecture 9

Lecture 8

We saw convexity and least squares problems as an intro to constrained optimization.

Reading
Björck Chapter 2
Björck Chapter 5
Notes
Lecture 8
Video
Lecture 8

Lecture 7

Multivariate Taylor expansions, gradients, Hessians, and multivariate optimality conditions. A small optimality-check example follows this entry.

Notes
Lecture 7
Reading
Section 1.6 in AlgOpt
Nocedal and Wright Section 2.1; read Theorems 2.3 and 2.4 for yourself and review the multivariate generalizations.
No minimizer at the origin but a minimizer on all lines, an algebraic discussion of the example we saw in class
Discussion on symmetry of Hessian
Wikipedia on symmetric Hessian
Julia
Lecture 7 (Optimality Conditions) (html) (ipynb)
Lecture 7 (Rosenbrock) (html) (ipynb)
Video
Lecture 7
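
As mentioned above, a small worked check of the optimality conditions: at a candidate point the gradient should vanish and the Hessian should be positive definite (all eigenvalues positive) for a strict local minimizer. The example function is my own.

    using LinearAlgebra
    f(x) = (x[1] - 1)^2 + 2*(x[2] + 3)^2
    g(x) = [2*(x[1] - 1), 4*(x[2] + 3)]
    H(x) = [2.0 0.0; 0.0 4.0]

    xstar = [1.0, -3.0]
    g(xstar)              # [0, 0]: the first-order condition holds
    eigvals(H(xstar))     # all positive: the second-order condition holds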

Lecture 6

We did an intro to 1d optimization.

Notes
Lecture 6
Reading
Chapter 2 in Nocedal and Wright
Section 1.5 and 1.6 in AlgOpt
Handouts
unconstrained-intro.pdf
Video
Apologies -- the video had a technical error.

Lecture 5

More on optimization software and how to set up problems for Gurobi and Optim.jl

Notes
Lecture 5
Julia
Lecture 5 (Optim.jl) (html) (ipynb)
Lecture 5 Sudoku (html) (ipynb)
Video
Lecture 5

Lecture 4

We covered a quick intro to optimization software.

Julia
Lecture 4 (Least Squares) (html) (ipynb)
Notes
Lecture 4
Video
Lecture 4

Lecture 2 and Lecture 3

We combined Lecture 2 and Lecture 3 into one because I'm going to be gone on Friday. We went over sequences and convergence after covering more about software and Julia. A tiny convergence-rates example follows this entry.

Reading
Appendix A.2 in Nocedal and Wright
Julia
Lecture 2 (Rates) (html) (ipynb)
Lecture 2 (Limit Points) (html) (ipynb)
Lecture 2 (Julia Intro) (html) (ipynb)
Notes
Lecture 2 (Combined Lecture 2 and 3 notes)
Video
Lecture 2
Lecture 3
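
As mentioned above, a tiny illustration of convergence rates in the sense of Nocedal and Wright Appendix A.2: x_k = 2^(-k) converges linearly, while y_k = 2^(-2^k) converges quadratically, so its error is squared at each step.

    xs = [2.0^-k for k in 1:8]          # linear: error halves each step
    ys = [2.0^-(2^k) for k in 1:5]      # quadratic: error is squared each step
    [xs[k+1]/xs[k] for k in 1:7]        # constant ratio 0.5: linear rate
    [ys[k+1]/ys[k]^2 for k in 1:4]      # bounded ratio 1.0: quadratic rate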

Lecture 1

We reviewed the syllabus and saw the xkcd raptor problem as motivation for optimization.

Reading
Syllabus
Slides
Lecture 1
Julia
Lecture 1 (Julia) (html) (ipynb)
Lecture 1 (Raptor) (html) (ipynb)
Video
Lecture 1
