
Computational Methods in Optimization

David Gleich

Purdue University

Spring 2025

Course number CS-52000

Tuesday, Thursday 10:30-11:45

Location Lawson 1106


Readings and topics

References

Nonlinear optimization
Algorithms for Optimization by Mykel Kochenderfer and Tim A. Wheeler. MIT Press, 2019.
Numerical Optimization by Jorge Nocedal and Stephen J. Wright. Springer, 2006.
Primal-dual interior-point methods by Stephen J. Wright, SIAM 1997.
Linear and Nonlinear Optimization by Igor Griva, Stephen G. Nash, and Ariela Sofer. SIAM, 2009.
Least squares
Numerical Methods for Least Squares Problems by Åke Björck. SIAM, 1996.
Convex
Convex Optimization by Stephen Boyd and Lieven Vandenberghe. Cambridge University Press, 2004.

Lecture 28

Large-scale optimization: conjugate gradient, simple algorithms, scalable linear algebra
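
As a companion to the conjugate gradient handout, here is a minimal sketch of the linear conjugate gradient method for a symmetric positive definite system; the test matrix, right-hand side, and tolerance are illustrative, not taken from the lecture.

    using LinearAlgebra

    # Minimal linear conjugate gradient for Ax = b with A symmetric positive definite.
    function cg(A, b; tol=1e-8, maxit=length(b))
        x = zeros(length(b))
        r = b - A*x                 # residual
        p = copy(r)                 # first search direction
        rsold = dot(r, r)
        for _ in 1:maxit
            Ap = A*p
            alpha = rsold / dot(p, Ap)     # exact step along p
            x += alpha*p
            r -= alpha*Ap
            rsnew = dot(r, r)
            sqrt(rsnew) < tol && break
            p = r + (rsnew/rsold)*p        # next A-conjugate direction
            rsold = rsnew
        end
        return x
    end

    # Tiny illustrative check.
    A = [4.0 1.0; 1.0 3.0]; b = [1.0, 2.0]
    @show cg(A, b) ≈ A \ b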

Readings
Nocedal and Wright Chapter 5
Nocedal and Wright Chapter 7
Griva Sofer Nash, Chapter 13
Conjugate gradient handout
Large scale optimization handout
Notes
Lecture 28
Julia
Lecture 28 (Sparse Direct Methods) (html) (ipynb)
Video
Lecture 28

Lecture 27

A dive into large-scale optimization: limited-memory quasi-Newton methods.
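
A minimal sketch of the L-BFGS two-loop recursion, which applies the implicit inverse Hessian to a gradient using only the stored pairs; the storage layout (oldest pair first), the standard initial scaling, and the example data are my own illustrative choices.

    using LinearAlgebra

    # Two-loop recursion: given memory pairs S = [s_1,...,s_m], Y = [y_1,...,y_m]
    # (oldest first) and the current gradient g, return the L-BFGS direction -H*g.
    function lbfgs_direction(g, S, Y)
        m = length(S)
        q = copy(g)
        alphas = zeros(m)
        rhos = [1.0 / dot(Y[i], S[i]) for i in 1:m]
        for i in m:-1:1                      # first loop: newest to oldest
            alphas[i] = rhos[i] * dot(S[i], q)
            q -= alphas[i] * Y[i]
        end
        gamma = dot(S[m], Y[m]) / dot(Y[m], Y[m])   # standard initial scaling
        r = gamma * q                        # H0 = gamma*I
        for i in 1:m                         # second loop: oldest to newest
            beta = rhos[i] * dot(Y[i], r)
            r += (alphas[i] - beta) * S[i]
        end
        return -r                            # quasi-Newton search direction
    end

    # Illustrative call with a single stored pair (made-up data).
    S = [[1.0, 0.0]]; Y = [[0.9, 0.1]]; g = [1.0, 1.0]
    @show lbfgs_direction(g, S, Y)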

Readings
Nocedal and Wright Chapter 5
Nocedal and Wright Chapter 7
Griva Sofer Nash, Chapter 13
Large scale optimization handout
Notes
Lecture 27
Video
Lecture 27

Lecture 26

Trust region methods
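
A minimal sketch of the Cauchy point for the trust-region subproblem, following the formula in the Nocedal and Wright reading; the model data in the example is made up.

    using LinearAlgebra

    # Cauchy point for the model m(p) = f + g'p + 0.5*p'B*p within radius Delta.
    function cauchy_point(g, B, Delta)
        gnorm = norm(g)
        ps = -(Delta / gnorm) * g            # step to the trust-region boundary
        gBg = dot(g, B*g)
        tau = gBg <= 0 ? 1.0 : min(gnorm^3 / (Delta * gBg), 1.0)
        return tau * ps
    end

    # Illustrative use (data not from the lecture).
    g = [1.0, -2.0]; B = [2.0 0.0; 0.0 1.0]
    @show cauchy_point(g, B, 1.0)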

Readings
Nocedal and Wright Chapter 4
Trust region methods
Notes
Lecture 26
Julia
Lecture 26 (Trust region) (html) (ipynb)
Video
Lecture 26

Lecture 25

Nonlinear equations and Sherman-Morrison-Woodbury
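
The Sherman-Morrison formula (the rank-one case of Sherman-Morrison-Woodbury) lets you solve with A + u*v' using only solves with A; here is a quick numerical check with made-up data.

    using LinearAlgebra

    # Sherman-Morrison: solve (A + u*v') x = b via two solves with A instead of
    # refactoring the updated matrix.
    function sherman_morrison_solve(A, u, v, b)
        Ainv_b = A \ b
        Ainv_u = A \ u
        return Ainv_b - Ainv_u * (dot(v, Ainv_b) / (1 + dot(v, Ainv_u)))
    end

    # Check against a direct solve on small illustrative data.
    A = [4.0 1.0; 0.0 3.0]; u = [1.0, 2.0]; v = [0.5, -1.0]; b = [1.0, 1.0]
    @show sherman_morrison_solve(A, u, v, b) ≈ (A + u*v') \ b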

Readings
Nocedal and Wright Chapter 11
Solving Nonlinear equations with Newton's method
by C. T. Kelley
Codes for Solving Nonlinear equations with Newton's method
by C. T. Kelley
Nocedal and Wright Chapter 5
Notes
Lecture 25
Video
Lecture 25

Lecture 24

Quasi-Newton methods
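
A minimal sketch of the BFGS update of an inverse Hessian approximation, in its standard form from the reading; the vectors in the check are made up, and the check verifies the secant condition H_{k+1} y = s.

    using LinearAlgebra

    # BFGS update of an inverse Hessian approximation H using s and y:
    # H_{k+1} = (I - rho*s*y') H (I - rho*y*s') + rho*s*s',  rho = 1/(y's).
    function bfgs_update(H, s, y)
        rho = 1.0 / dot(y, s)          # requires the curvature condition y's > 0
        I_n = Matrix{Float64}(I, length(s), length(s))
        V = I_n - rho * s * y'
        return V * H * V' + rho * s * s'
    end

    # The update enforces the secant condition H_{k+1}*y = s.
    H = Matrix{Float64}(I, 2, 2)
    s = [1.0, 0.5]; y = [0.8, 0.3]
    @show bfgs_update(H, s, y) * y ≈ s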

Readings
Nocedal and Wright Chapter 6
Quasi-Newton methods
Notes
Lecture 24
Video
Lecture 24

Lecture 23

Overview of derivative-free optimization and nonlinear programming, including:

Derivative-free optimization: gradient methods, pattern search, and Nelder-Mead; penalty and augmented Lagrangian methods.
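
As a small derivative-free example, here is a minimal compass (coordinate pattern) search sketch; it is a simplification of the pattern-search methods in the reading, and the test objective and parameters are made up. Penalty and augmented Lagrangian methods are not shown.

    # Minimal compass (coordinate pattern) search: poll the 2n coordinate
    # directions and shrink the step when no poll point improves.
    function compass_search(f, x0; step=1.0, tol=1e-6, maxit=10_000)
        x = copy(x0); n = length(x0)
        fx = f(x)
        for _ in 1:maxit
            improved = false
            for i in 1:n, sgn in (1.0, -1.0)
                y = copy(x); y[i] += sgn*step
                fy = f(y)
                if fy < fx
                    x, fx, improved = y, fy, true
                    break
                end
            end
            improved || (step /= 2)          # no improving poll point: shrink
            step < tol && break
        end
        return x, fx
    end

    # Illustrative run on a simple quadratic (not from the lecture).
    f(x) = (x[1] - 1)^2 + 2*(x[2] + 0.5)^2
    @show compass_search(f, [0.0, 0.0])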

Reading
Nocedal and Wright, Chapters 17 and 18
Griva, Sofer, Nash, Chapter 16.
Nocedal and Wright Chapter 9
AlgOpt, Chapter 7
AlgOpt, Chapter 10
Julia
Lecture 23 (Derivative free) (html) (ipynb)
Notes
Lecture 23
Video
Lecture 23

Lecture 22

A practical treatment of Newton's method, Cholesky and modified Cholesky factorizations, and Zoutendijk's condition
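
A minimal sketch of one simple modified Cholesky strategy, adding a multiple of the identity until the factorization succeeds (in the spirit of Nocedal and Wright's "Cholesky with added multiple of the identity"); the shift parameters and test matrix are illustrative.

    using LinearAlgebra

    # Increase tau until A + tau*I admits a Cholesky factorization.
    function modified_cholesky(A; beta=1e-3, maxtries=60)
        tau = minimum(diag(A)) > 0 ? 0.0 : -minimum(diag(A)) + beta
        for _ in 1:maxtries
            F = cholesky(Symmetric(A + tau*I), check=false)
            issuccess(F) && return F, tau
            tau = max(2*tau, beta)           # failed: enlarge the shift
        end
        error("could not find a positive definite shift")
    end

    # Indefinite Hessian-like test matrix (eigenvalues 3 and -1).
    A = [1.0 2.0; 2.0 1.0]
    F, tau = modified_cholesky(A)
    @show tau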

Readings
Nocedal and Wright Chapter 3
Julia
Lecture 22 (Modified Cholesky) (html) (ipynb)
Notes
Lecture 22
Video
Lecture 22

Lecture 21

Matrix calculus and finite differences
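
A minimal forward-difference gradient check in the spirit of the finite-difference reading; the objective, step size, and test point are made up.

    # Forward-difference gradient approximation and a check against a
    # hand-derived gradient.
    function fd_gradient(f, x; h=sqrt(eps()))
        g = similar(x)
        fx = f(x)
        for i in eachindex(x)
            xh = copy(x); xh[i] += h
            g[i] = (f(xh) - fx) / h          # one extra function evaluation per entry
        end
        return g
    end

    f(x) = 0.5*x[1]^2 + x[1]*x[2] + exp(x[2])     # example objective
    grad_f(x) = [x[1] + x[2], x[1] + exp(x[2])]    # its gradient, by hand
    x = [0.3, -0.7]
    @show maximum(abs.(fd_gradient(f, x) - grad_f(x)))   # small, roughly O(h)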

Finite difference reading
Nocedal and Wright, Section 8.1
Matrix calculus reading
Matrix Calculus for Deep Learning
http://justindomke.wordpress.com/2011/03/24/matrix-calculus/ (which leads to)
https://tminka.github.io/papers/matrix/ (which leads to)
https://tminka.github.io/papers/matrix/minka-matrix.pdf
https://people.cs.umass.edu/~domke/courses/sml/01background.pdf
Older references, many of which have broken links :(
http://www.psi.toronto.edu/~andrew/papers/matrix_calculus_for_learning.pdf (Lots of examples, focus on Machine Learning) now <https://web.archive.org/web/20180127130543/http://www.psi.toronto.edu/~andrew/papers/matrix_calculus_for_learning.pdf>
Peder Olsen's talk
http://www.ee.ic.ac.uk/hp/staff/dmb/matrix/calculus.html
http://mathoverflow.net/questions/5209/notions-of-matrix-differentiation
http://web.archive.org/web/20080830042052/http://www.cs.berkeley.edu/~pliang/cs281a/recitation-0924.pdf
http://ccrma.stanford.edu/~dattorro/matrixcalc.pdf
http://www.atmos.washington.edu/~dennis/MatrixCalculus.pdf
http://www.colorado.edu/engineering/CAS/courses.d/IFEM.d/IFEM.AppD.d/IFEM.AppD.pdf
http://people.kyb.tuebingen.mpg.de/suvrit/work/mcal.pdf.gz now https://web-beta.archive.org/web/20111222131039/http://people.kyb.tuebingen.mpg.de/suvrit/work/mcal.pdf.gz
Julia
Lecture 21 (Matrix Calculus) (html) (ipynb)
Lecture 21 (Autodiff) (html) (ipynb)
Notes
Lecture 21
Video
Lecture 21

Lecture 20

Finish up interior point methods (see the centering step in the code below).

Readings
Nocedal and Wright Chapter 14
Julia
Lecture 20 (Centering steps) (html) (ipynb)
Notes
Lecture 20
Video
Lecture 20

Lecture 19

Review of the project and the midterm, and an intro to interior point methods, including the central path
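
As a small companion to the central-path discussion, here is a sketch of the residuals of the perturbed KKT conditions for a standard-form LP (min c'x subject to Ax = b, x >= 0); the function and argument names are mine.

    using LinearAlgebra

    # Residuals of the central-path (perturbed KKT) conditions:
    #   Ax = b,  A'lambda + s = c,  x_i * s_i = tau for every i,  x, s > 0.
    function central_path_residuals(A, b, c, x, lambda, s, tau)
        r_primal = A*x - b                 # primal feasibility
        r_dual   = A'*lambda + s - c       # dual feasibility
        r_cent   = x .* s .- tau           # perturbed complementarity
        return r_primal, r_dual, r_cent
    end

On the central path all three residuals vanish; interior point methods drive tau toward zero while keeping x and s positive.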

Readings
Nocedal and Wright Chapter 14
Julia
Lecture 19 (Interior point) (html) (ipynb)
Notes
Lecture 19
Video
Lecture 19

Lecture 18

The midterm

Lecture 17

Review for the Midterm

Slides
Midterm Notes
Video
Lecture 17-short

Lecture 16

We briefly covered the rest of the proof from last class, why Newton's method ought to drive the step-length to 1, and stochastic gradient descent (SGD).
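
A minimal SGD sketch on a synthetic least-squares problem; the data, constant step size, and epoch count are illustrative choices, not from the lecture.

    using LinearAlgebra, Random

    # Stochastic gradient descent on f(w) = (1/2n) * sum_i (a_i'w - b_i)^2,
    # sampling one row per step.
    function sgd(A, b, w0; step=0.01, epochs=50)
        w = copy(w0)
        n = size(A, 1)
        for _ in 1:epochs, _ in 1:n
            i = rand(1:n)                            # sample one data point
            g = (dot(A[i, :], w) - b[i]) * A[i, :]   # stochastic gradient
            w -= step * g
        end
        return w
    end

    # Made-up synthetic data for illustration.
    Random.seed!(1)
    A = randn(200, 3); wtrue = [1.0, -2.0, 0.5]; b = A*wtrue
    @show norm(sgd(A, b, zeros(3)) - wtrue)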

Notes
Lecture 16
Julia
Lecture 16 (SGD) (html) (ipynb)
Video
Lecture 16

Lecture 15

We proved that backtracking line search converges.
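
A minimal backtracking (Armijo) line search sketch to go with the convergence proof; the quadratic test problem and the parameter values are illustrative.

    using LinearAlgebra

    # Backtracking line search enforcing the sufficient-decrease (Armijo)
    # condition f(x + alpha*p) <= f(x) + c1*alpha*g'p along a descent direction p.
    function backtracking(f, x, g, p; alpha=1.0, rho=0.5, c1=1e-4)
        fx = f(x)
        gTp = dot(g, p)
        while f(x + alpha*p) > fx + c1*alpha*gTp
            alpha *= rho                 # shrink until sufficient decrease holds
        end
        return alpha
    end

    # One steepest-descent step on a quadratic (illustrative data).
    f(x) = 0.5*dot(x, [3.0 0.0; 0.0 1.0]*x)
    x = [1.0, 1.0]; g = [3.0 0.0; 0.0 1.0]*x
    @show backtracking(f, x, g, -g)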

Reading
Convergence of line search
Notes
Lecture 15
Video
Lecture 15

Lecture 14

We saw the Wolfe conditions for line search.
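
A small checker for the weak Wolfe conditions; the constants c1 and c2 and the test problem are illustrative defaults, not values fixed in the lecture.

    using LinearAlgebra

    # Check the (weak) Wolfe conditions for a step alpha along direction p:
    #   sufficient decrease: f(x+alpha*p) <= f(x) + c1*alpha*grad(x)'p
    #   curvature:           grad(x+alpha*p)'p >= c2*grad(x)'p
    function satisfies_wolfe(f, gradf, x, p, alpha; c1=1e-4, c2=0.9)
        gTp = dot(gradf(x), p)
        armijo    = f(x + alpha*p) <= f(x) + c1*alpha*gTp
        curvature = dot(gradf(x + alpha*p), p) >= c2*gTp
        return armijo && curvature
    end

    # Illustrative check on a simple quadratic (data not from the lecture).
    f(x) = 0.5*dot(x, x); gradf(x) = x
    x = [2.0]; p = [-2.0]
    @show satisfies_wolfe(f, gradf, x, p, 0.5)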

Reading
Wolfe Conditions
Nocedal and Wright, Lemma 3.1.
Nocedal and Wright, Chapter 2, 3
Griva Sofer Nash, Chapter 11
AlgOpt Section 4.2
Julia
Lecture 14 (Line search) (html) (ipynb)
Notes
Lecture 14
Video
Lecture 14

Lecture 13

We finished up the proof that the simplex method decreases the objective and then saw Phase 1 / Crash.

This was it for the SIMPLEX method! Then we went into unconstrained algorithms. (Except we ran out of time.)

Reading
Simplex code
Line search algorithms
Unconstrained algorithms
Nocedal and Wright, Chapter 2, 3
Griva Sofer Nash, Chapter 11
Notes
Lecture 13
Julia
Lecture 13 (Cycling) (html) (ipynb) Some quick experiments with degenerate constraints to try to force cycling.
Video
Lecture 13
Reading (next few lectures)
Unconstrained algorithms
Line search algorithms
Line search and the Wolfe conditions
Convergence of line search
Nocedal and Wright, Lemma 3.1.
Nocedal and Wright, Chapter 2, 3
Griva Sofer Nash, Chapter 11
AlgOpt Chapter 4
AlgOpt Chapter 5
AlgOpt Chapter 6

Lecture 12

We finished up the simplex method and went over how it works by moving from vertex to vertex.

Reading
Griva, Sofer, and Nash, chapter 4.
Nocedal and Wright, Chapter 13.
Simple Simplex Code
AlgOpt - Chapter 11 Simplex Method
Julia
Lecture 12 (Simplex) (html) (ipynb)
Lecture 12 (Enumeration) (html) (ipynb)
Notes
Lecture 12
Video
Lecture 12

Lecture 11

We proved the fundamental theorem of linear programming, that if a solution exists, then there is a solution at a vertex of the feasible polytope.

Reading
Griva, Sofer, and Nash, chapter 4.
Nocedal and Wright, Chapter 13.
Polytope Geometry
AlgOpt - 11.2.1
Julia
Lecture 11 (LP Feasible Polytopes) (html) (ipynb)
Notes
Lecture 11
Video
Lecture 11

Lecture 10

We saw optimality conditions for linear programs and the dual.
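
For the standard-form primal min c'x subject to Ax = b, x >= 0, with dual max b'lambda subject to A'lambda + s = c, s >= 0, optimality amounts to primal and dual feasibility plus complementarity x_i s_i = 0. A small checker, with made-up names and data:

    using LinearAlgebra

    # Check LP optimality for a primal-dual triple (x, lambda, s).
    function lp_optimal(A, b, c, x, lambda, s; tol=1e-8)
        primal_feas = norm(A*x - b) <= tol && all(x .>= -tol)
        dual_feas   = norm(A'*lambda + s - c) <= tol && all(s .>= -tol)
        complement  = abs(dot(x, s)) <= tol        # x_i * s_i = 0 for every i
        return primal_feas && dual_feas && complement
    end

    # Tiny illustrative LP: min x1 + x2  s.t.  x1 + x2 = 1, x >= 0.
    A = [1.0 1.0]; b = [1.0]; c = [1.0, 1.0]
    @show lp_optimal(A, b, c, [1.0, 0.0], [1.0], [0.0, 0.0])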

Reading
Griva, Sofer, and Nash, chapter 4.
Nocedal and Wright, Chapter 13.
LP optimality
AlgOpt - Chapter 11 - what we call standard form is their equality form from 11.1.3
AlgOpt - 11.2.2
Julia
Lecture 10 (Linear programming) (html) (ipynb)
Notes
Lecture 10
Video
Lecture 10

Lecture 9

We introduced the steepest descent method and showed that it converges at a linear rate on a quadratic objective.
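
A short experiment illustrating the linear rate: steepest descent with exact line search on a quadratic, tracking the error in the Q-norm; the matrix and starting point are made up.

    using LinearAlgebra

    # Steepest descent with exact line search on f(x) = 0.5*x'Q*x (minimizer x* = 0).
    # The Q-norm error contracts by at least (kappa-1)/(kappa+1) per step,
    # where kappa is the condition number of Q.
    function steepest_descent_quadratic(Q, x0, iters)
        x = copy(x0)
        errs = Float64[]
        for _ in 1:iters
            g = Q*x
            alpha = dot(g, g) / dot(g, Q*g)   # exact minimizer along -g
            x -= alpha*g
            push!(errs, sqrt(dot(x, Q*x)))    # Q-norm error
        end
        return errs
    end

    Q = Diagonal([10.0, 1.0])                 # kappa = 10, bound (kappa-1)/(kappa+1) ≈ 0.82
    errs = steepest_descent_quadratic(Q, [1.0, 1.0], 20)
    @show errs[2:6] ./ errs[1:5]              # roughly constant ratios below the bound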

Reading
Griva, Sofer, Nash Section 14.4
Steepest Descent
Two Remarks on the Kantorovich Inequality, by P. Henrici.
Julia
Lecture 9 (Steepest Descent) (html) (ipynb)
Notes
Lecture 9
Video
Lecture 9

Lecture 8

We saw the necessary and sufficient conditions for optimization with linear inequality constraints. We also saw the definitions of a descent direction and an active set.

Reading
Griva, Sofer, Nash Section 14.4
Nocedal and Wright Chapter 2 (for descent directions)
Chapter 10, 11 in AlgOpt
Notes
Lecture 8
Video
Lecture 8

Lecture 7

We saw the necessary and sufficient conditions for linearly constrained optimization and how they give a system of augmented normal equations for constrained least squares. We also saw how to get an initial point for a null-space method.
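
A minimal sketch of one way to solve an equality-constrained least-squares problem through its augmented (KKT) system; the constraint data below is illustrative, not the example from class.

    using LinearAlgebra

    # Equality-constrained least squares:  min ||A*x - b||^2  s.t.  C*x = d.
    # The optimality conditions give the augmented system
    #   [A'A  C'] [x     ]   [A'b]
    #   [C    0 ] [lambda] = [d  ].
    function constrained_lsq(A, b, C, d)
        n, p = size(A, 2), size(C, 1)
        K = [A'A  C'; C  zeros(p, p)]
        rhs = [A'b; d]
        z = K \ rhs
        return z[1:n], z[n+1:end]            # solution x and multipliers lambda
    end

    # Illustrative data: fit x in R^3 while forcing its entries to sum to 1.
    A = randn(20, 3); b = randn(20)
    C = ones(1, 3); d = [1.0]
    x, lambda = constrained_lsq(A, b, C, d)
    @show sum(x) ≈ 1.0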

Reading
Griva Sofer Nash, Chapter 14.
Notes
Lecture 7
Julia
Lecture 7 (Constrained Least Squares) (html) (ipynb)
Video
Lecture 7

Lecture 6

We saw algorithms for constrained least squares problems as an intro to constrained optimization.

Reading
Björck Chapter 2
Björck Chapter 5
Notes
Lecture 6
Video
Lecture 6

Lecture 5

Multivariate Taylor expansions, gradients, Hessians, and multivariate optimality conditions.
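
A small check of the first- and second-order optimality conditions on a made-up quadratic objective (not the example from class): the gradient vanishes at the candidate point and the Hessian there is positive definite.

    using LinearAlgebra

    f(x) = (x[1] - 1)^2 + 3*(x[2] + 2)^2 + x[1]*x[2]     # example objective
    gradf(x) = [2*(x[1] - 1) + x[2], 6*(x[2] + 2) + x[1]]
    hessf(x) = [2.0 1.0; 1.0 6.0]                         # constant Hessian

    xstar = [2.0 1.0; 1.0 6.0] \ [2.0, -12.0]   # solve gradf(x) = 0
    @show norm(gradf(xstar))                     # ~0 up to rounding: first-order condition
    @show isposdef(hessf(xstar))                 # true: second-order sufficient condition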

Notes
Lecture 5
Reading
Section 1.6 in AlgOpt
Nocedal and Wright, Section 2.1; you should read Theorems 2.3 and 2.4 for yourself and review the multivariate generalizations.
No minimizer at the origin but a minimizer on all lines - an algebraic discussion of the example we saw in class
Discussion on symmetry of Hessian
Wikipedia on symmetric Hessian
Julia
Lecture 5 (Optimality Conditions) (html) (ipynb)
Video
Lecture 5

Lecture 4

We did an intro to 1d optimization.

Reading
Chapter 2 in Nocedal and Wright
Section 1.5 and 1.6 in AlgOpt
Notes
Lecture 4
Handouts
unconstrained-intro.pdf
Julia
Lecture 4 (Optim.jl) (html) (ipynb)
Lecture 4 (Flux.jl) (html) (ipynb)
Video
Lecture 4

Lecture 3

We covered a quick intro to optimization software.

Julia
Lecture 3 (Least Squares) (html) (jl) (ipynb)
Lecture 3 (Sudoku) (html) (jl) (ipynb)
Notes
Lecture 3
Video
Lecture 3

Lecture 2

We went over sequences and convergence. There are no in-class notes because the iPad pen failed; see the video.

Reading
Appendix A.2 in Nocedal and Wright
Julia
Lecture 2 (Rates) (html) (ipynb)
Lecture 2 (Limit Points) (html) (ipynb)
Video
Lecture 2

Lecture 1

We reviewed the syllabus and saw the xkcd raptor problem as a motivation for optimization.

Reading
Syllabus
Slides
Lecture 1
Julia
Lecture 1 (Raptor) (html) (ipynb)
Video
Lecture 1