$\newcommand{\eps}{\varepsilon} \newcommand{\kron}{\otimes} \DeclareMathOperator{\diag}{diag} \DeclareMathOperator{\trace}{trace} \DeclareMathOperator{\rank}{rank} \DeclareMathOperator*{\minimize}{minimize} \DeclareMathOperator*{\maximize}{maximize} \DeclareMathOperator{\subjectto}{subject to} \newcommand{\mat}[1]{\boldsymbol{#1}} \renewcommand{\vec}[1]{\boldsymbol{\mathrm{#1}}} \newcommand{\vecalt}[1]{\boldsymbol{#1}} \newcommand{\conj}[1]{\overline{#1}} \newcommand{\normof}[1]{\|#1\|} \newcommand{\onormof}[2]{\|#1\|_{#2}} \newcommand{\MIN}[2]{\begin{array}{ll} \minimize_{#1} & {#2} \end{array}} \newcommand{\MINone}[3]{\begin{array}{ll} \minimize_{#1} & {#2} \\ \subjectto & {#3} \end{array}} \newcommand{\MINthree}[5]{\begin{array}{ll} \minimize_{#1} & {#2} \\ \subjectto & {#3} \\ & {#4} \\ & {#5} \end{array}} \newcommand{\MAX}[2]{\begin{array}{ll} \maximize_{#1} & {#2} \end{array}} \newcommand{\MAXone}[3]{\begin{array}{ll} \maximize_{#1} & {#2} \\ \subjectto & {#3} \end{array}} \newcommand{\itr}[2]{#1^{(#2)}} \newcommand{\itn}[1]{^{(#1)}} \newcommand{\prob}{\mathbb{P}} \newcommand{\probof}[1]{\prob\left\{ #1 \right\}} \newcommand{\pmat}[1]{\begin{pmatrix} #1 \end{pmatrix}} \newcommand{\bmat}[1]{\begin{bmatrix} #1 \end{bmatrix}} \newcommand{\spmat}[1]{\left(\begin{smallmatrix} #1 \end{smallmatrix}\right)} \newcommand{\sbmat}[1]{\left[\begin{smallmatrix} #1 \end{smallmatrix}\right]} \newcommand{\RR}{\mathbb{R}} \newcommand{\CC}{\mathbb{C}} \newcommand{\eye}{\mat{I}} \newcommand{\mA}{\mat{A}} \newcommand{\mB}{\mat{B}} \newcommand{\mC}{\mat{C}} \newcommand{\mD}{\mat{D}} \newcommand{\mE}{\mat{E}} \newcommand{\mF}{\mat{F}} \newcommand{\mG}{\mat{G}} \newcommand{\mH}{\mat{H}} \newcommand{\mI}{\mat{I}} \newcommand{\mJ}{\mat{J}} \newcommand{\mK}{\mat{K}} \newcommand{\mL}{\mat{L}} \newcommand{\mM}{\mat{M}} \newcommand{\mN}{\mat{N}} \newcommand{\mO}{\mat{O}} \newcommand{\mP}{\mat{P}} \newcommand{\mQ}{\mat{Q}} \newcommand{\mR}{\mat{R}} \newcommand{\mS}{\mat{S}} 
\newcommand{\mT}{\mat{T}} \newcommand{\mU}{\mat{U}} \newcommand{\mV}{\mat{V}} \newcommand{\mW}{\mat{W}} \newcommand{\mX}{\mat{X}} \newcommand{\mY}{\mat{Y}} \newcommand{\mZ}{\mat{Z}} \newcommand{\mLambda}{\mat{\Lambda}} \newcommand{\mPbar}{\bar{\mP}} \newcommand{\ones}{\vec{e}} \newcommand{\va}{\vec{a}} \newcommand{\vb}{\vec{b}} \newcommand{\vc}{\vec{c}} \newcommand{\vd}{\vec{d}} \newcommand{\ve}{\vec{e}} \newcommand{\vf}{\vec{f}} \newcommand{\vg}{\vec{g}} \newcommand{\vh}{\vec{h}} \newcommand{\vi}{\vec{i}} \newcommand{\vj}{\vec{j}} \newcommand{\vk}{\vec{k}} \newcommand{\vl}{\vec{l}} \newcommand{\vm}{\vec{m}} \newcommand{\vn}{\vec{n}} \newcommand{\vo}{\vec{o}} \newcommand{\vp}{\vec{p}} \newcommand{\vq}{\vec{q}} \newcommand{\vr}{\vec{r}} \newcommand{\vs}{\vec{s}} \newcommand{\vt}{\vec{t}} \newcommand{\vu}{\vec{u}} \newcommand{\vv}{\vec{v}} \newcommand{\vw}{\vec{w}} \newcommand{\vx}{\vec{x}} \newcommand{\vy}{\vec{y}} \newcommand{\vz}{\vec{z}} \newcommand{\vpi}{\vecalt{\pi}} \newcommand{\vlambda}{\vecalt{\lambda}}$

# Computational Methods in Optimization

## References

Nonlinear optimization

- *Numerical Optimization* by Jorge Nocedal and Stephen J. Wright. Springer, 2006.
- *Primal-Dual Interior-Point Methods* by Stephen J. Wright. SIAM, 1997.
- *Linear and Nonlinear Optimization* by Igor Griva, Stephen G. Nash, and Ariela Sofer. SIAM, 2009.

Least squares

- *Numerical Methods for Least Squares Problems* by Åke Björck. SIAM, 1996.

Convex optimization

- *Convex Optimization* by Stephen Boyd and Lieven Vandenberghe. Cambridge University Press, 2004.

## Lecture 30

Stochastic and global optimization methods including simulated annealing and Markov chain Monte Carlo.

- Global optimization
- Julia: Lecture 30 (Global stochastic) (html) (ipynb)
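The course notebook is in Julia; as a language-agnostic illustration, here is a minimal simulated-annealing sketch in Python. The objective, step size, and geometric cooling schedule are all made-up choices, not the lecture's.

```python
import math
import random

def simulated_annealing(f, x0, step=0.5, T0=1.0, cooling=0.995, iters=5000, seed=0):
    """Minimize f by random perturbations, accepting uphill moves
    with probability exp(-increase / T) as the temperature T cools."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best_x, best_f = x, fx
    T = T0
    for _ in range(iters):
        y = x + step * rng.uniform(-1.0, 1.0)    # random neighbor
        fy = f(y)
        # always accept improvements; accept worse points with prob exp(-df/T)
        if fy <= fx or rng.random() < math.exp(-(fy - fx) / T):
            x, fx = y, fy
            if fx < best_f:
                best_x, best_f = x, fx
        T *= cooling                             # geometric cooling schedule
    return best_x, best_f

# A multimodal test function with global minimum 0 at x = 0 (made up).
f = lambda x: x * x + 2.0 * math.sin(5.0 * x) ** 2
xbest, fbest = simulated_annealing(f, x0=3.0)
```

The acceptance rule is what distinguishes this from a pure random search: early on (large T) the walk can climb out of local minima, while late in the schedule it behaves like a local descent.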

## Lecture 29

- Nocedal and Wright, Chapter 16
- Notes: Lecture 29
- Julia: Lecture 29 (Quadratic programming) (html) (ipynb)
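The simplest quadratic program, with only equality constraints, reduces to one symmetric linear solve of the KKT system (Nocedal and Wright, Chapter 16). A sketch on a made-up problem:

```python
import numpy as np

def eq_qp(G, c, A, b):
    """Solve min 0.5 x'Gx + c'x  s.t.  Ax = b by forming the KKT system
    [G A'; A 0] [x; lam] = [-c; b] and solving it directly."""
    n, m = G.shape[0], A.shape[0]
    K = np.block([[G, A.T], [A, np.zeros((m, m))]])
    rhs = np.concatenate([-c, b])
    sol = np.linalg.solve(K, rhs)
    return sol[:n], sol[n:]              # primal point and multipliers

G = np.eye(2)                            # min 0.5||x||^2  s.t.  x1 + x2 = 2
c = np.zeros(2)
A = np.array([[1.0, 1.0]])
b = np.array([2.0])
x, lam = eq_qp(G, c, A, b)               # closed-form answer: x = (1, 1)
```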

## Lecture 28

Overview of derivative-free optimization and nonlinear programming, including derivative-free methods (pattern search, Nelder-Mead) and penalty and augmented Lagrangian methods.

- Nocedal and Wright, Chapters 17 and 18
- Griva, Nash, and Sofer, Chapter 16
- Nonlinear programming slides
- Nocedal and Wright, Chapter 9
- Derivative-free optimization
- Julia: Lecture 28 (Derivative free) (html) (ipynb)
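A compass (coordinate) pattern search is about the simplest derivative-free method: poll the objective along each coordinate direction and shrink the step when nothing improves. A sketch with a hypothetical quadratic test function:

```python
def compass_search(f, x0, step=1.0, tol=1e-6, max_iter=10000):
    """Derivative-free compass search: try coordinate steps +/- step * e_i;
    move to the first improving point, otherwise halve the step size."""
    x = list(x0)
    fx = f(x)
    it = 0
    while step > tol and it < max_iter:
        it += 1
        improved = False
        for i in range(len(x)):
            for s in (+step, -step):
                y = x[:]                 # copy the current point
                y[i] += s
                fy = f(y)
                if fy < fx:              # accept the first improving poll point
                    x, fx = y, fy
                    improved = True
                    break
            if improved:
                break
        if not improved:
            step *= 0.5                  # shrink the pattern and poll again
    return x, fx

quad = lambda v: (v[0] - 1.0) ** 2 + (v[1] + 2.0) ** 2
xmin, fmin = compass_search(quad, [0.0, 0.0])
```

Only function values are used, which is the defining feature of the derivative-free methods in this lecture.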

## Lecture 27

Finish up trust-region methods and then dive into large-scale optimization.

Large-scale optimization: conjugate gradient, simple algorithms, scalable linear algebra, and limited-memory quasi-Newton methods.

- Nocedal and Wright, Chapter 5
- Nocedal and Wright, Chapter 7
- Griva, Nash, and Sofer, Chapter 13
- Large-scale optimization handout
- Notes: Lecture 27
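Large-scale methods lean on matrix-vector products rather than factorizations; the linear conjugate gradient method (Nocedal and Wright, Chapter 5) is the canonical example. A sketch on a random SPD system (the test data is made up):

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-10, max_iter=None):
    """Linear CG for Ax = b with A symmetric positive definite.
    Only products A @ p are needed, which is what makes the
    method attractive at large scale."""
    n = b.size
    x = np.zeros(n)
    r = b.copy()                    # residual b - Ax for x = 0
    p = r.copy()                    # first search direction
    rs = r @ r
    for _ in range(max_iter or n):
        Ap = A @ p
        alpha = rs / (p @ Ap)       # exact minimizer along p
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p   # A-conjugate update of the direction
        rs = rs_new
    return x

rng = np.random.default_rng(0)
M = rng.standard_normal((20, 20))
A = M @ M.T + 20 * np.eye(20)       # well-conditioned SPD test matrix
b = rng.standard_normal(20)
x = conjugate_gradient(A, b)
```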

## Lecture 26

Trust-region methods

- Nocedal and Wright, Chapter 4
- Trust-region methods
- Notes: Lecture 26
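One concrete building block of a trust-region iteration is the Cauchy point: the minimizer of the quadratic model along the steepest-descent direction inside the region (Nocedal and Wright, Eqs. 4.11-4.12). A small sketch:

```python
import numpy as np

def cauchy_point(g, B, Delta):
    """Cauchy point for the trust-region model m(p) = g'p + 0.5 p'Bp
    subject to ||p|| <= Delta: minimize the model along -g in the region."""
    gnorm = np.linalg.norm(g)
    if gnorm == 0.0:
        return np.zeros_like(g)
    gBg = g @ B @ g
    if gBg <= 0.0:
        tau = 1.0                               # model decreases along -g: go to the boundary
    else:
        tau = min(1.0, gnorm ** 3 / (Delta * gBg))
    return -tau * (Delta / gnorm) * g

g = np.array([1.0, 0.0])
B = np.eye(2)
p = cauchy_point(g, B, Delta=2.0)               # unconstrained minimizer -g is inside
```

Here the model minimizer along `-g` lies strictly inside the region, so the Cauchy point is exactly `(-1, 0)`; shrinking `Delta` below 1 would instead put the step on the boundary.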

## Lecture 25

Quasi-Newton methods

- Nocedal and Wright, Chapter 6
- Quasi-Newton methods
- Notes: Lecture 25
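A minimal sketch of BFGS with the inverse-Hessian update and plain Armijo backtracking (the quadratic test problem and all constants are illustrative, not from the lecture):

```python
import numpy as np

def bfgs_update(H, s, y):
    """BFGS update of the inverse-Hessian approximation H from the
    step s = x_new - x_old and gradient change y = g_new - g_old."""
    rho = 1.0 / (y @ s)
    I = np.eye(len(s))
    V = I - rho * np.outer(s, y)
    return V @ H @ V.T + rho * np.outer(s, s)

def bfgs_minimize(f, grad, x0, iters=50, tol=1e-8):
    """Plain BFGS with simple backtracking to sufficient decrease."""
    x = np.asarray(x0, dtype=float)
    H = np.eye(len(x))
    g = grad(x)
    for _ in range(iters):
        if np.linalg.norm(g) < tol:
            break
        p = -H @ g                          # quasi-Newton search direction
        t = 1.0
        while f(x + t * p) > f(x) + 1e-4 * t * (g @ p):
            t *= 0.5                        # backtrack to sufficient decrease
        s = t * p
        x_new = x + s
        g_new = grad(x_new)
        H = bfgs_update(H, s, g_new - g)
        x, g = x_new, g_new
    return x

# Convex quadratic test problem (made up); the exact minimizer is xstar.
Q = np.array([[3.0, 1.0], [1.0, 2.0]])
xstar = np.array([1.0, -1.0])
f = lambda v: 0.5 * (v - xstar) @ Q @ (v - xstar)
grad = lambda v: Q @ (v - xstar)
xmin = bfgs_minimize(f, grad, [0.0, 0.0])
```

On a strictly convex quadratic the curvature condition `y's > 0` holds automatically, so the update stays positive definite.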

## Lecture 24

Prerecorded videos on nonlinear equations

## Lecture 23

Lecture by your TA on the solutions to the Midterm

## Lecture 22

Guest lecture by Ahmed Sameh on modified Cholesky for Newton's method

## Lecture 21

Matrix calculus and finite differences

- Nocedal and Wright, Section 8.1
- http://www.psi.toronto.edu/~andrew/papers/matrix_calculus_for_learning.pdf (lots of examples, focused on machine learning)
- Peder Olsen's talk
- http://justindomke.wordpress.com/2011/03/24/matrix-calculus/
- https://people.cs.umass.edu/~domke/courses/sml/01background.pdf
- http://www.ee.ic.ac.uk/hp/staff/dmb/matrix/calculus.html
- http://mathoverflow.net/questions/5209/notions-of-matrix-differentiation
- http://www.cs.berkeley.edu/~pliang/cs281a/recitation-0924.pdf, now http://web.archive.org/web/20080830042052/http://www.cs.berkeley.edu/~pliang/cs281a/recitation-0924.pdf
- https://tminka.github.io/papers/matrix/minka-matrix.pdf
- http://ccrma.stanford.edu/~dattorro/matrixcalc.pdf
- http://www.atmos.washington.edu/~dennis/MatrixCalculus.pdf
- http://people.kyb.tuebingen.mpg.de/suvrit/work/mcal.pdf.gz, now https://web-beta.archive.org/web/20111222131039/http://people.kyb.tuebingen.mpg.de/suvrit/work/mcal.pdf.gz
- Notes: Lecture 21 (partial notes from Lecture 20...)
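Finite differences are the standard way to sanity-check a matrix-calculus derivation. A sketch that numerically verifies the gradient of a quadratic form (random test data):

```python
import numpy as np

def fd_gradient(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at x."""
    x = np.asarray(x, dtype=float)
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2.0 * h)
    return g

# Check a matrix-calculus identity numerically: for f(x) = 0.5 x'Ax - b'x,
# grad f(x) = 0.5 (A + A') x - b   (which is Ax - b when A is symmetric).
rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
b = rng.standard_normal(4)
x = rng.standard_normal(4)
f = lambda v: 0.5 * v @ A @ v - b @ v
g_fd = fd_gradient(f, x)
g_exact = 0.5 * (A + A.T) @ x - b
```

The central difference has O(h^2) truncation error, so agreement to about six digits is expected at h = 1e-6.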

## Lecture 20

Finish up interior-point methods (see the centering step in the code below). Also, start matrix calculus; see Lecture 21.

- Notes: Lecture 20

## Lecture 19

Review of the project and an intro to interior-point methods, including the central path, centering steps, and starting points (hopefully!)

- Nocedal and Wright, Chapter 14
- Julia: Lecture 19 (Interior point) (html) (ipynb)
- Notes: Lecture 19
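The central path and centering steps can be seen on a toy problem. A sketch, assuming the 1-d LP "minimize x subject to 0 <= x <= 1" with a log barrier; the barrier-parameter schedule is made up:

```python
def centering_step(mu, x, iters=50):
    """Damped Newton on the log-barrier function
        f_mu(x) = x - mu*log(x) - mu*log(1 - x)
    for the toy LP: minimize x subject to 0 <= x <= 1."""
    for _ in range(iters):
        d1 = 1.0 - mu / x + mu / (1.0 - x)        # f_mu'(x)
        d2 = mu / x ** 2 + mu / (1.0 - x) ** 2    # f_mu''(x) > 0
        dx = -d1 / d2                             # Newton step
        t = 1.0
        while not (0.0 < x + t * dx < 1.0):       # backtrack to stay strictly feasible
            t *= 0.5
        x += t * dx
    return x

# Path following: decrease mu, warm-starting each solve from the last one.
x, path = 0.5, []
for mu in (1.0, 0.1, 0.01, 0.001):
    x = centering_step(mu, x)
    path.append(x)
# The barrier minimizers x(mu) trace the central path toward the solution x* = 0.
```

Warm starting each centering solve from the previous minimizer is exactly why the barrier parameter is decreased gradually rather than set small at once.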

## Lecture 18

The midterm!

## Lecture 17

Review for the Midterm

- Notes: Lecture 17
- Slides: Lecture 17

## Lecture 16

We proved that backtracking line search converges.

- Convergence of line search
- Notes: Lecture 16
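The method whose convergence was proved can be sketched in a few lines; the ill-scaled quadratic test problem and the constants are illustrative, not course code:

```python
import numpy as np

def backtracking(f, x, fx, gx, p, alpha0=1.0, rho=0.5, c=1e-4):
    """Backtracking line search: shrink alpha geometrically until the
    sufficient-decrease (Armijo) condition
        f(x + alpha p) <= f(x) + c * alpha * g'p
    holds; p must be a descent direction (g'p < 0)."""
    alpha = alpha0
    slope = gx @ p
    while f(x + alpha * p) > fx + c * alpha * slope:
        alpha *= rho
    return alpha

# Steepest descent with backtracking on an ill-scaled quadratic (made up).
A = np.diag([1.0, 50.0])
f = lambda v: 0.5 * v @ A @ v
grad = lambda v: A @ v
x = np.array([1.0, 1.0])
for _ in range(300):
    gx = grad(x)
    alpha = backtracking(f, x, f(x), gx, -gx)
    x = x - alpha * gx
```

Because the accepted step always satisfies sufficient decrease, the iterates make guaranteed progress even though the step length adapts to the local curvature.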

## Lecture 15

Your TA did a review of homework questions.

## Lecture 14

We saw the Wolfe conditions for line search and proved that step lengths satisfying them always exist.

- Wolfe Conditions
- Nocedal and Wright, Lemma 3.1
- Nocedal and Wright, Chapters 2 and 3
- Griva, Nash, and Sofer, Chapter 11
- Julia: Lecture 14 (Line search) (html) (ipynb)
- Notes: Lecture 14
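A quick numerical check of the two conditions, on a made-up 1-d quadratic where the exact line minimizer satisfies both while a too-short step fails the curvature condition:

```python
import numpy as np

def wolfe_conditions(f, grad, x, p, alpha, c1=1e-4, c2=0.9):
    """Return (sufficient_decrease, curvature): the two Wolfe conditions
    for a step alpha along direction p from x (Nocedal and Wright, 3.6)."""
    fx, gx = f(x), grad(x)
    x_new = x + alpha * p
    armijo = f(x_new) <= fx + c1 * alpha * (gx @ p)
    curvature = grad(x_new) @ p >= c2 * (gx @ p)
    return armijo, curvature

# On f(x) = x^2 from x = 1 with p = -f'(1) = -2, the exact minimizer
# along p is at alpha = 0.5, which satisfies both conditions.
f = lambda v: float(v[0] ** 2)
grad = lambda v: np.array([2.0 * v[0]])
x, p = np.array([1.0]), np.array([-2.0])
ok_half = wolfe_conditions(f, grad, x, p, 0.5)    # exact line minimizer
ok_tiny = wolfe_conditions(f, grad, x, p, 1e-8)   # too short: fails curvature
```

The tiny step shows why the curvature condition is there at all: it rules out steps so short that the directional derivative has barely changed.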

## Lecture 13

We finished up the proof that the simplex method decreases the objective and then saw Phase 1 / Crash.

That was it for the simplex method! Then we went into unconstrained algorithms, starting with an intro to line search methods for optimization.

- Line search algorithms
- Unconstrained algorithms
- Nocedal and Wright, Chapters 2 and 3
- Griva, Nash, and Sofer, Chapter 11
- Notes: Lecture 13
- Line search and the Wolfe conditions
- Convergence of line search

## Lecture 12

We finished up the simplex method and went over how it works by moving from vertex to vertex.

- Griva, Nash, and Sofer, Chapter 4
- Nocedal and Wright, Chapter 13
- Simple Simplex Code
- Julia: Lecture 12 (Simplex) (html) (ipynb)
- Notes: Lecture 12

## Lecture 11

We proved the fundamental theorem of linear programming, that if a solution exists, then there is a solution at a vertex of the feasible polytope.

- Griva, Nash, and Sofer, Chapter 4
- Nocedal and Wright, Chapter 13
- Polytope Geometry
- Julia: Lecture 11 (LP Feasible Polytopes) (html) (ipynb)
- Julia: Lecture 11 (LP solution via Enumeration) (html) (ipynb)
- Notes: Lecture 11
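The theorem licenses a brute-force algorithm in the spirit of the "LP solution via Enumeration" notebook: check every basic feasible solution. A Python sketch on a made-up problem in standard form:

```python
import itertools
import numpy as np

def lp_by_enumeration(c, A, b):
    """Brute-force solve for min c'x s.t. Ax = b, x >= 0: by the
    fundamental theorem of LP, an optimal solution (if one exists)
    lies at a vertex, i.e. a basic feasible solution."""
    m, n = A.shape
    best_x, best_val = None, np.inf
    for cols in itertools.combinations(range(n), m):
        B = A[:, cols]
        if abs(np.linalg.det(B)) < 1e-12:
            continue                        # columns do not form a basis
        xB = np.linalg.solve(B, b)
        if np.any(xB < -1e-9):
            continue                        # basic solution is infeasible
        x = np.zeros(n)
        x[list(cols)] = xB
        val = c @ x
        if val < best_val:
            best_x, best_val = x, val
    return best_x, best_val

# Toy problem: min -x1 - x2  s.t.  x1 + 2 x2 <= 4, x1 <= 3, x >= 0,
# written in standard form with slack variables s1, s2.
c = np.array([-1.0, -1.0, 0.0, 0.0])
A = np.array([[1.0, 2.0, 1.0, 0.0],
              [1.0, 0.0, 0.0, 1.0]])
b = np.array([4.0, 3.0])
x, val = lp_by_enumeration(c, A, b)
```

Enumeration is exponential in the problem size, which is exactly why the simplex method's guided walk from vertex to vertex (next lectures) matters.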

## Lecture 10

We saw optimality conditions for linear programs and the dual.

- Griva, Nash, and Sofer, Chapter 4
- Nocedal and Wright, Chapter 13
- LP optimality
- Julia: Lecture 10 (Linear programming) (html) (ipynb)
- Notes: Lecture 10

## Lecture 9

We introduced the steepest descent method and showed that it converges at a linear rate on a quadratic objective.

- Griva, Nash, and Sofer, Section 14.4
- Steepest Descent
- "Two Remarks on the Kantorovich Inequality" by P. Henrici
- Julia: Lecture 9 (Steepest Descent) (html) (ipynb)
- Notes: Lecture 9
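The linear rate can be observed numerically: on a quadratic with condition number kappa, the A-norm error contracts by at most (kappa - 1)/(kappa + 1) per step (via the Kantorovich inequality), and the classic worst-case starting point attains it. A sketch with a made-up test matrix:

```python
import numpy as np

def steepest_descent_quadratic(A, b, x0, iters):
    """Steepest descent with exact line search on f(x) = 0.5 x'Ax - b'x,
    recording the A-norm of the error to expose the per-step linear
    contraction (kappa - 1)/(kappa + 1)."""
    xstar = np.linalg.solve(A, b)
    errs = []
    x = x0.astype(float)
    for _ in range(iters):
        g = A @ x - b
        alpha = (g @ g) / (g @ A @ g)     # exact minimizer along -g
        x = x - alpha * g
        e = x - xstar
        errs.append(np.sqrt(e @ A @ e))   # A-norm of the error
    return x, errs

A = np.diag([1.0, 10.0])                  # kappa = 10
b = np.zeros(2)
x, errs = steepest_descent_quadratic(A, b, np.array([10.0, 1.0]), 50)
rate = errs[1] / errs[0]                  # observed contraction: 9/11
```

The start (10, 1) = (kappa, 1) produces the textbook zig-zag, so the observed ratio equals the worst-case bound (10 - 1)/(10 + 1) exactly.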

## Lecture 8

We saw the necessary and sufficient conditions for linear inequality-constrained optimization. We also saw the definitions of a descent direction and an active set.

- Griva, Nash, and Sofer, Section 14.4
- Nocedal and Wright, Chapter 2 (for descent directions)
- Notes: Lecture 8

## Lecture 7

We saw the necessary and sufficient conditions for linearly constrained optimization and how they give a system of augmented normal equations for constrained least squares. We also saw how to get an initial point for a null-space method.

- Griva, Nash, and Sofer, Chapter 14
- Notes: Lecture 7
- Julia: Lecture 7 (Constrained Least Squares) (html) (ipynb)
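The augmented normal equations can be sketched directly. Here is an illustration on a hypothetical line fit forced to pass through a given point (the data is made up; for ill-conditioned problems a null-space method, as in the lecture, is preferable):

```python
import numpy as np

def constrained_lsq(A, b, C, d):
    """Equality-constrained least squares, min ||Ax - b||_2 s.t. Cx = d,
    via the augmented (KKT) system
        [A'A  C'] [x  ]   [A'b]
        [C    0 ] [lam] = [d  ]."""
    n, p = A.shape[1], C.shape[0]
    K = np.block([[A.T @ A, C.T], [C, np.zeros((p, p))]])
    rhs = np.concatenate([A.T @ b, d])
    sol = np.linalg.solve(K, rhs)
    return sol[:n], sol[n:]

# Fit y ~ m*t + q while forcing the line through (0, 1), i.e. q = 1.
t = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 2.9, 5.2, 7.1])
A = np.column_stack([t, np.ones_like(t)])   # unknowns x = (m, q)
C = np.array([[0.0, 1.0]])                  # constraint: q = 1
d = np.array([1.0])
x, lam = constrained_lsq(A, y, C, d)
```

Solving the augmented system enforces the constraint exactly while minimizing the residual over the remaining freedom.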

## Lecture 6

We saw algorithms for constrained least squares problems as an intro to constrained optimization.

- Björck, Chapter 2
- Björck, Chapter 5
- Notes: Lecture 6

## Lecture 5

Multivariate Taylor expansions, gradients, Hessians, multivariate optimality conditions, and convexity.

- Notes: Lecture 5
- Nocedal and Wright, Section 2.1; read Theorem 2.5 for yourself and review the multivariate generalizations.
- Julia: Lecture 5 (Optimality Conditions) (html) (ipynb)

## Lecture 4

We did an intro to 1d optimization.

- Nocedal and Wright, Chapter 2
- Notes: Lecture 4
- Handout: unconstrained-intro.pdf
- Julia: Lecture 4 (Optim.jl) (html) (ipynb)
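A classic 1-d method in this spirit is golden-section search, which needs only a unimodal function on a bracket. A sketch (the test function is made up):

```python
import math

def golden_section(f, a, b, tol=1e-8):
    """Golden-section search for the minimizer of a unimodal f on [a, b]:
    keep two interior points in the golden ratio so each iteration reuses
    one function value and shrinks the bracket by a fixed factor ~0.618."""
    invphi = (math.sqrt(5.0) - 1.0) / 2.0        # 1/phi
    c, d = b - invphi * (b - a), a + invphi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:
            b, d, fd = d, c, fc                  # minimizer is in [a, d]
            c = b - invphi * (b - a)
            fc = f(c)
        else:
            a, c, fc = c, d, fd                  # minimizer is in [c, b]
            d = a + invphi * (b - a)
            fd = f(d)
    return 0.5 * (a + b)

xmin = golden_section(lambda x: (x - 2.0) * x, 0.0, 5.0)   # minimizer at x = 1
```

Because the bracket shrinks by a fixed ratio with one new evaluation per step, the method converges linearly with no derivative information at all.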

## Lecture 3

We covered a quick intro to optimization software.

- Julia: Lecture-3-convex_least_squares.jl
- Julia: Lecture-3-sudoku_gurobi.jl
- Notes: Lecture 3

## Lecture 2

We went over sequences and convergence.

- Nocedal and Wright, Appendix B
- Julia: Lecture 2 (Rates) (html) (ipynb)
- Notes: Lecture 2
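The difference between linear and quadratic rates can be checked on synthetic error sequences (the sequences here are made up to match the definitions in Nocedal and Wright, Appendix B):

```python
# A linearly convergent sequence gains a constant number of digits per step,
# while a quadratically convergent one roughly doubles its digits each step.
linear = [0.5 ** k for k in range(1, 8)]              # e_{k+1} = 0.5 * e_k
quadratic = [0.5 ** (2 ** k) for k in range(1, 5)]    # e_{k+1} = e_k ** 2

# The defining ratios: e_{k+1}/e_k for linear, e_{k+1}/e_k^2 for quadratic.
linear_ratios = [e2 / e1 for e1, e2 in zip(linear, linear[1:])]
quadratic_ratios = [e2 / e1 ** 2 for e1, e2 in zip(quadratic, quadratic[1:])]
```

For the linear sequence the first ratio is constant (here 0.5); for the quadratic one the second ratio is constant, which is why Newton-type methods seem to "stall then snap" to the answer.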

## Lecture 1

We reviewed the syllabus and saw the xkcd raptor problem as motivation for optimization.