# Computational methods in optimization

### Lawson B134

$\newcommand{\eps}{\varepsilon} \newcommand{\kron}{\otimes} \DeclareMathOperator{\diag}{diag} \DeclareMathOperator{\trace}{trace} \DeclareMathOperator{\rank}{rank} \DeclareMathOperator{\minimize}{minimize} \DeclareMathOperator{\subjectto}{subject to} \newcommand{\mat}[1]{\boldsymbol{#1}} \renewcommand{\vec}[1]{\boldsymbol{\mathrm{#1}}} \newcommand{\vecalt}[1]{\boldsymbol{#1}} \newcommand{\conj}[1]{\overline{#1}} \newcommand{\normof}[1]{\|#1\|} \newcommand{\onormof}[2]{\|#1\|_{#2}} \newcommand{\MIN}[2]{\begin{array}{ll} \minimize_{#1} & {#2} \end{array}} \newcommand{\MINone}[3]{\begin{array}{ll} \minimize_{#1} & {#2} \\ \subjectto & {#3} \end{array}} \newcommand{\OPTone}{\MINone} \newcommand{\itr}[2]{#1^{(#2)}} \newcommand{\itn}[1]{^{(#1)}} \newcommand{\prob}{\mathbb{P}} \newcommand{\probof}[1]{\prob\left\{ #1 \right\}} \newcommand{\pmat}[1]{\begin{pmatrix} #1 \end{pmatrix}} \newcommand{\bmat}[1]{\begin{bmatrix} #1 \end{bmatrix}} \newcommand{\spmat}[1]{\left(\begin{smallmatrix} #1 \end{smallmatrix}\right)} \newcommand{\sbmat}[1]{\left[\begin{smallmatrix} #1 \end{smallmatrix}\right]} \newcommand{\RR}{\mathbb{R}} \newcommand{\CC}{\mathbb{C}} \newcommand{\eye}{\mat{I}} \newcommand{\mA}{\mat{A}} \newcommand{\mB}{\mat{B}} \newcommand{\mC}{\mat{C}} \newcommand{\mD}{\mat{D}} \newcommand{\mE}{\mat{E}} \newcommand{\mF}{\mat{F}} \newcommand{\mG}{\mat{G}} \newcommand{\mH}{\mat{H}} \newcommand{\mI}{\mat{I}} \newcommand{\mJ}{\mat{J}} \newcommand{\mK}{\mat{K}} \newcommand{\mL}{\mat{L}} \newcommand{\mM}{\mat{M}} \newcommand{\mN}{\mat{N}} \newcommand{\mO}{\mat{O}} \newcommand{\mP}{\mat{P}} \newcommand{\mQ}{\mat{Q}} \newcommand{\mR}{\mat{R}} \newcommand{\mS}{\mat{S}} \newcommand{\mT}{\mat{T}} \newcommand{\mU}{\mat{U}} \newcommand{\mV}{\mat{V}} \newcommand{\mW}{\mat{W}} \newcommand{\mX}{\mat{X}} \newcommand{\mY}{\mat{Y}} \newcommand{\mZ}{\mat{Z}} \newcommand{\mLambda}{\mat{\Lambda}} \newcommand{\mPbar}{\bar{\mP}} \newcommand{\ones}{\vec{e}} \newcommand{\va}{\vec{a}} \newcommand{\vb}{\vec{b}} \newcommand{\vc}{\vec{c}} \newcommand{\vd}{\vec{d}} \newcommand{\ve}{\vec{e}} \newcommand{\vf}{\vec{f}} \newcommand{\vg}{\vec{g}} \newcommand{\vh}{\vec{h}} \newcommand{\vi}{\vec{i}} \newcommand{\vj}{\vec{j}} \newcommand{\vk}{\vec{k}} \newcommand{\vl}{\vec{l}} \newcommand{\vm}{\vec{l}} \newcommand{\vn}{\vec{n}} \newcommand{\vo}{\vec{o}} \newcommand{\vp}{\vec{p}} \newcommand{\vq}{\vec{q}} \newcommand{\vr}{\vec{r}} \newcommand{\vs}{\vec{s}} \newcommand{\vt}{\vec{t}} \newcommand{\vu}{\vec{u}} \newcommand{\vv}{\vec{v}} \newcommand{\vw}{\vec{w}} \newcommand{\vx}{\vec{x}} \newcommand{\vy}{\vec{y}} \newcommand{\vz}{\vec{z}} \newcommand{\vpi}{\vecalt{\pi}}$

## References

Nonlinear optimization
Numerical Optimization by Jorge Nocedal and Stephen J. Wright. Springer, 2006.
Primal-Dual Interior-Point Methods by Stephen J. Wright. SIAM, 1997.
Least squares
Numerical Methods for Least Squares Problems by Åke Björck. SIAM, 1996.
Convex optimization
Convex Optimization by Stephen Boyd and Lieven Vandenberghe. Cambridge University Press, 2004.

## Lecture 28

Stochastic methods in optimization and alternating optimization (block coordinate descent); a small block coordinate descent sketch follows the links below.

http://astrobites.com/2012/02/20/code-you-can-use-the-mcmc-hammer/ A description of how to improve MCMC sampling.
http://vimeo.com/22616409 A video of the Metropolis-Hastings algorithm.
http://math.sjtu.edu.cn/faculty/zw2109/paper/RBRBookCh.pdf A nice survey of the history of recent results.
Globally convergent block-coordinate techniques for unconstrained optimization by Grippo and Sciandrone.
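
As a concrete illustration of alternating optimization, here is a minimal two-block coordinate descent sketch on a strictly convex quadratic (my own toy example, not one of the course codes); each block update is an exact minimization over one block with the other held fixed.

```matlab
% Two-block coordinate descent on the strictly convex quadratic
%   f(z) = 0.5*z'*A*z - b'*z,  whose minimizer solves A*z = b.
% Each sweep minimizes exactly over one block with the other fixed.
A = [4 1 0 0; 1 3 1 0; 0 1 3 1; 0 0 1 5];   % symmetric positive definite
b = [1; 2; 3; 4];
xi = 1:2;  yi = 3:4;                        % the two blocks of variables
z = zeros(4,1);
for sweep = 1:50
    z(xi) = A(xi,xi) \ (b(xi) - A(xi,yi)*z(yi));   % minimize over block 1
    z(yi) = A(yi,yi) \ (b(yi) - A(yi,xi)*z(xi));   % minimize over block 2
end
fprintf('residual after 50 sweeps: %g\n', norm(A*z - b));
```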

## Lecture 27

Derivative-free optimization: pattern search and Nelder-Mead.

Matlab
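
MATLAB's built-in fminsearch implements a Nelder-Mead simplex method; here is a small sketch of using it (the Rosenbrock test function is my own choice, not necessarily the example used in class).

```matlab
% Derivative-free minimization with Nelder-Mead via MATLAB's fminsearch,
% applied to the Rosenbrock function as a test problem.
rosen = @(x) 100*(x(2) - x(1)^2)^2 + (1 - x(1))^2;
x0 = [-1.2; 1];
opts = optimset('Display', 'iter', 'TolX', 1e-8, 'TolFun', 1e-8);
[xstar, fstar] = fminsearch(rosen, x0, opts);
fprintf('x* = (%g, %g), f(x*) = %g\n', xstar(1), xstar(2), fstar);
```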

## Lecture 26

Large scale optimization.

Nocedal and Wright Chapter 7
Sparse matrix methods in optimization by Gill, Murray, Saunders, and Wright

## Lecture 25

Penalty and augmented Lagrangian methods

Nocedal and Wright Chapter 17
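
A minimal quadratic-penalty sketch (my own illustration; the toy problem and the use of fminsearch for the subproblems are assumptions, not course material): solve a sequence of unconstrained problems f(x) + (mu/2) c(x)^2 with increasing mu.

```matlab
% Quadratic penalty method sketch for
%   minimize f(x)  subject to  c(x) = 0.
% Each penalty subproblem is solved (crudely) with fminsearch,
% warm-started from the previous iterate.
f = @(x) x(1) + x(2);                    % objective (toy example)
c = @(x) x(1)^2 + x(2)^2 - 2;            % single equality constraint
x = [2; 2];  mu = 1;
for k = 1:8
    Q = @(x) f(x) + 0.5*mu*c(x)^2;       % penalty subproblem
    x = fminsearch(Q, x);
    mu = 10*mu;                          % tighten the penalty
end
fprintf('x = (%g, %g),  c(x) = %g\n', x(1), x(2), c(x));
% The iterates approach the constrained minimizer (-1, -1) as mu grows.
```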

## Lecture 24

Discussion of Active Set methods and projected gradients.

Active set methods for quadratic optimization

Nocedal and Wright Chapter 16
Matlab
matlab/qp_activeset.m
matlab/qp_activeset_step.m
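
The course codes above implement an active-set method; as a companion, here is a minimal projected-gradient sketch (my own illustration) for a box-constrained QP, where projection onto the box is just componentwise clipping.

```matlab
% Projected gradient sketch for the box-constrained QP
%   minimize 0.5*x'*G*x + c'*x   subject to  l <= x <= u.
G = [6 2 1; 2 5 2; 1 2 4];           % symmetric positive definite (made-up data)
c = [-8; -3; -3];
l = zeros(3,1);  u = ones(3,1);
x = zeros(3,1);
alpha = 1 / norm(G);                 % fixed step <= 1/L, L = largest eigenvalue of G
for k = 1:500
    g = G*x + c;                     % gradient of the quadratic
    x = min(max(x - alpha*g, l), u); % gradient step, then project onto the box
end
disp(x')
```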

## Lecture 23

Nocedal and Wright Chapter 16

## Lecture 22

The Dual LP and Interior Point methods

Matlab
matlab/ip_central.m
Nocedal and Wright Chapter 14
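
In the spirit of Chapter 14 (though not the course's ip_central.m), here is a bare-bones infeasible primal-dual path-following sketch for a tiny standard-form LP; the problem data are my own example, and the usual safeguards (Mehrotra corrector, adaptive centering) are omitted.

```matlab
% Bare-bones primal-dual path-following sketch for
%   minimize c'*x  subject to  A*x = b, x >= 0.
% Newton steps on the perturbed KKT conditions
%   A'*lam + s = c,   A*x = b,   x.*s = sigma*mu.
A = [1 1 1 0; 1 3 0 1];  b = [4; 6];  c = [-3; -5; 0; 0];   % made-up LP
[m, n] = size(A);
x = ones(n,1);  lam = zeros(m,1);  s = ones(n,1);    % strictly positive start
sigma = 0.2;                                         % centering parameter
for k = 1:30
    mu = x'*s / n;                                   % duality measure
    r_d = A'*lam + s - c;  r_p = A*x - b;  r_c = x.*s - sigma*mu;
    J = [zeros(n)  A'          eye(n);
         A         zeros(m,m)  zeros(m,n);
         diag(s)   zeros(n,m)  diag(x)];
    d = -J \ [r_d; r_p; r_c];
    dx = d(1:n);  dlam = d(n+1:n+m);  ds = d(n+m+1:end);
    % step length keeping x and s strictly positive
    alpha = 0.9 * min([1; -x(dx<0)./dx(dx<0); -s(ds<0)./ds(ds<0)]);
    x = x + alpha*dx;  lam = lam + alpha*dlam;  s = s + alpha*ds;
end
fprintf('c''*x = %g,  duality measure = %g\n', c'*x, x'*s/n);
```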

## Lecture 21

The Simplex Method

Nocedal and Wright Chapter 13
Matlab
matlab/simplex_step.m
matlab/simplex_iterates.m
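
For comparison with the codes above (this is my own sketch, not simplex_step.m), here is a revised simplex iteration for a standard-form LP, starting from a known feasible slack basis; anti-cycling and unboundedness checks are omitted.

```matlab
% Revised simplex sketch for  minimize c'*x  s.t.  A*x = b, x >= 0,
% starting from the feasible slack basis {3, 4} (made-up example).
A = [1 1 1 0; 1 3 0 1];  b = [4; 6];  c = [-3; -5; 0; 0];
basis = [3 4];
for iter = 1:20
    B  = A(:, basis);
    xB = B \ b;                              % values of the basic variables
    y  = B' \ c(basis);                      % simplex multipliers
    nonbasis = setdiff(1:size(A,2), basis);
    r = c(nonbasis) - A(:, nonbasis)'*y;     % reduced costs
    [rmin, j] = min(r);
    if rmin >= -1e-10, break; end            % all reduced costs >= 0: optimal
    q = nonbasis(j);                         % entering variable
    d = B \ A(:, q);                         % change in xB as x_q increases
    ratios = xB ./ d;  ratios(d <= 0) = inf; % ratio test
    [~, p] = min(ratios);
    basis(p) = q;                            % pivot: q enters, basis(p) leaves
end
x = zeros(size(A,2), 1);  x(basis) = A(:, basis) \ b;
fprintf('basis = [%s],  objective = %g\n', num2str(basis), c'*x);
```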

## Lecture 20

Guest lecture by Nelson Uhan based on http://www.optimization-online.org/DB_HTML/2011/07/3106.html

## Lecture 19

Geometry of LPs and the simplex method.

## Lecture 18

Constrained optimization theory 3 and linear programs

## Lecture 17

Constrained optimization theory 2

## Lecture 16

Constrained optimization theory 1

## Lecture 15

We talked a lot about the midterm. We also saw algorithms for solving nonlinear equations and compared them to algorithms for optimization.

Nocedal and Wright Chapter 11
Solving Nonlinear Equations with Newton’s Method by C. T. Kelley
Codes for Solving Nonlinear Equations with Newton’s Method by C. T. Kelley
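
A minimal Newton's-method sketch for a nonlinear system (my own toy example, not Kelley's codes): at each step, solve the Jacobian system J(x) dx = -F(x). The printed residuals show the quadratic local convergence that distinguishes this from typical first-order optimization methods.

```matlab
% Newton's method for the nonlinear system F(x) = 0 (toy example).
F = @(x) [x(1)^2 + x(2)^2 - 4;  x(1)*x(2) - 1];
J = @(x) [2*x(1)  2*x(2);  x(2)  x(1)];         % Jacobian of F
x = [2; 1];
for k = 1:20
    Fx = F(x);
    fprintf('iter %2d: ||F(x)|| = %.2e\n', k, norm(Fx));
    if norm(Fx) < 1e-12, break; end
    x = x - J(x) \ Fx;                          % solve J(x)*dx = -F(x)
end
```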

## Lecture 14

Here, we proved that Quasi-Newton methods are “globally convergent” for convex functions and developed some intuition for why they converge superlinearly.

Nocedal and Wright Chapter 6, Theorems 6.5, 6.6

## Lecture 13

This was a day of studying Hessian-free optimization. We began with a brief look at the nonlinear conjugate gradient method, and then discussed Quasi-Newton methods in more depth.

Nocedal and Wright Chapter 5, Chapter 6
A quick handout on Quasi-Newton methods
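
A compact BFGS sketch (my own illustration; the Rosenbrock test problem and the plain Armijo backtracking are assumptions, and a careful implementation would enforce the Wolfe curvature condition instead of merely skipping bad updates).

```matlab
% BFGS quasi-Newton sketch on the Rosenbrock function with Armijo backtracking.
% H approximates the *inverse* Hessian.
f = @(x) 100*(x(2)-x(1)^2)^2 + (1-x(1))^2;
g = @(x) [-400*x(1)*(x(2)-x(1)^2) - 2*(1-x(1)); 200*(x(2)-x(1)^2)];
x = [-1.2; 1];  H = eye(2);
for k = 1:200
    gx = g(x);
    if norm(gx) < 1e-8, break; end
    p = -H*gx;                                 % quasi-Newton direction
    a = 1;                                     % backtracking (Armijo) line search
    while f(x + a*p) > f(x) + 1e-4*a*(gx'*p)
        a = a/2;
    end
    s = a*p;  xnew = x + s;  y = g(xnew) - gx;
    if y'*s > 1e-10                            % skip the update if curvature is bad
        rho = 1/(y'*s);
        H = (eye(2) - rho*(s*y'))*H*(eye(2) - rho*(y*s')) + rho*(s*s');
    end
    x = xnew;
end
fprintf('%d iterations, x = (%g, %g)\n', k, x(1), x(2));
```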

## Lectures 11 and 12

These were combined into a single lecture with a short make-up class.

We saw the Cauchy point and how this can be used as a sufficient-decrease-like condition for a trust region method. We also saw how to approximate the solution of the trust region problem more quickly using the dog-leg and subspace minimization methods.

Nocedal and Wright Chapter 4
Matlab
matlab/trust_region_demo.m
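
A minimal dog-leg step (my own sketch, not the course trust_region_demo.m) for approximately solving the trust-region subproblem when the model Hessian B is positive definite. One might call `p = dogleg_step(g, B, Delta)` inside a trust-region loop and then accept or reject the step based on the ratio of actual to predicted reduction.

```matlab
% Dog-leg approximate solution of the trust-region subproblem
%   minimize  g'*p + 0.5*p'*B*p   subject to  ||p|| <= Delta,
% assuming B is symmetric positive definite.
function p = dogleg_step(g, B, Delta)
    pB = -B \ g;                          % full quasi-Newton step
    if norm(pB) <= Delta, p = pB; return; end
    pU = -((g'*g)/(g'*B*g)) * g;          % unconstrained minimizer along -g
    if norm(pU) >= Delta
        p = -(Delta/norm(g)) * g;         % steepest descent to the boundary
        return;
    end
    % follow the dog-leg segment pU + tau*(pB - pU) out to the boundary:
    % solve ||pU + tau*(pB - pU)||^2 = Delta^2 for tau in [0, 1]
    d = pB - pU;
    a = d'*d;  b = 2*(pU'*d);  c = pU'*pU - Delta^2;
    tau = (-b + sqrt(b^2 - 4*a*c)) / (2*a);
    p = pU + tau*d;
end
```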

## Lecture 10

Basic trust region methods, with some simple demos of how they can run into problems.

Nocedal and Wright Chapter 4
A quick handout on trust regions
Matlab
matlab/solve_trust_region_subproblem.m
matlab/trust_region_demo.m
matlab/trust_region_1.m

## Lecture 9

We first studied the Wolfe conditions and how they are useful in global convergence results, and proved that points satisfying them exist. Then we saw a proof that a line search method is globally convergent.

Nocedal and Wright Chapter 3
Proof of global convergence of line search
Matlab
matlab/fig31fun.m
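
As a companion to the existence proof, here is a bisection-style sketch (my own, not fig31fun.m) that searches for a step length satisfying the weak Wolfe conditions for a descent direction; for example, one might call `a = wolfe_search(f, g, x, -g(x), 1e-4, 0.9)` for a steepest descent step.

```matlab
% Find a step length 'a' satisfying the weak Wolfe conditions
%   f(x + a*p) <= f(x) + c1*a*g(x)'*p      (sufficient decrease)
%   g(x + a*p)'*p >= c2*g(x)'*p            (curvature)
% for a descent direction p, with 0 < c1 < c2 < 1.
function a = wolfe_search(f, g, x, p, c1, c2)
    a = 1;  lo = 0;  hi = inf;
    f0 = f(x);  d0 = g(x)'*p;              % d0 < 0 for a descent direction
    for k = 1:50
        if f(x + a*p) > f0 + c1*a*d0       % sufficient decrease fails: shrink
            hi = a;  a = (lo + hi)/2;
        elseif g(x + a*p)'*p < c2*d0       % curvature fails: grow
            lo = a;
            if isinf(hi), a = 2*a; else, a = (lo + hi)/2; end
        else
            return;                        % both conditions hold
        end
    end
end
```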

## Lecture 8

In this lecture, we saw a quick proof that steepest descent with exact line search is a slow algorithm. We also introduced descent directions, and saw what we’ll need in a line search method.

http://www.math.ucdavis.edu/~hctseng/files/_previous/MAT258/ch02.pdf
Nocedal and Wright Section 3.1, strong Wolfe conditions

## Lecture 9

We saw a proof that a point satisfying the strong Wolfe conditions always exists. We also proved that a generic descent method using backtracking line search globally converges to a point where the gradient vanishes.

Nocedal and Wright Chapter 3
Convergence of backtracking line search methods
Matlab
matlab/fig31fun.m

## Lecture 8

We saw a proof that Steepest Descent converges at a linear rate on a quadratic objective.

We began introducing the strong Wolfe conditions.

http://hpcrd.lbl.gov/~meza/papers/Steepest%20Descent.pdf
Nocedal and Wright Chapter 3
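
A small demo sketch (my own) of the linear rate on a convex quadratic: with exact line search, the per-iteration reduction is governed by the classical ((kappa-1)/(kappa+1))^2 factor, where kappa = cond(A).

```matlab
% Steepest descent with exact line search on f(x) = 0.5*x'*A*x.
A = diag([1, 10]);                         % condition number kappa = 10
x = [10; 1];
f = @(x) 0.5*x'*A*x;
for k = 1:20
    g = A*x;
    alpha = (g'*g)/(g'*A*g);               % exact minimizer along -g
    x = x - alpha*g;
    fprintf('iter %2d: f = %8.3e\n', k, f(x));
end
kappa = cond(A);
fprintf('worst-case reduction factor: %g\n', ((kappa-1)/(kappa+1))^2);
```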

## Lecture 7

In this lecture we saw an introduction to Newton methods and learned about the need for line search.

http://hpcrd.lbl.gov/~meza/papers/Steepest%20Descent.pdf
Nocedal and Wright Chapter 2, Chapter 3
Matlab
lecture_simple_algorithms.m
newtons_method_1.m

## Lecture 6

This lecture covered using finite differences to verify the computation of a gradient, along with a brief introduction to matrix calculus: how to compute gradients, Hessians, and Jacobians of vector- and matrix-valued functions.

http://justindomke.wordpress.com/2011/03/24/matrix-calculus/
http://people.rit.edu/jcdicsa/courses/SML/01background.pdf
http://www.ee.ic.ac.uk/hp/staff/dmb/matrix/calculus.html
http://mathoverflow.net/questions/5209/notions-of-matrix-differentiation
http://www.cs.berkeley.edu/~pliang/cs281a/recitation-0924.pdf
http://research.microsoft.com/en-us/um/people/minka/papers/matrix/minka-matrix.pdf
http://ccrma.stanford.edu/~dattorro/matrixcalc.pdf
http://www.atmos.washington.edu/~dennis/MatrixCalculus.pdf
http://people.kyb.tuebingen.mpg.de/suvrit/work/mcal.pdf.gz
Matlab
lecture_matrix_calculus.m
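
A minimal finite-difference gradient check (my own sketch, not lecture_matrix_calculus.m): compare a claimed analytic gradient against central differences at a random point.

```matlab
% Central-difference check of an analytic gradient for a least-squares objective.
A = randn(10, 4);  b = randn(10, 1);       % random test data
f = @(x) 0.5*norm(A*x - b)^2;              % objective
g = @(x) A'*(A*x - b);                     % claimed analytic gradient
x = randn(4, 1);  h = 1e-6;
gfd = zeros(4, 1);
for i = 1:4
    e = zeros(4, 1);  e(i) = 1;
    gfd(i) = (f(x + h*e) - f(x - h*e)) / (2*h);
end
fprintf('relative error: %g\n', norm(gfd - g(x)) / norm(g(x)));
% A relative error around 1e-8 or smaller is consistent with a correct gradient.
```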

## Lecture 5

We saw algorithms for least squares problems, and also theory and algorithms for constrained least squares problems.

Matlab
conls.m
conls_lag.m
Sports Example with Constraint
Björck Chapter 2
Björck Chapter 5
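
A minimal equality-constrained least squares sketch (my own illustration, distinct from conls.m): write down the Lagrangian stationarity (KKT) system and solve it directly.

```matlab
% Equality-constrained least squares:
%   minimize ||A*x - b||^2   subject to   C*x = d.
% Stationarity: 2*A'*A*x + C'*lambda = 2*A'*b, together with C*x = d.
A = randn(8, 4);  b = randn(8, 1);        % random test data
C = [1 1 1 1];    d = 1;                  % single constraint: sum(x) = 1
K   = [2*(A'*A)  C';  C  zeros(size(C,1))];
rhs = [2*(A'*b); d];
z = K \ rhs;
x = z(1:4);  lambda = z(5:end);           % primal solution and multiplier
fprintf('constraint residual: %g\n', norm(C*x - d));
```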

## Lecture 4

We saw least squares problems from an optimization point of view, including the conditions under which a problem has a unique solution.

Nocedal and Wright Section 10.2
Björck Chapter 1
Matlab
Sports Example

## Lecture 3

We saw how to formulate a Sudoku puzzle as a binary LP.

We covered the fundamentals of unconstrained optimization and saw univariate Taylor series, first- and second-order necessary conditions for univariate functions, and gradients and Hessians.

Nocedal and Wright Section 2.1; you should read Theorem 2.5 for yourself and review the multivariate generalizations.
Matlab
Sudoku example
sudoku_gurobi.m

## Lecture 2

We saw sequences and rates of convergence, as well as a few demos of optimization software.

Nocedal and Wright Appendix A.2
Matlab
convergence.m
CVX Example
Poblano Example
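
A tiny sketch (my own, not convergence.m) for estimating the order of convergence from an error sequence, using the model |e_{k+1}| ≈ C |e_k|^p.

```matlab
% Estimate the order p from consecutive errors:
%   p ~ log(e_{k+1}/e_k) / log(e_k/e_{k-1}).
ek_lin  = 0.5.^(1:10);                   % linearly convergent sequence
ek_quad = 0.5.^(2.^(1:6));               % quadratically convergent sequence
rate = @(e) log(e(3:end)./e(2:end-1)) ./ log(e(2:end-1)./e(1:end-2));
disp(rate(ek_lin))                       % estimates approach 1
disp(rate(ek_quad))                      % estimates approach 2
```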

## Lecture 1

We reviewed the syllabus and saw the xkcd raptor problem as motivation for optimization.

See the lecture slides.