$\newcommand{\eps}{\varepsilon} \newcommand{\kron}{\otimes} \DeclareMathOperator{\diag}{diag} \DeclareMathOperator{\trace}{trace} \DeclareMathOperator{\rank}{rank} \DeclareMathOperator{\minimize}{minimize} \DeclareMathOperator{\subjectto}{subject to} \newcommand{\mat}[1]{\boldsymbol{#1}} \renewcommand{\vec}[1]{\boldsymbol{\mathrm{#1}}} \newcommand{\vecalt}[1]{\boldsymbol{#1}} \newcommand{\conj}[1]{\overline{#1}} \newcommand{\normof}[1]{\|#1\|} \newcommand{\onormof}[2]{\|#1\|_{#2}} \newcommand{\MIN}[2]{\begin{array}{ll} \minimize_{#1} & {#2} \end{array}} \newcommand{\MINone}[3]{\begin{array}{ll} \minimize_{#1} & {#2} \\ \subjectto & {#3} \end{array}} \newcommand{\OPTone}{\MINone} \newcommand{\itr}[2]{#1^{(#2)}} \newcommand{\itn}[1]{^{(#1)}} \newcommand{\prob}{\mathbb{P}} \newcommand{\probof}[1]{\prob\left\{ #1 \right\}} \newcommand{\pmat}[1]{\begin{pmatrix} #1 \end{pmatrix}} \newcommand{\bmat}[1]{\begin{bmatrix} #1 \end{bmatrix}} \newcommand{\spmat}[1]{\left(\begin{smallmatrix} #1 \end{smallmatrix}\right)} \newcommand{\sbmat}[1]{\left[\begin{smallmatrix} #1 \end{smallmatrix}\right]} \newcommand{\RR}{\mathbb{R}} \newcommand{\CC}{\mathbb{C}} \newcommand{\eye}{\mat{I}} \newcommand{\mA}{\mat{A}} \newcommand{\mB}{\mat{B}} \newcommand{\mC}{\mat{C}} \newcommand{\mD}{\mat{D}} \newcommand{\mE}{\mat{E}} \newcommand{\mF}{\mat{F}} \newcommand{\mG}{\mat{G}} \newcommand{\mH}{\mat{H}} \newcommand{\mI}{\mat{I}} \newcommand{\mJ}{\mat{J}} \newcommand{\mK}{\mat{K}} \newcommand{\mL}{\mat{L}} \newcommand{\mM}{\mat{M}} \newcommand{\mN}{\mat{N}} \newcommand{\mO}{\mat{O}} \newcommand{\mP}{\mat{P}} \newcommand{\mQ}{\mat{Q}} \newcommand{\mR}{\mat{R}} \newcommand{\mS}{\mat{S}} \newcommand{\mT}{\mat{T}} \newcommand{\mU}{\mat{U}} \newcommand{\mV}{\mat{V}} \newcommand{\mW}{\mat{W}} \newcommand{\mX}{\mat{X}} \newcommand{\mY}{\mat{Y}} \newcommand{\mZ}{\mat{Z}} \newcommand{\mLambda}{\mat{\Lambda}} \newcommand{\mPbar}{\bar{\mP}} \newcommand{\ones}{\vec{e}} \newcommand{\va}{\vec{a}} 
\newcommand{\vb}{\vec{b}} \newcommand{\vc}{\vec{c}} \newcommand{\vd}{\vec{d}} \newcommand{\ve}{\vec{e}} \newcommand{\vf}{\vec{f}} \newcommand{\vg}{\vec{g}} \newcommand{\vh}{\vec{h}} \newcommand{\vi}{\vec{i}} \newcommand{\vj}{\vec{j}} \newcommand{\vk}{\vec{k}} \newcommand{\vl}{\vec{l}} \newcommand{\vm}{\vec{m}} \newcommand{\vn}{\vec{n}} \newcommand{\vo}{\vec{o}} \newcommand{\vp}{\vec{p}} \newcommand{\vq}{\vec{q}} \newcommand{\vr}{\vec{r}} \newcommand{\vs}{\vec{s}} \newcommand{\vt}{\vec{t}} \newcommand{\vu}{\vec{u}} \newcommand{\vv}{\vec{v}} \newcommand{\vw}{\vec{w}} \newcommand{\vx}{\vec{x}} \newcommand{\vy}{\vec{y}} \newcommand{\vz}{\vec{z}} \newcommand{\vpi}{\vecalt{\pi}}$

# Homework 6

Please answer the following questions in complete sentences in a typed manuscript and submit the solution to me in class on February 28th, 2012.

## Problem 0: List your collaborators.

Please identify anyone, whether or not they are in the class, with whom you discussed your homework. This problem is worth 1 point, but on a multiplicative scale. (Note that collaboration is not allowed on the bonus question below.)

## Problem 1: Implement a Quasi-Newton method

1. Nocedal and Wright state the BFGS update to the inverse Hessian $\mH_k$ as

   $$\mH_{k+1} = (\eye - \rho_k \vs_k \vy_k^T) \mH_k (\eye - \rho_k \vy_k \vs_k^T) + \rho_k \vs_k \vs_k^T, \qquad \rho_k = \frac{1}{\vy_k^T \vs_k},$$

   where $\vs_k = \vx_{k+1} - \vx_k$ and $\vy_k = \nabla f(\vx_{k+1}) - \nabla f(\vx_k)$. Describe the amount of work required for this update.
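For reference while thinking about the cost, one way to organize the update (this grouping of the terms is my own, assuming the inverse-Hessian estimate is stored as a dense symmetric matrix `H`) is:

```matlab
% One BFGS update of the inverse-Hessian estimate H (a sketch; the
% variable names s, y, H are mine, not from the assignment).
% s = x_{k+1} - x_k and y = grad f(x_{k+1}) - grad f(x_k).
rho = 1/(y'*s);
Hy  = H*y;
% Grouped so each term is a matrix-vector product or an outer product;
% this is algebraically the same as (I - rho*s*y')*H*(I - rho*y*s') + rho*(s*s').
H = H - rho*(Hy*s' + s*Hy') + (rho + rho^2*(y'*Hy))*(s*s');
```

Counting the operations in each line of this grouping is one way to approach the question.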

2. Create a new function called `quasi_newton_bfgs.m`, based on `newtons_method_1.m`, that implements a quasi-Newton method with the BFGS update to the inverse Hessian; use the Strong Wolfe line-search routine from the Poblano toolbox. Make sure your function has an appropriate help description (the comments at the top of the file) and a reasonable set of parameters. There is a good chance you'll use this code later in the course. This function should record:

   - the number of function evaluations in iteration $i$
   - the time taken from the start until the end of iteration $i$
   - the value of the function at iteration $i$
   - the infinity norm of the gradient at iteration $i$
   - the size of the step at iteration $i$
3. Modify the `gradient_descent_1.m` code to use the same line-search routine.

4. I've provided a list of problems in `matlab/testfuncs.zip`. Look at the file `driver_eval_methods.m`; it shows how to loop over all the test cases. For test case `ii`, `function_min_values(ii)` is the exact minimum value, and the code creates `f` and `g` to evaluate the function and gradient at a point; `x0` is the starting value. Use `matlab/fcombine.m` to join `f` and `g` into a single function for use in your routines above.

Show how many function evaluations, iterations, and seconds are required to find a point where the infinity norm of the gradient is smaller than $10^{-5}$ for each problem.
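The overall shape of the requested routine can be sketched as follows. The function name is from the assignment, but everything inside is my own sketch: the interface of `fg`, the history struct, and especially the line-search call, which is only a placeholder for Poblano's strong-Wolfe routine (with `alpha = 1` as a stand-in, this sketch will not converge on hard problems).

```matlab
% Sketch of the BFGS quasi-Newton loop for quasi_newton_bfgs.m.
% fg evaluates [fx, gx] = fg(x); hist records the quantities the
% assignment asks for.
function [x, hist] = quasi_newton_bfgs(fg, x0, tol, maxiter)
  x = x0; n = length(x0);
  H = eye(n);                       % initial inverse-Hessian estimate
  [fx, gx] = fg(x);
  t0 = tic; nfev = 1;
  for i = 1:maxiter
    p = -H*gx;                      % quasi-Newton search direction
    % [alpha, nls] = ...            % placeholder: Poblano strong-Wolfe search
    alpha = 1; nls = 0;             % stand-in so the sketch is self-contained
    s = alpha*p;
    xnew = x + s;
    [fnew, gnew] = fg(xnew);
    nfev = nfev + nls + 1;
    y = gnew - gx;
    if y'*s > 0                     % curvature condition keeps H positive definite
      rho = 1/(y'*s);
      Hy  = H*y;
      H = H - rho*(Hy*s' + s*Hy') + (rho + rho^2*(y'*Hy))*(s*s');
    end
    x = xnew; fx = fnew; gx = gnew;
    hist.nfev(i)  = nfev;           % function evaluations through iteration i
    hist.time(i)  = toc(t0);        % elapsed time through iteration i
    hist.f(i)     = fx;             % function value at iteration i
    hist.gnorm(i) = norm(gx, inf);  % infinity norm of the gradient
    hist.step(i)  = norm(s);        % size of the step
    if hist.gnorm(i) < tol, break; end
  end
end
```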

## Problem 2: A little theory

According to the book, the SR1 update and the BFGS update are both special cases of the Broyden class of quasi-Newton updates (pages 150-151). Show that this is the case.
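As a starting point (this restates the family in the book's notation, so double-check it against pages 150-151), the Broyden class is the one-parameter family of updates

$$\mB_{k+1} = \mB_k - \frac{\mB_k \vs_k \vs_k^T \mB_k}{\vs_k^T \mB_k \vs_k} + \frac{\vy_k \vy_k^T}{\vy_k^T \vs_k} + \phi_k \left(\vs_k^T \mB_k \vs_k\right) \vv_k \vv_k^T, \qquad \vv_k = \frac{\vy_k}{\vy_k^T \vs_k} - \frac{\mB_k \vs_k}{\vs_k^T \mB_k \vs_k},$$

so the task amounts to exhibiting, for each update, a choice of the scalar $\phi_k$ that reproduces it ($\phi_k = 0$ gives BFGS and $\phi_k = 1$ gives DFP).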

## Problem 3: Comparing optimization codes.

BONUS. We are getting close to the midterm; this is a bonus question worth 15 points of extra credit.

- You may not collaborate on this question.
- You must complete all parts of this question.

We’ve discussed optimization algorithms frequently in this class. So far, we have seen:

- Steepest Descent with a line search
- Newton's Method with a line search
- Hessian-based Trust Region methods
- Non-linear CG methods with a line search
- DFP Quasi-Newton method with a line search
- BFGS Quasi-Newton method with a line search
- SR1 Quasi-Newton method with a line search

Given two methods, how should we compare them?

1. Please fill in the blank in the following statement, keeping in mind the requirements on this bonus question.

“I –fill this in– this problem entirely by myself and did not discuss with anyone.”

2. Read the following paper: http://www.tops-scidac.org/papers/DM_benchmark.pdf

3. Implement a function yourself to compute and display a performance profile. I'll post the best solution to the class web page as a reference.

4. Show a performance profile of the number of function evaluations for the steepest descent and quasi-Newton methods computed in Problem 1.
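One possible shape for such a function is sketched below. The function name, interface, and plotting choices are my own, not part of the assignment files; the ratio-based construction follows the performance-profile idea in the paper above, where `T(p,s)` is the cost (here, function evaluations) of solver `s` on problem `p`, with `Inf` marking failures.

```matlab
% Sketch of a performance-profile plot (my own function name and
% interface).  T is an np-by-ns matrix of costs; labels names the solvers.
function perf_profile(T, labels)
  [np, ns] = size(T);
  R = T ./ repmat(min(T, [], 2), 1, ns);  % ratio to the best solver per problem
  taus = unique(R(isfinite(R)));          % sorted ratio values to evaluate at
  for s = 1:ns
    % fraction of problems solved within a factor tau of the best solver
    rho = arrayfun(@(t) sum(R(:,s) <= t)/np, taus);
    semilogx(taus, rho); hold on;
  end
  xlabel('\tau'); ylabel('fraction of problems solved');
  legend(labels, 'Location', 'SouthEast'); hold off;
end
```

A staircase plot (e.g. MATLAB's `stairs` on a log axis) may look closer to the figures in the paper, but the step function computed here is the same.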