# Homework 3

$\newcommand{\eps}{\varepsilon} \newcommand{\kron}{\otimes} \DeclareMathOperator{\diag}{diag} \DeclareMathOperator{\trace}{trace} \DeclareMathOperator{\rank}{rank} \DeclareMathOperator{\minimize}{minimize} \DeclareMathOperator{\subjectto}{subject to} \newcommand{\mat}[1]{\boldsymbol{#1}} \renewcommand{\vec}[1]{\boldsymbol{\mathrm{#1}}} \newcommand{\vecalt}[1]{\boldsymbol{#1}} \newcommand{\conj}[1]{\overline{#1}} \newcommand{\normof}[1]{\|#1\|} \newcommand{\onormof}[2]{\|#1\|_{#2}} \newcommand{\MIN}[2]{\begin{array}{ll} \minimize_{#1} & {#2} \end{array}} \newcommand{\MINone}[3]{\begin{array}{ll} \minimize_{#1} & {#2} \\ \subjectto & {#3} \end{array}} \newcommand{\OPTone}{\MINone} \newcommand{\itr}[2]{#1^{(#2)}} \newcommand{\itn}[1]{^{(#1)}} \newcommand{\prob}{\mathbb{P}} \newcommand{\probof}[1]{\prob\left\{ #1 \right\}} \newcommand{\pmat}[1]{\begin{pmatrix} #1 \end{pmatrix}} \newcommand{\bmat}[1]{\begin{bmatrix} #1 \end{bmatrix}} \newcommand{\spmat}[1]{\left(\begin{smallmatrix} #1 \end{smallmatrix}\right)} \newcommand{\sbmat}[1]{\left[\begin{smallmatrix} #1 \end{smallmatrix}\right]} \newcommand{\RR}{\mathbb{R}} \newcommand{\CC}{\mathbb{C}} \newcommand{\eye}{\mat{I}} \newcommand{\mA}{\mat{A}} \newcommand{\mB}{\mat{B}} \newcommand{\mC}{\mat{C}} \newcommand{\mD}{\mat{D}} \newcommand{\mE}{\mat{E}} \newcommand{\mF}{\mat{F}} \newcommand{\mG}{\mat{G}} \newcommand{\mH}{\mat{H}} \newcommand{\mI}{\mat{I}} \newcommand{\mJ}{\mat{J}} \newcommand{\mK}{\mat{K}} \newcommand{\mL}{\mat{L}} \newcommand{\mM}{\mat{M}} \newcommand{\mN}{\mat{N}} \newcommand{\mO}{\mat{O}} \newcommand{\mP}{\mat{P}} \newcommand{\mQ}{\mat{Q}} \newcommand{\mR}{\mat{R}} \newcommand{\mS}{\mat{S}} \newcommand{\mT}{\mat{T}} \newcommand{\mU}{\mat{U}} \newcommand{\mV}{\mat{V}} \newcommand{\mW}{\mat{W}} \newcommand{\mX}{\mat{X}} \newcommand{\mY}{\mat{Y}} \newcommand{\mZ}{\mat{Z}} \newcommand{\mLambda}{\mat{\Lambda}} \newcommand{\mPbar}{\bar{\mP}} \newcommand{\ones}{\vec{e}} \newcommand{\va}{\vec{a}} 
\newcommand{\vb}{\vec{b}} \newcommand{\vc}{\vec{c}} \newcommand{\vd}{\vec{d}} \newcommand{\ve}{\vec{e}} \newcommand{\vf}{\vec{f}} \newcommand{\vg}{\vec{g}} \newcommand{\vh}{\vec{h}} \newcommand{\vi}{\vec{i}} \newcommand{\vj}{\vec{j}} \newcommand{\vk}{\vec{k}} \newcommand{\vl}{\vec{l}} \newcommand{\vm}{\vec{m}} \newcommand{\vn}{\vec{n}} \newcommand{\vo}{\vec{o}} \newcommand{\vp}{\vec{p}} \newcommand{\vq}{\vec{q}} \newcommand{\vr}{\vec{r}} \newcommand{\vs}{\vec{s}} \newcommand{\vt}{\vec{t}} \newcommand{\vu}{\vec{u}} \newcommand{\vv}{\vec{v}} \newcommand{\vw}{\vec{w}} \newcommand{\vx}{\vec{x}} \newcommand{\vy}{\vec{y}} \newcommand{\vz}{\vec{z}} \newcommand{\vpi}{\vecalt{\pi}}$

Please answer the following questions in complete sentences in a typed manuscript and submit your solutions to me in class on February 7th, 2012.

## Problem 1: Checking Hessians and matrix calculus

In class, we mentioned that it is vital to double-check gradient calculations, and we used the `gradientcheck.m` function to do so. Of course, if we want to use Newton's method to solve a problem, then we need to check the Hessian matrix as well.

1. Implement `hessiancheck.m` to verify that the Hessian is correct. You may reuse the function `gradientcheck.m`.

2. Determine and verify the Hessian of $f(\vx) = e^{-\vx^T \mA \vx}$ where $\mA$ is not necessarily symmetric or positive definite. Show the result of `hessiancheck` for a set of test matrices $\mA$ and test points $\vx$. Be adversarial in your choices. Analyze any erratic behavior you find.

3. Determine and verify the Hessian of $f(\vx) = \ve^T \cos(\vx)$. Show the result of `hessiancheck` for a set of test points $\vx$. Be adversarial in your choices. Analyze any erratic behavior you find.

4. Determine and verify the Hessian of $f(\bmat{\vx \\ \vy}) = \trace( \bmat{ \vx & \vy } \bmat{ \vx^T \\ \vy^T } )$. Show the result of `hessiancheck` for a set of test points $\vx$ and $\vy$. Be adversarial in your choices. Analyze any erratic behavior you find.
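For part 2, one route through the matrix calculus, stated here as a reference to check your own derivation against (verify it independently before trusting it): write $\mS = \mA + \mA^T$, so that $\nabla (\vx^T \mA \vx) = \mS \vx$, and apply the chain rule to obtain

$$\nabla f(\vx) = -e^{-\vx^T \mA \vx}\, \mS \vx, \qquad \nabla^2 f(\vx) = e^{-\vx^T \mA \vx} \left( \mS \vx \vx^T \mS - \mS \right).$$

Note that the Hessian is symmetric even when $\mA$ is not, since only $\mS$ appears.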
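One common strategy for the checker in part 1 is to compare the analytic Hessian against a centered finite difference of the analytic gradient, column by column. Here is a minimal sketch of that idea in Python/NumPy (the assignment itself asks for MATLAB); the function name, step size, and tolerance below are illustrative choices, not part of the assignment.

```python
import numpy as np

def hessian_check(grad, hess, x, h=1e-5, tol=1e-4):
    """Compare an analytic Hessian against a centered finite-difference
    approximation built from the analytic gradient (illustrative sketch)."""
    n = x.size
    H = hess(x)
    H_fd = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = h
        # column j of the Hessian is approximately the centered
        # difference of the gradient along the j-th coordinate
        H_fd[:, j] = (grad(x + e) - grad(x - e)) / (2 * h)
    err = np.max(np.abs(H - H_fd))
    return err, err <= tol

# sanity test on f(x) = 1/2 x'Qx, whose gradient is Qx and Hessian is Q
Q = np.array([[2.0, 1.0], [1.0, 3.0]])
x0 = np.array([0.3, -0.7])
err, ok = hessian_check(lambda x: Q @ x, lambda x: Q, x0)
```

Because the gradient of a quadratic is linear, the centered difference is exact up to rounding in this sanity test, so the reported error should sit near machine precision.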

## Problem 2: Accuracy of finite differences

In class, your professor mentioned that using a step size of the square root of machine precision ($\sqrt{2^{-52}} \approx 10^{-8}$) was a reasonable compromise when computing finite differences. The tension is between the error in the function evaluation and the error in the finite difference approximation.

1. For the function $f(\vx) = \frac{1}{2} \vx^T \vx$, show (i.e., use MATLAB to compute) how the error in a forward finite difference approximation of the gradient varies for $\vx \in \RR^{10}$ as the step length changes from $h \approx 2^{-52}$ to $h \approx 2^{-8}$. What do you observe?

2. Does this change if you switch to a centered finite difference formula?
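Although the assignment asks for MATLAB, the experiment can be sketched as follows in Python/NumPy; the helper name and the particular sweep over powers of two are illustrative choices. Since the exact gradient of $f(\vx) = \frac{1}{2}\vx^T\vx$ is $\vx$, the finite-difference error is directly measurable.

```python
import numpy as np

def fd_errors(x):
    """Max error of forward and centered difference gradients of
    f(x) = 1/2 x'x (exact gradient: x) over a sweep of step sizes h."""
    f = lambda z: 0.5 * z @ z
    n = x.size
    rows = []
    for k in range(52, 7, -4):        # h = 2^-52, 2^-48, ..., 2^-8
        h = 2.0 ** (-k)
        fwd = np.empty(n)
        ctr = np.empty(n)
        for j in range(n):
            e = np.zeros(n)
            e[j] = h
            fwd[j] = (f(x + e) - f(x)) / h            # forward difference
            ctr[j] = (f(x + e) - f(x - e)) / (2 * h)  # centered difference
        rows.append((h, np.max(np.abs(fwd - x)), np.max(np.abs(ctr - x))))
    return rows

rng = np.random.default_rng(0)
table = fd_errors(rng.standard_normal(10))
```

For this quadratic the forward-difference truncation error along coordinate $j$ is exactly $h/2$, so the error shrinks with $h$ until rounding error in the function evaluations takes over; the centered formula has no truncation error at all for a quadratic, which makes its rounding floor, and its eventual breakdown at tiny $h$, easy to see in the table.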

## Problem 3: Steepest descent

(Nocedal and Wright, Exercise 3.6) Let's conclude with a quick problem to show that steepest descent can converge very rapidly! Consider the steepest descent method with exact line search for the function $f(\vx) = (1/2) \vx^T \mQ \vx - \vx^T \vb$, where $\mQ$ is symmetric positive definite. Suppose that we know $\vx_0 - \vx^*$ is parallel to an eigenvector of $\mQ$, where $\vx^*$ is the minimizer. Show that the method will converge in a single iteration.
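One possible starting observation, which your write-up would still need to justify and complete: since $\nabla f(\vx) = \mQ \vx - \vb$ and the minimizer satisfies $\mQ \vx^* = \vb$, if $\vx_0 - \vx^*$ is parallel to an eigenvector of $\mQ$ with eigenvalue $\lambda$, then

$$\nabla f(\vx_0) = \mQ (\vx_0 - \vx^*) = \lambda (\vx_0 - \vx^*),$$

so the negative gradient points directly at $\vx^*$, and it remains to check that the exact line search chooses the step length $1/\lambda$.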