\documentclass[]{article}
\usepackage{lmodern}
\usepackage{amssymb,amsmath}
\usepackage{ifxetex,ifluatex}
\usepackage{fixltx2e} % provides \textsubscript
\ifnum 0\ifxetex 1\fi\ifluatex 1\fi=0 % if pdftex
  \usepackage[T1]{fontenc}
  \usepackage[utf8]{inputenc}
\else % if luatex or xelatex
  \ifxetex
    \usepackage{mathspec}
  \else
    \usepackage{fontspec}
  \fi
  \defaultfontfeatures{Ligatures=TeX,Scale=MatchLowercase}
\fi
% use upquote if available, for straight quotes in verbatim environments
\IfFileExists{upquote.sty}{\usepackage{upquote}}{}
% use microtype if available
\IfFileExists{microtype.sty}{%
\usepackage{microtype}
\UseMicrotypeSet[protrusion]{basicmath} % disable protrusion for tt fonts
}{}
\usepackage{hyperref}
\hypersetup{unicode=true,
            pdftitle={Homework 7},
            pdfborder={0 0 0},
            breaklinks=true}
\urlstyle{same}  % don't use monospace font for urls
\IfFileExists{parskip.sty}{%
\usepackage{parskip}
}{% else
\setlength{\parindent}{0pt}
\setlength{\parskip}{6pt plus 2pt minus 1pt}
}
\setlength{\emergencystretch}{3em}  % prevent overfull lines
\providecommand{\tightlist}{%
  \setlength{\itemsep}{0pt}\setlength{\parskip}{0pt}}
\setcounter{secnumdepth}{0}
% Redefines (sub)paragraphs to behave more like sections
\ifx\paragraph\undefined\else
\let\oldparagraph\paragraph
\renewcommand{\paragraph}[1]{\oldparagraph{#1}\mbox{}}
\fi
\ifx\subparagraph\undefined\else
\let\oldsubparagraph\subparagraph
\renewcommand{\subparagraph}[1]{\oldsubparagraph{#1}\mbox{}}
\fi

\title{Homework 7}

\input{preamble.tex}



\begin{document}
\maketitle

Please answer the following questions in complete sentences in a clearly
prepared manuscript and submit the solution by the due date on
Gradescope.

Remember that this is a graduate class. There may be elements of the
problem statements that require you to fill in appropriate assumptions.
You are also responsible for determining what evidence to include. An
answer alone is rarely sufficient, but neither is an overly verbose
description required. Use your judgement to focus your discussion on the
most interesting pieces. The answer to ``should I include `something' in
my solution?'' will almost always be: Yes, if you think it helps support
your answer.

\subsection{Problem 0: Homework
checklist}\label{problem-0-homework-checklist}

\begin{itemize}
\item
  Please identify anyone, whether or not they are in the class, with
  whom you discussed your homework. This problem is worth 1 point, but
  on a multiplicative scale.
\item
  Make sure you have included your source-code and prepared your
  solution according to the most recent Piazza note on homework
  submissions.
\end{itemize}

\subsection{Problem 1}\label{problem-1}

(Adapted from Nocedal and Wright 3.1; any typos are my own.) Backtracking
line search is a simple method that attempts to balance sufficient
decrease with a long step length by testing a sequence of points:
\[ \alpha = 1, 1/2, 1/4, \ldots \] and accepting the first point that
satisfies sufficient decrease (the condition
\(L(\alpha) \le L(0) + c_1 \alpha L'(0)\) for the line search function
\(L(\alpha)\) we discussed in class).

\begin{enumerate}
\def\labelenumi{\arabic{enumi}.}
\item
  Implement a backtracking line search routine to select the step length
  and add this to our simple gradient descent code from the lectures.
  (Note, you are free to implement gradient descent using your own code
  too, adapting our code is only suggested for convenience.)
\item
  Prepare a plot of the step lengths (values of \(\alpha_k\) that were
  accepted) when we run gradient descent with this line search on the
  Rosenbrock function starting from the point \((1.2, 1.2)\) and also the
  point \((-1.2,1)\).
\item
  Implement Newton's method using line search as well. (Don't worry
  about singularity in the Hessian or any of the real issues that come
  up with Newton's method -- just use the search direction defined by
  \(\mH(\vx_k) \vp = -\vg_k\).)
\item
  Prepare a plot of the step lengths as in part 2.
\item
  Discuss any notable differences or similarities.
\item
  Prompt an LLM to implement a line search method that satisfies the
  Wolfe or strong Wolfe conditions (your choice). Test your line search
  method in the previous two scenarios. Does it work? How many times
  does it need to query your function \(f\) and gradient \(g\)?
\end{enumerate}
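For reference, the backtracking test described in the problem statement can be sketched in a few lines (a minimal illustration, not our class code; the function names and parameter values below are placeholders you would replace with your own):

```python
import numpy as np

def backtracking(f, fx, slope, x, p, c1=1e-4, alpha=1.0, maxiter=60):
    """Halve alpha = 1, 1/2, 1/4, ... until sufficient decrease holds:
    f(x + alpha*p) <= f(x) + c1*alpha*L'(0), where slope = L'(0) = g(x)' p."""
    for _ in range(maxiter):
        if f(x + alpha * p) <= fx + c1 * alpha * slope:
            return alpha
        alpha /= 2
    return alpha  # fell through; return the smallest alpha tried

# one gradient-descent step on the Rosenbrock function from (-1.2, 1)
def rosenbrock(x):
    return (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2

def rosenbrock_grad(x):
    return np.array([-2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0]**2),
                     200 * (x[1] - x[0]**2)])

x = np.array([-1.2, 1.0])
g = rosenbrock_grad(x)
p = -g                              # steepest-descent direction
alpha = backtracking(rosenbrock, rosenbrock(x), g @ p, x, p)
```

The accepted `alpha` at each iteration is exactly what parts 2 and 4 ask you to plot.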

\subsection{Problem 2}\label{problem-2}

(Exercise 4.2 in Kochenderfer and Wheeler) The first Wolfe condition
requires \(f(\vx + \alpha \vp) \le f(\vx) + c_1 \alpha \vp^T \vg\). What
is the maximum step length \(\alpha\) that satisfies this condition
given that
\(f(\vx) = 5 + x_1^2 + x_2^2, \vx = [-1,1]^T, \vp = [1,0]^T, c_1 = 10^{-4}\).
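One way to sanity-check a hand computation here is to test the condition numerically at candidate step lengths (a sketch; the function and vectors below come directly from the problem statement, while the grid is arbitrary):

```python
import numpy as np

def wolfe1(f, grad, x, p, alpha, c1=1e-4):
    """First Wolfe (sufficient decrease) condition at step length alpha."""
    return f(x + alpha * p) <= f(x) + c1 * alpha * (p @ grad(x))

f = lambda x: 5 + x[0]**2 + x[1]**2
grad = lambda x: np.array([2 * x[0], 2 * x[1]])
x = np.array([-1.0, 1.0])
p = np.array([1.0, 0.0])

# scan a grid of step lengths; the largest accepted value brackets the answer
accepted = [a for a in np.linspace(0.01, 3.0, 300) if wolfe1(f, grad, x, p, a)]
```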

\subsection{Problem 3}\label{problem-3}

(Modified from Griva, Sofer, and Nash 11.5.8.) Consider the function
\[ f(\vx) = -\vq^T \vx  + \sum_{j} y_j \log (\mC^T \vx)_j \] where
\(\mC\) is a very large matrix and computing \(\mC^T \vx\) is expensive.
Show how we can use the structure of the function to reduce the number
of matrix vector products \(\mC^T \vx\) that would be required for
checking the Wolfe conditions when searching in a direction \(\vp\).
(Hint, compute \(\vw = \mC^T \vp\) once, and see how to re-use it.)
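To make the hint concrete, here is one possible shape of the reuse (a sketch; it keeps `C` as a small dense array purely so the result can be checked, and all names are illustrative):

```python
import numpy as np

def line_functions(q, y, u, w, qx, qp):
    """Given u = C.T @ x and w = C.T @ p (one matvec each, computed once),
    return L(alpha) = f(x + alpha*p) and L'(alpha) with no further matvecs,
    using C.T @ (x + alpha*p) = u + alpha*w."""
    def L(alpha):
        return -(qx + alpha * qp) + y @ np.log(u + alpha * w)
    def dL(alpha):
        return -qp + y @ (w / (u + alpha * w))
    return L, dL

# tiny dense example, for checking only; in practice C is too large for this
rng = np.random.default_rng(0)
C = rng.random((5, 3)); q = rng.random(5); y = rng.random(3)
x = rng.random(5) + 0.1; p = rng.random(5)
u, w = C.T @ x, C.T @ p          # the only two matrix-vector products
L, dL = line_functions(q, y, u, w, q @ x, q @ p)
```

Every Wolfe-condition check at a new trial step length then costs only vector operations.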

\subsection{Problem 4}\label{problem-4}

Read section 3.5 in Nocedal and Wright on step-length selection. Then
solve problem 3.13 (typos are mine).

\begin{enumerate}
\def\labelenumi{\arabic{enumi}.}
\item
  In the notation we have from class where \(L(\alpha)\) is the function
  projected into the current search direction, then show that the
  quadratic function that interpolates \(L(0), L'(0),\) and
  \(L(\alpha_0)\) is given by (3.57), i.e.~
  \[ \phi_q(\alpha) = \left( \frac{L(\alpha_0) - L(0) - \alpha_0 L'(0)}{\alpha_0^2}\right) \alpha^2 + L'(0) \alpha + L(0). \]
\item
  Let \(\alpha_1\) be the minimizer of this quadratic. Suppose that the
  sufficient decrease condition is not satisfied at \(\alpha_0\). Then
  show that \(\alpha_1 < \frac{\alpha_0}{2(1-c_1)}.\)
\end{enumerate}
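As a quick self-check for part 1 (not a substitute for the derivation), the quadratic can be verified against the three interpolation conditions directly:
\[ \phi_q(0) = L(0), \qquad \phi_q'(0) = L'(0), \qquad
   \phi_q(\alpha_0) = \bigl(L(\alpha_0) - L(0) - \alpha_0 L'(0)\bigr) + L'(0)\,\alpha_0 + L(0) = L(\alpha_0). \]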

\subsection{Optional Problem 5}\label{optional-problem-5}

Ask an LLM to adapt our proof from class to use the Wolfe or strong
Wolfe conditions instead of backtracking line search. Closely analyze
the resulting proof. Is it correct? Ask the LLM to develop
visualizations that help explain the proof, in the style of a
3blue1brown video.
Does it succeed?

\end{document}
