Homework 2

$\newcommand{\eps}{\varepsilon} \newcommand{\kron}{\otimes} \DeclareMathOperator{\diag}{diag} \DeclareMathOperator{\trace}{trace} \DeclareMathOperator{\rank}{rank} \DeclareMathOperator{\minimize}{minimize} \DeclareMathOperator{\subjectto}{subject to} \newcommand{\mat}[1]{\boldsymbol{#1}} \renewcommand{\vec}[1]{\boldsymbol{\mathrm{#1}}} \newcommand{\vecalt}[1]{\boldsymbol{#1}} \newcommand{\conj}[1]{\overline{#1}} \newcommand{\normof}[1]{\|#1\|} \newcommand{\onormof}[2]{\|#1\|_{#2}} \newcommand{\MIN}[2]{\begin{array}{ll} \minimize_{#1} & {#2} \end{array}} \newcommand{\MINone}[3]{\begin{array}{ll} \minimize_{#1} & {#2} \\ \subjectto & {#3} \end{array}} \newcommand{\OPTone}{\MINone} \newcommand{\itr}[2]{#1^{(#2)}} \newcommand{\itn}[1]{^{(#1)}} \newcommand{\prob}{\mathbb{P}} \newcommand{\probof}[1]{\prob\left\{ #1 \right\}} \newcommand{\pmat}[1]{\begin{pmatrix} #1 \end{pmatrix}} \newcommand{\bmat}[1]{\begin{bmatrix} #1 \end{bmatrix}} \newcommand{\spmat}[1]{\left(\begin{smallmatrix} #1 \end{smallmatrix}\right)} \newcommand{\sbmat}[1]{\left[\begin{smallmatrix} #1 \end{smallmatrix}\right]} \newcommand{\RR}{\mathbb{R}} \newcommand{\CC}{\mathbb{C}} \newcommand{\eye}{\mat{I}} \newcommand{\mA}{\mat{A}} \newcommand{\mB}{\mat{B}} \newcommand{\mC}{\mat{C}} \newcommand{\mD}{\mat{D}} \newcommand{\mE}{\mat{E}} \newcommand{\mF}{\mat{F}} \newcommand{\mG}{\mat{G}} \newcommand{\mH}{\mat{H}} \newcommand{\mI}{\mat{I}} \newcommand{\mJ}{\mat{J}} \newcommand{\mK}{\mat{K}} \newcommand{\mL}{\mat{L}} \newcommand{\mM}{\mat{M}} \newcommand{\mN}{\mat{N}} \newcommand{\mO}{\mat{O}} \newcommand{\mP}{\mat{P}} \newcommand{\mQ}{\mat{Q}} \newcommand{\mR}{\mat{R}} \newcommand{\mS}{\mat{S}} \newcommand{\mT}{\mat{T}} \newcommand{\mU}{\mat{U}} \newcommand{\mV}{\mat{V}} \newcommand{\mW}{\mat{W}} \newcommand{\mX}{\mat{X}} \newcommand{\mY}{\mat{Y}} \newcommand{\mZ}{\mat{Z}} \newcommand{\mLambda}{\mat{\Lambda}} \newcommand{\mPbar}{\bar{\mP}} \newcommand{\ones}{\vec{e}} \newcommand{\va}{\vec{a}} 
\newcommand{\vb}{\vec{b}} \newcommand{\vc}{\vec{c}} \newcommand{\vd}{\vec{d}} \newcommand{\ve}{\vec{e}} \newcommand{\vf}{\vec{f}} \newcommand{\vg}{\vec{g}} \newcommand{\vh}{\vec{h}} \newcommand{\vi}{\vec{i}} \newcommand{\vj}{\vec{j}} \newcommand{\vk}{\vec{k}} \newcommand{\vl}{\vec{l}} \newcommand{\vm}{\vec{m}} \newcommand{\vn}{\vec{n}} \newcommand{\vo}{\vec{o}} \newcommand{\vp}{\vec{p}} \newcommand{\vq}{\vec{q}} \newcommand{\vr}{\vec{r}} \newcommand{\vs}{\vec{s}} \newcommand{\vt}{\vec{t}} \newcommand{\vu}{\vec{u}} \newcommand{\vv}{\vec{v}} \newcommand{\vw}{\vec{w}} \newcommand{\vx}{\vec{x}} \newcommand{\vy}{\vec{y}} \newcommand{\vz}{\vec{z}} \newcommand{\vpi}{\vecalt{\pi}}$


Please answer the following questions in complete sentences in a typed manuscript and submit the solution to me in class on January 31st, 2012.

In this homework, we probe a bit of the theory around the topics we've covered in class so far. The goal is not to assign tedious work, but to establish useful results that we'll rely on later. For instance, the first problem tackles subspaces and null-spaces, which will arise frequently in our discussion of constrained optimization. The second problem asks you to use the tools from class to analyze a variation on least squares that arises in many applications. The third problem deals with the convexity of a subspace in relation to the constrained least squares problem. Finally, the fourth problem shows a different way of looking at a least squares problem.

Problem 1: Constrained least squares theory

Let $\mA \in \RR^{m \times n}$ and $\mB \in \RR^{p \times n}$. Consider the constrained least squares problem:

$$\MINone{\vx}{\normof{\vb - \mA \vx}^2}{\mB \vx = \vd}$$

1. Show that $\rank(\bmat{\mA \\ \mB}) = n$ is equivalent to the condition that the intersection of the null-spaces of $\mA$ and $\mB$ contains only the trivial vector $\vx = 0$.

2. Show that the problem may not have any solution if $\rank(\mB) < p$.

3. Show that the problem always has a unique solution, for any $\vd$, when $\rank(\mB) = p$ and $\rank(\bmat{\mA \\ \mB}) = n$.

4. Show that if the problem has a unique solution for any $\vd$, then $\rank(\bmat{\mA \\ \mB}) = n$. (You can combine this with the proof of the previous part if you wish.)
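Part 1 can be sanity-checked numerically before you prove it (an illustration, not a proof): the intersection of the null-spaces of $\mA$ and $\mB$ is exactly the null-space of the stacked matrix $\bmat{\mA \\ \mB}$, so it is trivial precisely when the stacked matrix has rank $n$. The sketch below uses NumPy with hypothetical random data, and then forces a shared null vector $\vz$ to watch the rank drop.

```python
# Numerical sanity check of Problem 1, part 1 (illustration, not a proof).
import numpy as np

rng = np.random.default_rng(0)
m, p, n = 5, 2, 4
A = rng.standard_normal((m, n))
B = rng.standard_normal((p, n))

# Generic random A and B: the stacked matrix almost surely has rank n,
# so the only vector in both null-spaces is x = 0.
assert np.linalg.matrix_rank(np.vstack([A, B])) == n

# Now force a shared null vector z by projecting it out of each row space.
z = rng.standard_normal(n)
A2 = A - np.outer(A @ z, z) / (z @ z)   # now A2 @ z = 0
B2 = B - np.outer(B @ z, z) / (z @ z)   # now B2 @ z = 0
assert np.allclose(A2 @ z, 0) and np.allclose(B2 @ z, 0)

# With a nontrivial shared null vector, the stacked rank falls below n.
assert np.linalg.matrix_rank(np.vstack([A2, B2])) < n
```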

Problem 2: Ridge Regression

The Ridge Regression problem is a variant of least squares:

$$\MIN{\vx}{\normof{\vb - \mA \vx}^2 + \lambda \normof{\vx}^2}$$

This is also known as Tikhonov regularization.

1. Show that this problem always has a unique solution, for any $\mA$ and $\vb$, when $\lambda > 0$, using the theory discussed in class so far.

This property is one reason that Ridge Regression is used. It is also a common regularization method that can help avoid overfitting in a regression problem.

2. Use the SVD of $\mA$ to characterize the solution as a function of $\lambda$.

3. What is the solution as $\lambda \to \infty$?

4. What is the solution as $\lambda \to 0$?
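As a numerical sketch of part 2 (with made-up data; it does not replace the derivation), the SVD characterization $\vx(\lambda) = \sum_i \frac{\sigma_i}{\sigma_i^2 + \lambda} (\vu_i^T \vb)\, \vv_i$ can be compared against a direct solve of the regularized normal equations $(\mA^T \mA + \lambda \eye)\vx = \mA^T \vb$:

```python
# Check the SVD form of the ridge solution against the regularized
# normal equations (A^T A + lambda I) x = A^T b.
import numpy as np

rng = np.random.default_rng(1)
m, n, lam = 8, 5, 0.5
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

# Direct solution of the regularized normal equations.
x_direct = np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ b)

# SVD-based solution: x(lam) = V diag(s_i / (s_i^2 + lam)) U^T b.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
x_svd = Vt.T @ ((s / (s**2 + lam)) * (U.T @ b))

assert np.allclose(x_direct, x_svd)
```

The shrinkage factors $\sigma_i / (\sigma_i^2 + \lambda)$ also hint at the limiting behavior asked about in parts 3 and 4.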

Problem 3: Convexity

1. Show that $f(\vx) = \normof{\vb - \mA \vx}$ is a convex function. Feel free to use the result proved on the last homework.

2. Show that the null-space of a matrix is a convex set.

3. Explain why these results allow us to conclude that constrained least squares is a convex optimization problem. (See Wikipedia for the definition of a convex problem.)
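A quick numerical illustration of parts 1 and 2 (again with hypothetical random data, and again not a substitute for a proof): the residual norm satisfies the convexity inequality at sampled points, and convex combinations of null-space vectors stay in the null-space.

```python
# Spot-check convexity of f(x) = ||b - A x|| and convexity of a null-space.
import numpy as np

rng = np.random.default_rng(2)
m, n = 6, 4
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
f = lambda x: np.linalg.norm(b - A @ x)

# Convexity inequality f(t x + (1-t) y) <= t f(x) + (1-t) f(y), sampled.
x, y = rng.standard_normal(n), rng.standard_normal(n)
for t in np.linspace(0, 1, 11):
    assert f(t * x + (1 - t) * y) <= t * f(x) + (1 - t) * f(y) + 1e-12

# A wide matrix has a nontrivial null-space, spanned by trailing
# right singular vectors; convex combinations remain in it.
B = rng.standard_normal((2, n))
_, _, Vt = np.linalg.svd(B)
z1, z2 = Vt[2], Vt[3]                    # null-space basis vectors
assert np.allclose(B @ z1, 0) and np.allclose(B @ z2, 0)
w = 0.3 * z1 + 0.7 * z2                  # a convex combination
assert np.allclose(B @ w, 0)
```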

Problem 4: Alternate formulations of Least Squares

Consider the constrained least squares problem:

$$\MINone{\vr, \vy}{\normof{\vr}^2}{\vr + \mC \vy = \vb}$$

where $\mC \in \RR^{m \times n}$, $n \le m$, and $\rank(\mC) = n$.

1. Convert this problem into the standard constrained least squares form.

2. Form the augmented system from the Lagrangian as we did in class.

3. Manipulate this problem to arrive at the normal equations for a least-squares problem: $\mC^T \mC \vy = \mC^T \vb$.
Discuss any advantages of the systems at intermediate steps.
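Assuming the formulation above ($\minimize \normof{\vr}^2$ subject to $\vr + \mC \vy = \vb$), the Lagrangian yields an augmented system of the form $\sbmat{\eye & \mC \\ \mC^T & 0} \sbmat{\vr \\ \vy} = \sbmat{\vb \\ 0}$, and eliminating $\vr$ recovers the normal equations. The sketch below, with made-up data, checks that both routes give the same $\vy$:

```python
# Compare the augmented (Lagrangian) system against the normal
# equations C^T C y = C^T b for the same least-squares problem.
import numpy as np

rng = np.random.default_rng(3)
m, n = 7, 3
C = rng.standard_normal((m, n))          # full column rank almost surely
b = rng.standard_normal(m)

# Augmented system [[I, C], [C^T, 0]] [r; y] = [b; 0].
K = np.block([[np.eye(m), C], [C.T, np.zeros((n, n))]])
rhs = np.concatenate([b, np.zeros(n)])
sol = np.linalg.solve(K, rhs)
r, y_aug = sol[:m], sol[m:]

# Normal equations route.
y_normal = np.linalg.solve(C.T @ C, C.T @ b)

assert np.allclose(y_aug, y_normal)
assert np.allclose(C.T @ r, 0)           # residual orthogonal to range(C)
```

One design note worth mentioning in part 3's discussion: the augmented system avoids forming $\mC^T \mC$, whose condition number is the square of that of $\mC$.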