\documentclass[]{article}
\usepackage{lmodern}
\usepackage{amssymb,amsmath}
\usepackage{ifxetex,ifluatex}
\usepackage{fixltx2e} % provides \textsubscript
\ifnum 0\ifxetex 1\fi\ifluatex 1\fi=0 % if pdftex
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\else % if luatex or xelatex
\ifxetex
\usepackage{mathspec}
\makeatletter % undo the wrong changes made by mathspec
\let\RequirePackage\original@RequirePackage
\let\usepackage\RequirePackage
\makeatother
\else
\usepackage{fontspec}
\fi
\defaultfontfeatures{Ligatures=TeX,Scale=MatchLowercase}
\fi
% use upquote if available, for straight quotes in verbatim environments
\IfFileExists{upquote.sty}{\usepackage{upquote}}{}
% use microtype if available
\IfFileExists{microtype.sty}{%
\usepackage{microtype}
\UseMicrotypeSet[protrusion]{basicmath} % disable protrusion for tt fonts
}{}
\usepackage{hyperref}
\hypersetup{unicode=true,
pdftitle={Homework 6},
pdfborder={0 0 0},
breaklinks=true}
\urlstyle{same} % don't use monospace font for urls
\IfFileExists{parskip.sty}{%
\usepackage{parskip}
}{% else
\setlength{\parindent}{0pt}
\setlength{\parskip}{6pt plus 2pt minus 1pt}
}
\setlength{\emergencystretch}{3em} % prevent overfull lines
\providecommand{\tightlist}{%
\setlength{\itemsep}{0pt}\setlength{\parskip}{0pt}}
\setcounter{secnumdepth}{0}
% Redefines (sub)paragraphs to behave more like sections
\ifx\paragraph\undefined\else
\let\oldparagraph\paragraph
\renewcommand{\paragraph}[1]{\oldparagraph{#1}\mbox{}}
\fi
\ifx\subparagraph\undefined\else
\let\oldsubparagraph\subparagraph
\renewcommand{\subparagraph}[1]{\oldsubparagraph{#1}\mbox{}}
\fi
\input{preamble.tex}
\title{Homework 6}
\begin{document}
\maketitle
\section{Homework 6}\label{homework-6}
Please answer the following questions in complete sentences in a clearly
prepared manuscript and submit the solution by the due date on
Blackboard (around Sunday, October 21, 2018).
Remember that this is a graduate class. There may be elements of the
problem statements that require you to fill in appropriate assumptions.
You are also responsible for determining what evidence to include. An
answer alone is rarely sufficient, but an overly verbose description is
not required either. Use your judgment to focus your discussion on the
most interesting pieces. The answer to ``should I include `something' in
my solution?'' will almost always be: Yes, if you think it helps support
your answer.
\subsection{Problem 0: Homework
checklist}\label{problem-0-homework-checklist}
\begin{itemize}
\item
Please identify anyone, whether or not they are in the class, with
whom you discussed your homework. This problem is worth 1 point, but
on a multiplicative scale.
\item
Make sure you have included your source-code and prepared your
solution according to the most recent Piazza note on homework
submissions.
\end{itemize}
\subsection{Problem 1: Estimating real-world performance for
elimination}\label{problem-1-estimating-real-world-performance-for-elimination}
In class, we showed how to count the number of floating point operations
involved in computing an LU decomposition. Here, we are going to seek a
few ways to make this count more realistic. More generally, this type of
study would be called performance analysis. But in this case, we'll just
make a few simple observations.
\begin{enumerate}
\def\labelenumi{\arabic{enumi}.}
\item
  Repeat the analysis of the LU decomposition from class, but account for the number of
floating point operations in light of a single ``fused-multiply-add''
operation that is now available on various Intel CPUs.
\url{https://en.wikipedia.org/wiki/Multiply\%E2\%80\%93accumulate_operation}
  (Assume that you also have a fused multiply-subtract if it's easier.)
\item
Having the count of FLOPs is only relevant if all floating point
operations have the same cost. Develop an estimate for the relative
cost of the following floating point operations: \texttt{+},
\texttt{-}, \texttt{*}, \texttt{/}, \texttt{sqrt}, \texttt{sin},
  \texttt{abs}. My code to do this involves generating random numbers
  in \([-1,1]\) and then performing the operation on the random
  numbers; I then subtract off the time needed just to generate the random
  numbers. The key is not to store anything in intermediate variables;
  otherwise, you'll benchmark the memory system. Now, assuming that the
  cheapest flop costs \(1\), what is the number of effective flops in
  terms of the real-world cost of each of these operations?
\end{enumerate}
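As a concrete starting point, the timing experiment in part 2 might be
sketched in Julia as follows. This is only a sketch under the setup
described above; the function name \texttt{opcost} and the trial count
are illustrative, and a careful benchmark would use a dedicated package.

\begin{verbatim}
# Estimate the cost of applying `op` to random numbers in [-1,1] by
# subtracting the time needed to generate the random numbers alone.
# Accumulating into a single scalar avoids storing intermediate
# arrays, so we time the arithmetic rather than the memory system.
function opcost(op, n)
    s = 0.0
    top = @elapsed for i in 1:n
        s += op(2*rand() - 1.0)   # op applied to a random number
    end
    tbase = @elapsed for i in 1:n
        s += 2*rand() - 1.0       # baseline: generation alone
    end
    return top - tbase, s         # return s so the loop isn't optimized away
end

# Relative cost of, say, sin versus abs:
# opcost(x -> sin(x), 10^7)[1] / opcost(x -> abs(x), 10^7)[1]
\end{verbatim}

Binary operations such as \texttt{+} or \texttt{/} can be timed with a
closure like \texttt{x -> x + 1.25}; for \texttt{sqrt}, restrict to
nonnegative inputs, e.g.\ \texttt{x -> sqrt(abs(x))}.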
\subsection{Problem 2: Using
accelerators}\label{problem-2-using-accelerators}
A recent innovation is the presence of large ``tensor cores'' that
support ``matrix multiplication'' directly on GPUs or other CPU
accelerators. These can make matrix-multiplication operations much
faster than expected. We are going to use them to solve a linear system
via block substitution.
\begin{enumerate}
\def\labelenumi{\arabic{enumi}.}
\item
Write down the steps of an algorithm to solve \(\mA \vx = \vb\) via
recursive block-elimination. That is, divide \(\mA\) into blocks:
\[ \mA = \bmat{\mA_1 & \mA_2 \\ \mA_3 & \mA_4} \] where all blocks
have the same size. In this case, assume that \(\mA\) has a
power-of-two dimension so that you can do this recursively as well
until you get down to a \(1\times 1\) system. Then if
\[ \vx = \bmat{\vx_1 \\ \vx_2}, \qquad \vb = \bmat{\vb_1 \\ \vb_2} \]
are the vectors that conform to the partitioning of \(\mA\), develop
an algorithm that uses as a subroutine:
\begin{verbatim}
# return the matrix A^{-1} B, i.e., the solution of k linear systems
function solve(A::Matrix, B::Matrix)
end
\end{verbatim}
and matrix-matrix multiplication to solve the linear system.
\item
Count the number of floating point operations involved in the
algorithm from step 1 in terms of flops that occur in
matrix-multiplication and those that occur in other steps. This will
require you to solve for the recurrences involved in the recursive
algorithm.
\end{enumerate}
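For reference, the recursion in part 1 can be sketched as below. This is
a sketch, not the required answer; the name \texttt{solveblocks} stands
in for the \texttt{solve} subroutine above, and it assumes the leading
block \(\mA_1\) (and each Schur complement) is invertible.

\begin{verbatim}
# Recursive block elimination for A x = b, assuming the dimension of A
# is a power of two. The base case is scalar division; the recursion
# uses only matrix-matrix products and recursive solves.
function solveblocks(A::Matrix, B::Matrix)
    n = size(A, 1)
    n == 1 && return B ./ A[1,1]           # 1x1 base case
    h = n ÷ 2
    A1, A2 = A[1:h,1:h],   A[1:h,h+1:n]
    A3, A4 = A[h+1:n,1:h], A[h+1:n,h+1:n]
    B1, B2 = B[1:h,:],     B[h+1:n,:]
    Y1 = solveblocks(A1, A2)               # A1^{-1} A2
    Y2 = solveblocks(A1, B1)               # A1^{-1} B1
    S  = A4 - A3*Y1                        # Schur complement
    X2 = solveblocks(S, B2 - A3*Y2)        # second block of the solution
    X1 = Y2 - Y1*X2                        # back-substitute for the first block
    return [X1; X2]
end

# Usage sketch: x = solveblocks(A, reshape(b, :, 1))
\end{verbatim}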
\subsection{Problem 3: Implement a face
search}\label{problem-3-implement-a-face-search}
\begin{enumerate}
\def\labelenumi{\arabic{enumi}.}
\item
Load the Yale face database of 165 pictures from the data on our
website:
\begin{itemize}
\tightlist
\item
\url{http://www.cs.purdue.edu/homes/dgleich/cs515-2018/homeworks/Yale_64.csv}
\item
\url{http://www.cs.purdue.edu/homes/dgleich/cs515-2018/homeworks/Yale_64_ids.csv}
\end{itemize}
This can be read into Julia via
\begin{verbatim}
using DelimitedFiles  # readdlm lives here on Julia 0.7 and later
A = readdlm("Yale_64.csv",',')
labels = readdlm("Yale_64_ids.csv")
\end{verbatim}
  More about the data is available at
  \url{http://www.cad.zju.edu.cn/home/dengcai/Data/FaceData.html}; the
  CSV files were converted from the original Matlab file.
  Prepare and plot a heatmap/picture of image 67.
\item
  We are going to use all of the even-numbered images as a database
  \(\mD\). How accurately can we represent the odd-numbered images in
  terms of the even-numbered images in a least-squares sense? (That is,
  suppose we consider representing an image \(\mT\) as a least-squares
  combination of the columns of \(\mD\).) What is the image
  with the worst reconstruction? What is the image with the best
  reconstruction?
\item
  How well does least-squares image search tolerate errors? Suppose we
  perturb an image with random normal noise as follows:
  \[ \vv' = \vv + \sigma N(0,1) \] where \(N(0,1)\) is a vector of random
  normal variables and \(\sigma\) is the scale parameter. (This is
  equivalent to using a normal with variance \(\sigma^2\).)
Suppose that we use all images as the database \(\mD\), and we fix an
image, say, number \(67\). How large does \(\sigma\) need to be before
we won't recognize image \(67\) as image \(67\) anymore? How large
does \(\sigma\) need to be before we won't recognize this as person
\(7\) anymore? Suppose that we define recognition failing as when at
least \(25/100\) random trials identify another person. (Hint: I
strongly recommend using a QR factorization of the image database to
  make this experiment runnable in a small amount of time; feel free to
  use Julia's built-in QR factorization.)
\end{enumerate}
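One way to set up the perturbation experiment in part 3 is sketched
below. The identification rule here (pick the database column with the
largest-magnitude least-squares coefficient) is an assumption, not part
of the problem statement, and the function names are illustrative; the
point is that factoring \(\mD\) once with QR makes every perturbed-query
solve cheap.

\begin{verbatim}
using LinearAlgebra

# Assumed setup: D has one image per column and v is a query image.
# Identification rule (an assumption): the query is recognized as the
# database column with the largest-magnitude least-squares coefficient.
function identify(F, v)
    c = F \ v                  # least-squares solve via the QR factor
    return argmax(abs.(c))
end

# Fraction of `trials` noisy queries identified differently than the
# clean query v, at noise level sigma.
function failure_rate(F, v, sigma; trials=100)
    base = identify(F, v)
    nfail = 0
    for _ in 1:trials
        nfail += (identify(F, v + sigma*randn(length(v))) != base)
    end
    return nfail / trials
end

# Usage sketch: F = qr(D); failure_rate(F, D[:,67], 0.1)
\end{verbatim}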
\subsection{Problem 4: Givens
vs.~Householder}\label{problem-4-givens-vs.householder}
(Based on class discussion.) In this question, you'll study some of the
questions that came up in class. Given a vector \(\va\), we can use a
single Householder transform to produce an orthogonal matrix \(\mH\)
such that \(\mH \va = \pm \normof{\va} \ve_1\). We can also use a set of
\(n-1\) Givens rotations:
\[ \mG_{n-1} \mG_{n-2} \cdots \mG_1 \va = \pm \normof{\va} \ve_1. \]
\begin{enumerate}
\def\labelenumi{\arabic{enumi}.}
\item
Compare (in theory) the orthogonal matrices that arise from these two
ways of looking at the problem for a length \(2\) vector.
\item
Compare (in theory) the orthogonal matrices that arise from these two
ways of looking at the problem for a length \(3\) vector.
\item
  State a conjecture or conclusion from the results of parts 1 and 2.
\item
Note that the matrix \(\mH\) only depends on knowing the vector that
determines the Householder reflector. Also, note that the matrices
\(\mG_i\) only depend on knowing \(\cos \theta_i\) and
\(\sin \theta_i\). Analyze the number of floating point operations
required to compute \(\vv\) in Householder compared with the cosines
and sines in the Givens rotations.
\end{enumerate}
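For the length-\(2\) case in part 1, both transforms can be built
explicitly. A sketch, assuming \(\va\) is not already a positive
multiple of \(\ve_1\) and choosing the \(+\normof{\va}\ve_1\) sign:

\begin{verbatim}
using LinearAlgebra

# Householder reflector mapping a to +norm(a)*e1:
# H = I - 2 v v' / (v' v) with v = a - norm(a)*e1.
# Assumes a is not already a positive multiple of e1 (else v = 0).
function householder2(a)
    v = copy(a)
    v[1] -= norm(a)
    return I - 2*(v*v')/dot(v, v)
end

# Single Givens rotation achieving the same for a length-2 vector.
givens2(a) = (c = a[1]/norm(a); s = a[2]/norm(a); [c s; -s c])
\end{verbatim}

Both map \([3, 4]\) to \([5, 0]\), but the Householder matrix is a
reflection (determinant \(-1\)) while the Givens matrix is a rotation
(determinant \(+1\)); comparing their entries is the point of part 1.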
\end{document}