What breaks when $n$ is a million? When you can exploit structure, and when you need a whole new method: limited-memory quasi-Newton.
What makes optimization "large-scale"? Newton's method at $n = 10^5$: memory walls, cubic time, and a live quiz. Little-$o$ vs big-$O$ for function evaluations.
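To make the memory wall concrete before diving in, here is a back-of-envelope calculation (assuming a dense float64 Hessian and the standard $n^3/3$ flop count for a Cholesky factorization; the $n = 10^5$ figure comes from the blurb above):

```python
n = 100_000  # problem dimension from the section above

# A dense n-by-n Hessian in float64 costs 8 bytes per entry.
hessian_bytes = n * n * 8
print(f"Hessian storage: {hessian_bytes / 1e9:.0f} GB")  # 80 GB

# Cholesky factorization of a dense SPD matrix takes ~n^3/3 flops.
cholesky_flops = n ** 3 / 3
print(f"Cholesky cost: {cholesky_flops:.1e} flops")  # 3.3e+14
```

Eighty gigabytes just to *store* the Hessian, before a single factorization: that is the memory wall in one line of arithmetic.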
Start here →

A show-and-tell: log-barrier diagonals, banded, arrowhead, sparse QPs, low-rank-plus-diagonal. When your $\mathbf{H}$ has structure, scalable linear algebra wins.
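One taste of how structure pays off: for a low-rank-plus-diagonal Hessian $\mathbf{H} = \mathbf{D} + \mathbf{U}\mathbf{U}^\top$, the Woodbury identity solves $\mathbf{H}\mathbf{x} = \mathbf{b}$ in $O(nk^2)$ rather than $O(n^3)$. A minimal NumPy sketch (sizes and names here are illustrative, not taken from the demos):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 10_000, 5                 # illustrative sizes: n large, rank k small
d = rng.uniform(1.0, 2.0, n)     # positive diagonal of D
U = rng.standard_normal((n, k))  # low-rank factor
b = rng.standard_normal(n)

# Woodbury identity:
# (D + U U^T)^{-1} b = D^{-1} b - D^{-1} U (I_k + U^T D^{-1} U)^{-1} U^T D^{-1} b
Dinv_b = b / d
Dinv_U = U / d[:, None]
S = np.eye(k) + U.T @ Dinv_U     # small k-by-k capacitance matrix
x = Dinv_b - Dinv_U @ np.linalg.solve(S, U.T @ Dinv_b)

# Check (D + U U^T) x = b without ever forming the n-by-n matrix.
residual = d * x + U @ (U.T @ x) - b
print(np.linalg.norm(residual))  # tiny: machine precision
```

The only dense solve is the $k \times k$ capacitance system, so the cost is $O(nk^2)$ plus a few length-$n$ vector operations.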
Explore →

The main event: BFGS without storing the matrix. Derive the two-loop recursion step-by-step. Interactive demos of the algorithm, $m$-sensitivity, $\mathbf{H}_0$ scaling, and a head-to-head vs full BFGS.
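As a preview of the derivation, the two-loop recursion computes $\mathbf{H}_k \nabla f$ from the $m$ stored pairs $(\mathbf{s}_i, \mathbf{y}_i)$ without ever materializing $\mathbf{H}_k$. A sketch in NumPy (function and variable names are mine, not the demo's):

```python
import numpy as np

def lbfgs_direction(grad, s_list, y_list, gamma):
    """Two-loop recursion: return H_k @ grad without forming H_k.

    s_list/y_list hold the m most recent pairs, oldest first:
    s_i = x_{i+1} - x_i, y_i = grad_{i+1} - grad_i (curvature y_i @ s_i > 0).
    The initial matrix is H_0 = gamma * I.
    """
    q = grad.copy()
    alphas = []
    # First loop: newest pair back to oldest.
    for s, y in zip(reversed(s_list), reversed(y_list)):
        rho = 1.0 / (y @ s)
        alpha = rho * (s @ q)
        q -= alpha * y
        alphas.append(alpha)
    r = gamma * q  # apply H_0
    # Second loop: oldest pair forward to newest.
    for s, y, alpha in zip(s_list, y_list, reversed(alphas)):
        rho = 1.0 / (y @ s)
        beta = rho * (y @ r)
        r += (alpha - beta) * s
    return r  # = H_k @ grad
```

Each loop is $m$ dot products and axpy updates, so one step costs $O(mn)$ memory and time; $\gamma = \mathbf{s}^\top\mathbf{y} / \mathbf{y}^\top\mathbf{y}$ is the usual scaling of $\mathbf{H}_0$.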
Discover →