Lecture 23 — Nelder-Mead Simplex Method

The simplex as a local model

A simplex in $\mathbb{R}^n$ is spanned by $n+1$ affinely independent points, i.e., points that do not all lie in a common hyperplane. In 2D, it's a triangle; in 3D, a tetrahedron.

Order the vertices by function value:

$$f(\mathbf{x}_1) \le f(\mathbf{x}_2) \le \cdots \le f(\mathbf{x}_{n+1})$$

The simplex gives us a local "linear" model of the function: the $n+1$ vertex values determine a unique affine interpolant, and comparing them tells us which direction is "downhill."
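The ordering step can be sketched in a few lines of plain Python. The helper name `sort_simplex` and the test function `bowl` are illustrative assumptions, not part of the lecture.

```python
def sort_simplex(vertices, f):
    """Return (vertices, values) ordered from best (lowest f) to worst."""
    values = [f(x) for x in vertices]
    order = sorted(range(len(vertices)), key=lambda i: values[i])
    return [vertices[i] for i in order], [values[i] for i in order]


# Hypothetical test function: a quadratic bowl with its minimum at (1, 2).
def bowl(x):
    return (x[0] - 1.0) ** 2 + (x[1] - 2.0) ** 2


# A triangle in 2D (n = 2, so n + 1 = 3 vertices).
simplex = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
sorted_vertices, sorted_values = sort_simplex(simplex, bowl)
# sorted_values is [2.0, 4.0, 5.0]; the best vertex is [0.0, 1.0].
```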

Finding a search direction

The key insight: the line from the worst point $\mathbf{x}_{n+1}$ through the centroid of the $n$ best points is a reasonable search direction.

The centroid of the best $n$ points is:

$$\bar{\mathbf{x}} = \frac{1}{n}\sum_{i=1}^{n} \mathbf{x}_i$$
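As a small sketch of this formula, assuming the vertices are already sorted from best to worst and stored as lists of floats (the function name `centroid_of_best` is hypothetical):

```python
def centroid_of_best(sorted_vertices):
    """Centroid of all vertices except the last (worst) one."""
    best = sorted_vertices[:-1]   # the n best points
    n = len(best)
    dim = len(best[0])
    return [sum(x[j] for x in best) / n for j in range(dim)]


# Sorted triangle: the worst vertex [0, 0] is excluded from the average.
sorted_vertices = [[0.0, 1.0], [1.0, 0.0], [0.0, 0.0]]
xbar = centroid_of_best(sorted_vertices)  # [0.5, 0.5]
```

Note that the worst point is deliberately left out: the centroid describes the face of the simplex opposite $\mathbf{x}_{n+1}$.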

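We search along the parametric line through the centroid ($t = 0$) and the worst point ($t = 1$); negative $t$ moves past the centroid, away from the worst point: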

$$\bar{\mathbf{x}}(t) = \frac{1}{n}\sum_{i=1}^{n}\mathbf{x}_i + t\left(\mathbf{x}_{n+1} - \frac{1}{n}\sum_{i=1}^{n}\mathbf{x}_i\right)$$

Why can't we just use this direction and be done? Because we need a simplex at the next step too! The new point can't be too far (simplex too big) or too close (simplex degenerates). Nelder-Mead carefully chooses $t$ to keep the simplex well-shaped.
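To make the parametrization concrete, here is a minimal sketch of $\bar{\mathbf{x}}(t)$ for a 2D example. The function name `point_on_line` and the sample coordinates are assumptions chosen for illustration.

```python
def point_on_line(xbar, worst, t):
    """Evaluate xbar(t) = xbar + t * (worst - xbar), component by component."""
    return [c + t * (w - c) for c, w in zip(xbar, worst)]


xbar = [0.5, 0.5]    # centroid of the n best points
worst = [0.0, 0.0]   # worst vertex x_{n+1}

at_centroid = point_on_line(xbar, worst, 0.0)   # [0.5, 0.5]
at_worst = point_on_line(xbar, worst, 1.0)      # [0.0, 0.0]
mirrored = point_on_line(xbar, worst, -1.0)     # [1.0, 1.0]
```

With this sign convention, positive $t$ stays on the worst point's side of the centroid and negative $t$ lands on the far side.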

The four operations

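Nelder-Mead replaces the worst point $\mathbf{x}_{n+1}$ with a new point $\bar{\mathbf{x}}(t)$ for one of four specific values of $t$. If none of them helps, it falls back to a fifth move, shrink, which moves every vertex except the best: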

Operation	$t$	When
Reflect	$-1$	First try — mirror the worst point through the centroid
Expand	$-2$	Reflection was the best point so far — try going further
Outside contract	$-\tfrac{1}{2}$	Reflection worse than second-worst but no worse than worst
Inside contract	$+\tfrac{1}{2}$	Reflection worse than worst — contract toward centroid
Shrink	—	Nothing worked — shrink all points toward $\mathbf{x}_1$
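Shrink is the only operation in the table that is not a point on the search line, so a short sketch may help. The table does not give a shrink factor; the value $\tfrac{1}{2}$ used here is the conventional choice and is an assumption, as is the function name `shrink`.

```python
def shrink(sorted_vertices, factor=0.5):
    """Move every vertex except the best one toward the best vertex x_1.

    factor=0.5 (halving the distance) is an assumed, conventional value.
    """
    best = sorted_vertices[0]
    shrunk = [
        [b + factor * (xi - b) for b, xi in zip(best, x)]
        for x in sorted_vertices[1:]
    ]
    return [best] + shrunk


sorted_vertices = [[0.0, 1.0], [1.0, 0.0], [0.0, 0.0]]
smaller = shrink(sorted_vertices)
# smaller is [[0.0, 1.0], [0.5, 0.5], [0.0, 0.5]]
```

The best vertex stays put, so the best function value found so far is never lost.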

The decision tree

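Every iteration starts by computing the reflection $\bar{\mathbf{x}}(-1)$. What happens next depends on how $f_r = f(\bar{\mathbf{x}}(-1))$ compares to the sorted vertex values $f_1 \le f_2 \le \cdots \le f_{n+1}$. Below, $f_e$ denotes the function value at the expanded point and $f_c$ the value at the contracted point: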

If $f_1 \le f_r \lt f_n$ — the reflected point is better than the second-worst but not the best. Accept reflection.
If $f_r \lt f_1$ — the reflected point is the new best! Try Expand ($t = -2$) to go even further.
- If $f_e \lt f_r$: accept expansion.
- Otherwise: accept the original reflection.
If $f_n \le f_r \le f_{n+1}$ — the reflected point is worse than the second-worst but no worse than the worst. Try Outside contract ($t = -\tfrac{1}{2}$).
- If $f_c \le f_r$: accept.
- Otherwise: Shrink.
If $f_r \gt f_{n+1}$ — the reflected point is worse than everything, even the worst vertex. Try Inside contract ($t = +\tfrac{1}{2}$).
- If $f_c \lt f_{n+1}$: accept.
- Otherwise: Shrink.
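The decision tree above can be sketched as a single iteration function. This is a minimal illustration, not a tuned implementation: the names `nelder_mead_step` and `bowl`, the shrink factor of $\tfrac{1}{2}$, and the choice to re-evaluate $f$ at every vertex on each call (rather than caching values) are all assumptions made for brevity.

```python
def nelder_mead_step(vertices, f, shrink_factor=0.5):
    """One iteration of the decision tree. Returns (new_vertices, operation)."""
    # Sort from best to worst (re-evaluates f at every vertex for simplicity).
    pairs = sorted(((f(x), x) for x in vertices), key=lambda p: p[0])
    values = [p[0] for p in pairs]
    xs = [p[1] for p in pairs]
    n = len(xs) - 1
    best, worst = xs[0], xs[-1]
    f_best, f_second_worst, f_worst = values[0], values[-2], values[-1]

    # Centroid of the n best points.
    xbar = [sum(x[j] for x in xs[:-1]) / n for j in range(len(best))]

    def line(t):
        # xbar(t) = xbar + t * (worst - xbar)
        return [c + t * (w - c) for c, w in zip(xbar, worst)]

    def shrink():
        # Move all vertices except the best toward the best (assumed factor).
        return [best] + [
            [b + shrink_factor * (xi - b) for b, xi in zip(best, x)]
            for x in xs[1:]
        ]

    x_r = line(-1.0)
    f_r = f(x_r)

    if f_r < f_best:                      # new best: try expanding
        x_e = line(-2.0)
        if f(x_e) < f_r:
            return xs[:-1] + [x_e], "expand"
        return xs[:-1] + [x_r], "reflect"
    if f_r < f_second_worst:              # f_1 <= f_r < f_n
        return xs[:-1] + [x_r], "reflect"
    if f_r <= f_worst:                    # f_n <= f_r <= f_{n+1}
        x_c = line(-0.5)
        if f(x_c) <= f_r:
            return xs[:-1] + [x_c], "outside contract"
        return shrink(), "shrink"
    x_c = line(0.5)                       # f_r > f_{n+1}
    if f(x_c) < f_worst:
        return xs[:-1] + [x_c], "inside contract"
    return shrink(), "shrink"


# Hypothetical test function: a quadratic bowl with its minimum at (1, 2).
def bowl(x):
    return (x[0] - 1.0) ** 2 + (x[1] - 2.0) ** 2


simplex = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
new_simplex, op = nelder_mead_step(simplex, bowl)
# First step: the reflection [1.0, 1.0] beats the best vertex, and the
# expansion [1.5, 1.5] is better still, so op == "expand".

# Repeating the step drives the simplex toward the minimum.
current = simplex
for _ in range(50):
    current, _op = nelder_mead_step(current, bowl)
best_value = min(bowl(x) for x in current)
```

Every branch keeps the best vertex $\mathbf{x}_1$, so the best function value in the simplex can never increase from one iteration to the next.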

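Most steps are reflections, requiring only 1 new function evaluation. Expansion and contraction each need 1 additional evaluation (2 total, counting the reflection). Shrink is the expensive case: it needs $n$ new evaluations for the moved vertices, on top of the reflection and the failed contraction already computed in that iteration ($n+2$ in total).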

A word of caution

"Don't be impressed by 2D demos." In 2D, you could just look at the contour plot and see where the minimum is. The real question is whether these methods work in 10, 50, or 1000 dimensions. Nelder-Mead is practical for moderate $n$ but can stall in high dimensions.