Global Optimization & Meta-heuristics

Genetic algorithms, ant colonies, and friends — viewed as structured random exploration of a combinatorial space.

The fantasy of "global"

Local optimization gives you a point $\mathbf{x}^\star$ where $\nabla f(\mathbf{x}^\star) = 0$. Global optimization promises more: the best $\mathbf{x}^\star$ in the entire feasible set. For a generic non-convex $f$ on $[0,1]^n$, no algorithm finds the true global minimum without, in the worst case, looking everywhere. "Looking everywhere" is exponential in $n$: a grid with spacing $\varepsilon$ along each axis already has about $(1/\varepsilon)^n$ points.

So when someone says they have a global optimization method, they mean one of two things:

"Eventually visits every point" is not an algorithm. Pure uniform random search has the same property and is the trivial baseline. The interesting question for any meta-heuristic is: does it find good solutions quickly on the problems I actually care about?
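For concreteness, here is a minimal sketch of the uniform random search baseline mentioned above. The function names, the `sphere` test objective, and the budget are illustrative assumptions, not something the text prescribes.

```python
import random

def random_search(f, n, budget, seed=0):
    """Uniform random search on [0, 1]^n, keeping the best point seen."""
    rng = random.Random(seed)
    best_x, best_val = None, float("inf")
    for _ in range(budget):
        x = [rng.random() for _ in range(n)]   # sample uniformly in the unit cube
        val = f(x)
        if val < best_val:
            best_x, best_val = x, val
    return best_x, best_val

def sphere(x):
    # Hypothetical test objective: minimum value 0 at (0.5, ..., 0.5).
    return sum((xi - 0.5) ** 2 for xi in x)

best_x, best_val = random_search(sphere, n=2, budget=1000)
```

Any meta-heuristic worth using should beat this baseline at an equal evaluation budget on the problems you care about.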

The unifying frame: structured random exploration

Almost every meta-heuristic is a randomized greedy procedure on a (typically combinatorial) search space. Strip away the metaphor and you find the same three pieces:

| Piece | What it does |
| --- | --- |
| A neighborhood / mutation operator | Generates new candidate solutions from old ones, often randomly, often locally. |
| A bias toward better solutions | Selection, acceptance, pheromone reinforcement, fitness-proportionate sampling, etc. |
| A way to escape local optima | Temperature, mutation rate, crossover, tabu list, restarts. |

Once you see this skeleton, the bestiary becomes much shorter:

Simulated annealing (single chain + cooling)

Random-walk MCMC with a temperature that decays to zero. Already covered in Part 1. The "escape" mechanism is high temperature early on; the "bias" is the Metropolis acceptance ratio.
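A minimal sketch of this loop, under some assumptions the text does not fix: Gaussian random-walk proposals, a geometric cooling schedule, and a hypothetical Rastrigin-like test objective. The parameter values are illustrative only.

```python
import math
import random

def simulated_annealing(f, x0, steps=5000, t0=1.0, cooling=0.999,
                        step_size=0.1, seed=0):
    rng = random.Random(seed)
    x, fx = list(x0), f(x0)
    best_x, best_f = list(x), fx
    t = t0
    for _ in range(steps):
        # Random-walk proposal (assumed Gaussian).
        cand = [xi + rng.gauss(0.0, step_size) for xi in x]
        fc = f(cand)
        delta = fc - fx
        # Metropolis rule: always accept improvements, accept a worse
        # candidate with probability exp(-delta / t).
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            x, fx = cand, fc
            if fx < best_f:
                best_x, best_f = list(x), fx
        t *= cooling   # geometric cooling (one assumed schedule among many)
    return best_x, best_f

def rastrigin_like(x):
    # Hypothetical bumpy objective; global minimum 0 at the origin.
    return sum(xi * xi + 1.0 - math.cos(2.0 * math.pi * xi) for xi in x)

start = [2.0, -2.0]
best_x, best_f = simulated_annealing(rastrigin_like, start)
```

At high `t` the acceptance probability for uphill moves is close to 1, which is the escape mechanism; as `t` shrinks the chain behaves more and more like greedy descent.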

Genetic algorithms (population + recombination)

A population of candidate solutions evolves by selection (better solutions reproduce more), crossover (combine two solutions), and mutation (random perturbation). The "escape" mechanism is mutation plus diversity in the population. Try the demo below.

Ant colony optimization (indirect memory)

Many "ants" each construct a candidate solution by a random walk on a graph, biased by pheromone levels on the edges. Edges that appear in good solutions get more pheromone, and pheromone evaporates over time. The method was originally proposed for the traveling salesman problem and is now also applied to vehicle routing and scheduling. The "memory" lives in the shared environment, not in the agents.
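The construction, evaporation, and reinforcement steps can be sketched on a tiny traveling salesman instance. This is a simplified variant in the spirit of the classic Ant System, not a reference implementation: the inverse-distance heuristic term, the parameter names `alpha`, `beta`, `rho`, their values, and the six-city layout are all assumptions added for illustration.

```python
import math
import random

def tour_length(tour, dist):
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def ant_colony_tsp(dist, n_ants=10, n_iters=50, alpha=1.0, beta=2.0,
                   rho=0.5, seed=0):
    rng = random.Random(seed)
    n = len(dist)
    tau = [[1.0] * n for _ in range(n)]          # pheromone on each edge
    best_tour, best_len = None, float("inf")
    for _ in range(n_iters):
        tours = []
        for _ in range(n_ants):
            tour = [rng.randrange(n)]
            unvisited = set(range(n)) - {tour[0]}
            while unvisited:
                i = tour[-1]
                choices = sorted(unvisited)
                # Bias the walk by pheromone and (assumed) inverse distance.
                weights = [tau[i][j] ** alpha * (1.0 / dist[i][j]) ** beta
                           for j in choices]
                nxt = rng.choices(choices, weights=weights)[0]
                tour.append(nxt)
                unvisited.remove(nxt)
            tours.append((tour_length(tour, dist), tour))
        # Evaporation: every edge loses a fraction rho of its pheromone.
        for i in range(n):
            for j in range(n):
                tau[i][j] *= 1.0 - rho
        # Reinforcement: shorter tours deposit more pheromone on their edges.
        for length, tour in tours:
            for k in range(n):
                a, b = tour[k], tour[(k + 1) % n]
                tau[a][b] += 1.0 / length
                tau[b][a] += 1.0 / length
            if length < best_len:
                best_tour, best_len = tour, length
    return best_tour, best_len

# Hypothetical instance: six cities on a 2x1 grid (shortest tour has length 6).
cities = [(0, 0), (1, 0), (2, 0), (2, 1), (1, 1), (0, 1)]
dist = [[math.hypot(ax - bx, ay - by) for (bx, by) in cities] for (ax, ay) in cities]
best_tour, best_len = ant_colony_tsp(dist)
```

Note that the ants themselves carry no state between iterations; everything learned is stored in `tau`.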

Particle swarm optimization (population + velocity)

Particles fly through the search space with velocities pulled toward the best point each particle has seen and the best the swarm has seen. Conceptually a leaky momentum-based sampler with social information.
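A minimal sketch of the velocity update, assuming the common inertia-weight form. The coefficients `w`, `c1`, `c2`, the search box, and the `sphere` objective are illustrative assumptions.

```python
import random

def particle_swarm(f, dim, n_particles=20, n_iters=100, w=0.7, c1=1.5,
                   c2=1.5, lo=-5.0, hi=5.0, seed=0):
    rng = random.Random(seed)
    xs = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vs = [[0.0] * dim for _ in range(n_particles)]
    pbest = [list(x) for x in xs]                 # best point each particle has seen
    pbest_val = [f(x) for x in xs]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = list(pbest[g]), pbest_val[g]   # best point the swarm has seen
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Momentum plus random pulls toward personal and swarm bests.
                vs[i][d] = (w * vs[i][d]
                            + c1 * r1 * (pbest[i][d] - xs[i][d])
                            + c2 * r2 * (gbest[d] - xs[i][d]))
                xs[i][d] += vs[i][d]
            val = f(xs[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = list(xs[i]), val
                if val < gbest_val:
                    gbest, gbest_val = list(xs[i]), val
    return gbest, gbest_val

def sphere(x):
    # Hypothetical test objective with minimum 0 at the origin.
    return sum(xi * xi for xi in x)

gbest, gbest_val = particle_swarm(sphere, dim=2)
```

The `w * vs[i][d]` term is the "leaky momentum"; the `gbest` term is the social information.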

Tabu search (explicit memory)

Greedy local search that keeps a short list of recently visited solutions and forbids revisiting them. The "tabu list" is what prevents the search from cycling. It is used especially on combinatorial problems with cheap neighborhood moves (e.g., swapping two edges).
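A minimal sketch on bit strings, assuming a single-bit-flip neighborhood and a fixed-length tabu list of whole solutions. The subset-sum objective, weights, and target are a hypothetical toy instance.

```python
from collections import deque

def tabu_search(f, start, n_iters=100, tabu_size=10):
    current = tuple(start)
    best, best_val = current, f(current)
    tabu = deque([current], maxlen=tabu_size)   # recently visited solutions
    for _ in range(n_iters):
        # Neighborhood: flip one bit, skipping anything on the tabu list.
        neighbors = []
        for i in range(len(current)):
            nb = current[:i] + (1 - current[i],) + current[i + 1:]
            if nb not in tabu:
                neighbors.append(nb)
        if not neighbors:
            break
        # Move to the best allowed neighbor even if it is worse than current.
        current = min(neighbors, key=f)
        tabu.append(current)
        val = f(current)
        if val < best_val:
            best, best_val = current, val
    return best, best_val

# Hypothetical instance: pick a subset of weights whose sum hits the target.
weights = [3, 34, 4, 12, 5, 2]
target = 9

def subset_sum_error(bits):
    return (sum(w for w, b in zip(weights, bits) if b) - target) ** 2

best, best_val = tabu_search(subset_sum_error, start=(0,) * len(weights))
```

The search accepts worsening moves when every improving neighbor is tabu, which is how it climbs out of local optima without any randomness.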

You can write down a 50-line skeleton common to all of them: maintain state, propose a perturbation, score, accept or update memory, repeat. The names differ; the math is mostly bookkeeping.
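One way to write that skeleton, as a sketch: the function name and the four callables (`init`, `propose`, `score`, `accept`) are hypothetical names chosen here, and the usage instantiates plain stochastic hill climbing on an assumed one-dimensional objective.

```python
import random

def metaheuristic(init, propose, score, accept, n_iters, seed=0):
    """Maintain state, propose a perturbation, score, accept, repeat."""
    rng = random.Random(seed)
    state = init(rng)
    state_score = score(state)
    best, best_score = state, state_score
    for step in range(n_iters):
        cand = propose(state, rng)
        cand_score = score(cand)
        if accept(state_score, cand_score, step, rng):
            state, state_score = cand, cand_score
        if state_score < best_score:
            best, best_score = state, state_score
    return best, best_score

def objective(x):
    # Hypothetical objective with minimum 0 at x = 1.
    return (x - 1.0) ** 2

best, best_score = metaheuristic(
    init=lambda rng: rng.uniform(-3.0, 3.0),
    propose=lambda x, rng: x + rng.gauss(0.0, 0.3),
    score=objective,
    accept=lambda old, new, step, rng: new <= old,   # greedy: never go uphill
    n_iters=500,
)
```

Swapping `accept` for a temperature-dependent Metropolis rule gives simulated annealing; letting `state` be a whole population, or a solution plus a tabu list or pheromone table, covers the other methods in the same loop.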

A genetic algorithm on a multi-modal landscape

Watch a population (the dots) evolve on a bumpy 2D function. Each generation: tournament selection picks parents, a convex-combination crossover creates a child, and a Gaussian mutation jostles it.
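The generation step described above can be sketched as follows. The `bumpy` landscape, the population size, the tournament size, and the mutation scale are all assumptions made for illustration; they are not the parameters of the original demo.

```python
import math
import random

def bumpy(p):
    # Hypothetical multi-modal 2D landscape; global minimum 0 at the origin.
    x, y = p
    return x * x + y * y + 2.0 * (2.0 - math.cos(3.0 * x) - math.cos(3.0 * y))

def genetic_algorithm(f, pop_size=30, n_gens=60, tournament_k=3,
                      sigma=0.2, lo=-4.0, hi=4.0, seed=0):
    rng = random.Random(seed)
    pop = [(rng.uniform(lo, hi), rng.uniform(lo, hi)) for _ in range(pop_size)]

    def tournament():
        # Tournament selection: best of k randomly drawn individuals.
        return min(rng.sample(pop, tournament_k), key=f)

    for _ in range(n_gens):
        children = []
        for _ in range(pop_size):
            a, b = tournament(), tournament()
            w = rng.random()
            # Convex-combination crossover followed by Gaussian mutation.
            child = tuple(w * ai + (1.0 - w) * bi + rng.gauss(0.0, sigma)
                          for ai, bi in zip(a, b))
            children.append(child)
        pop = children   # full generational replacement, no elitism
    best = min(pop, key=f)
    return best, f(best)

best, best_val = genetic_algorithm(bumpy)
```

Because there is no elitism in this sketch, the best individual can be lost between generations; keeping the top few parents unchanged is a common fix.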

A skeptical view

I'll be upfront: I'm fairly skeptical of meta-heuristics as a general optimization technology. The skepticism boils down to four points.

  1. The metaphors are doing a lot of work. Calling a stochastic local search "evolution" or an "ant colony" doesn't add explanatory power; it adds vocabulary. Stripped of metaphor, most of these are stochastic local search with a perturbation rule.
  2. Convergence guarantees are weak. "Will find the optimum given infinite time" describes uniform random search too. Finite-time convergence rates are typically not available, or are problem-specific.
  3. Comparisons in the literature are noisy. A new algorithm beats baselines on a benchmark; the baselines were tuned poorly; nobody re-runs with budget control. "X outperforms Y on this benchmark" rarely transfers.
  4. If you have problem structure, use it. A linear program, an integer program, a convex relaxation, or even a good local method with random restarts will usually beat a meta-heuristic when the structure is known. Meta-heuristics are most defensible when you have a black-box objective and no structure to exploit.

Hedge: I am not an expert on these methods. There are subareas (CMA-ES for continuous black-box optimization, modern evolutionary strategies for reinforcement learning, ant colony for some combinatorial problems) where the techniques are competitive and well-studied. The Wikipedia articles on metaheuristic, genetic algorithm, ant colony optimization, simulated annealing, and CMA-ES are good entry points. Form your own view.

When meta-heuristics actually help

The honest case for them:

What none of them buy you is a global guarantee in finite time. If you find someone selling that, look closely at the assumptions.