Achieving various forms of fault-tolerance in large-scale distributed
systems---a prime example being the Internet---has become an important
problem. Using a relaxed definition of faults, which includes
temporary congestion of servers and other transient adverse conditions,
we seek to design distributed control algorithms which are resilient
when subject to certain patterns of worst-case faults called k-sparse
errors. Sparsity is defined recursively: roughly, bad events must be
well-separated in space-time; the space-time loci where this condition
is violated must be coverable by small balls, and those balls must in
turn be well-separated in space-time at the next scale, and so on. Of
intense theoretical interest is the question of whether simple
distributed algorithms can be designed that remain fault-tolerant in
the presence of such k-sparse errors.
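The recursive sparsity condition sketched above can be made concrete in a few lines. The following is an illustrative sketch only: the particular scale parameters (`r`, `growth`), the greedy ball covering, and the function names are assumptions chosen for exposition, not definitions taken from this work.

```python
def dist(p, q):
    # Euclidean distance between two space-time points (tuples of coordinates).
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

def is_sparse(points, level, max_level, r=1.0, growth=4.0):
    """Check a toy version of recursive sparsity for fault events.

    At each scale, either the points are pairwise separated by at least
    `growth * r`, or the violating points can be covered by balls of
    radius `r` whose centers are themselves sparse at the next, coarser
    scale. Recursion deeper than `max_level` counts as a failure.
    """
    if not points:
        return True
    if level > max_level:
        return False  # violations persist across too many scales
    sep = growth * r
    # Points that violate the separation condition at this scale.
    bad = [p for p in points
           if any(q is not p and dist(p, q) < sep for q in points)]
    if not bad:
        return True  # well-separated at this scale
    # Greedily cover the violating points with balls of radius r.
    centers = []
    uncovered = list(bad)
    while uncovered:
        c = uncovered[0]
        centers.append(c)
        uncovered = [p for p in uncovered if dist(p, c) > r]
    # The ball centers must be sparse at the next scale.
    return is_sparse(centers, level + 1, max_level, r * growth, growth)
```

For example, two tight clusters of faults far apart pass the check (the clusters are covered by small balls whose centers are well-separated at the next scale), while a long dense line of faults fails once the allowed number of scales is exhausted.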