Research Associates: Mo Mu, T. Drashansky, E. A. Vavalis
Sponsor: Advanced Research Projects Agency
This project has two main technical thrusts:
1. Creating new algorithms that are highly parallel and that reflect the natural parallelism of the physical world.
We aim for algorithms that are quite general in both equations and geometry. We propose to use collocation methods with modest degree polynomial bases (e.g., cubic splines, Hermite cubics, etc.). Collocation gives the flexibility to handle general problems and has been shown to be efficient and robust in practice (even though analysis of this method is more difficult than others). Polynomial bases of degrees 3 and 5 provide superior accuracy and efficiency compared to the more commonly used linear polynomial basis. With collocation as the underlying PDE technique, we then apply domain decomposition techniques and iteration.
2. Automating the process of parallelizing the computation and distributing the work to many processors.
At the first level, we partition the computations at a high level for a machine with many similar processors (e.g., a hypercube machine with, say, a 1000 processors). We gather information about the entire computation from the high-level PDE solving system. We then make good estimates of the work for each major step of the problem solution. This information is used as input to a fast heuristic algorithm to partition the computation for efficient execution.
At the second level, we distribute the computation's major modules over different machines in the computing environment. A multi-threaded (or agent based) approach is advocated for load balancing using many more threads (agents) than processors. Our analysis suggests this provides opportunities for simple, dynamic load balancing techniques to be both effective and inexpensive (at least for very large problems). For example, the user interface and problem formulation might run on an ordinary workstation, the linear system might be solved on a 1000-node hypercube, and the solution examined (various cross sections plotted) on a sophisticated graphics terminal attached to a mini-supercomputer.