Compilers and Programming Interface for High-End Computing
Principal Investigators:
Zhiyuan Li, Ananth Grama and Ahmed Sameh
Participating Students:
Lixia Liu, Russel Meyers, Dasarath Weeratunge
Current Sponsors: National Science Foundation
The Department of Computer Science continues its cutting-edge investigation
of compiler techniques for high-end computing systems.
Compiler prototypes are developed to support experimentation
with advanced compiler techniques.
These prototypes also serve as tools for studying
performance issues
in high-end application programs and on high-end computing systems.
Some of the recent results:
- New loop constructs designed to support development of
asynchronous algorithms for relaxed synchronization
[LCPC08, to appear]
- Software tools and performance models for memory behavior analysis
of parallel programs
[ACM-ICS08-1.pdf]
- Interprocedural analysis, predicate analysis, and array dataflow analysis
to uncover parallel operations [IEEESE]
- Compiler-automated program transformation
to improve data locality, reduce memory requirements,
and enhance cache performance
- Efficient use of register files
[ACM-ICS08-2.pdf]
- Data communication analysis for task partitioning on
distributed systems
[SC2003]
- Cache-sensitive parallel-task partitioning
[PACT1999]
- Smart page mapping to use heterogeneously partitioned caches
[ISPASS2004] and [LCTES2005]
- Compiler-automated software scheme for computation reuse
[CGO2004]
Drs. Zhiyuan Li, Ananth Grama, and Ahmed Sameh
are also investigating a programming interface that supports the development
of loosely synchronized numerical algorithms to overcome memory and
communication latencies on high-end computing systems.
Experiments show that, when the relaxation is applied properly, large
sparse-matrix problems derived from PDEs can be solved significantly faster
with more relaxed dataflow and synchronization than with the strict
dataflow model of the original algorithms. As an example,
this figure shows the good speedup (over a single processor) achieved
by the new method applied to a two-dimensional multi-grid problem,
in contrast to the poor speedup obtained by the original code.
Notice that the convergence rates of both versions are nearly the same,
as this figure shows.
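
The idea of relaxed dataflow can be illustrated with a small sketch. This is
not the programming interface under investigation, and it is not a multi-grid
solver: it is a sequential toy, written under assumptions of our own, that
mimics several processors running block Jacobi sweeps on the system
tridiag(-1, 4, -1) x = b. The names solve, residual, and exchange_every are
hypothetical. With exchange_every=1 every block sees fresh neighbor values
before each sweep (strict dataflow); with a larger value, blocks keep
computing with stale boundary values and exchange them only occasionally
(relaxed synchronization).

```python
def solve(b, num_blocks, sweeps, exchange_every):
    """Block Jacobi for tridiag(-1, 4, -1) x = b.

    Each block reads its neighbors' boundary values from a local "halo"
    copy that is refreshed only every `exchange_every` sweeps.
    Assumes len(b) is divisible by num_blocks (illustrative restriction).
    """
    n = len(b)
    size = n // num_blocks
    x = [0.0] * n
    halo = [(0.0, 0.0)] * num_blocks
    for sweep in range(sweeps):
        if sweep % exchange_every == 0:
            # Communication step: every block refreshes its halo values.
            halo = []
            for p in range(num_blocks):
                lo, hi = p * size, (p + 1) * size
                left = x[lo - 1] if lo > 0 else 0.0
                right = x[hi] if hi < n else 0.0
                halo.append((left, right))
        new_x = x[:]
        for p in range(num_blocks):
            lo, hi = p * size, (p + 1) * size
            left_halo, right_halo = halo[p]
            for i in range(lo, hi):
                # Inside the block use the previous sweep's values;
                # at block edges use the (possibly stale) halo values.
                left = x[i - 1] if i > lo else left_halo
                right = x[i + 1] if i < hi - 1 else right_halo
                new_x[i] = (b[i] + left + right) / 4.0
        x = new_x
    return x


def residual(x, b):
    """Largest absolute entry of b - A x for A = tridiag(-1, 4, -1)."""
    n = len(x)
    worst = 0.0
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        worst = max(worst, abs(b[i] - (4.0 * x[i] - left - right)))
    return worst


b = [1.0] * 16
strict = solve(b, num_blocks=4, sweeps=200, exchange_every=1)
relaxed = solve(b, num_blocks=4, sweeps=200, exchange_every=4)
print("strict residual: ", residual(strict, b))
print("relaxed residual:", residual(relaxed, b))
```

Both runs reach the same solution because the matrix is strictly diagonally
dominant, so bounded staleness does not prevent convergence. The sketch only
shows that convergence survives fewer exchanges; it does not measure speedup,
which on a real machine would come from the communication and synchronization
that the relaxed version avoids.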