Graph Coloring and Derivative Computation in Scientific Computing

Graph coloring models of various kinds have proved to be powerful abstractions for minimizing the execution time, memory, and storage space needed to compute sparse derivative matrices using automatic differentiation. Depending on the structure of the derivative matrix of interest---whether it is a Jacobian (nonsymmetric) or a Hessian (symmetric)---and on the specifics of the computational techniques employed, as many as five variants of the coloring problem arise. Graph coloring models are also useful for discovering concurrency in parallel scientific computation; examples include iterative methods for sparse linear systems, adaptive mesh refinement, preconditioning, and sparse tiling. We have developed novel sequential and parallel algorithms for a variety of coloring problems, including acyclic coloring, star coloring, and distance-2 coloring. We have released ColPack, a software package consisting of implementations of sequential algorithms for coloring and related problems in sparse derivative computation. Implementations of the parallel algorithms are currently under development.
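The connection between coloring and derivative compression can be sketched in a few lines. The toy code below is our illustrative example, not ColPack's implementation: it greedily partitions the columns of a sparse matrix so that columns sharing a nonzero row receive different colors; columns with the same color are structurally orthogonal, so the matrix can be recovered from one AD pass (or matrix-vector product) per color rather than one per column.

```python
# Greedy structurally-orthogonal column coloring for a sparse Jacobian.
# Columns that share no nonzero row may receive the same color.

def color_columns(pattern):
    """pattern: list of sets; pattern[j] = rows with a nonzero in column j."""
    n = len(pattern)
    color = [None] * n
    for j in range(n):
        # Colors already used by earlier columns that share a row with column j.
        forbidden = {color[k] for k in range(j) if pattern[j] & pattern[k]}
        c = 0
        while c in forbidden:
            c += 1
        color[j] = c
    return color

# Arrowhead-like pattern: column 0 shares a row with every other column,
# but columns 1..3 are pairwise structurally orthogonal.
pattern = [{0, 1, 2, 3}, {1}, {2}, {3}]
print(color_columns(pattern))  # [0, 1, 1, 1] -> 2 colors instead of 4 columns
```

Here four columns compress to two groups, so the full Jacobian is recovered in two passes instead of four.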

We are currently developing new algorithms for computing sparse Hessian matrices via the reverse mode of Algorithmic Differentiation (AD). This approach builds on the Edge Pushing algorithm recently proposed by Gower and Mello, which we extend by viewing the AD computation from the perspective of live variables in data-flow analysis. The live-variables view yields a simpler derivation of the algorithm, can be made more efficient through graph transformations called preaccumulation in this context, and also lends itself to tighter complexity analyses. Our software, LivarH, is available through GitHub. We are further extending this approach to compute higher-order derivatives, and implementing the algorithm in parallel using a new MPI formalism.
Representative Presentation:
   Mu Wang and Alex Pothen, Reverse Hessian Algorithm with Preaccumulation
   The 17th EuroAD Workshop, Argonne, IL, August 2015.
Representative Publication:
   Assefaw H. Gebremedhin, Duc Nguyen, Mostofa Ali Patwary, and Alex Pothen, ColPack: Graph coloring software for derivative computation and beyond, ACM Transactions on Mathematical Software, 40 (1), 30 pp., 2013.
A separate page documents the impact of ColPack and our work on coloring.

Matching in Scientific Computing

Various kinds of matchings in graphs are needed in several scientific and high-performance computing contexts. Examples include sparse linear systems (numerical preprocessing and block triangular decomposition) and graph partitioning (the coarsening phase of multilevel methods). Traditional matching algorithms compute optimal solutions in superlinear time, but current trends are toward algorithms that find approximate solutions for massive graphs in near-linear time. We have developed parallel algorithms based on approximation techniques for edge-weighted matching, and have applied them to the Network Alignment problem. We have also developed parallel algorithms for maximum cardinality matching in bipartite graphs on multithreaded computers.

Furthermore, we have investigated vertex-weighted matchings, b-matchings, and b-edge covers. For the first of these problems, we have developed a 2/3-approximation algorithm for bipartite graphs; for the second, we have developed efficient 1/2-approximation algorithms for shared memory parallel machines by proposing a new b-Suitor algorithm; and for the minimum weight b-edge cover, we are developing a 3/2-approximation algorithm.
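As a point of reference for the approximation guarantees above, the classic sequential greedy algorithm already achieves a 1/2-approximation for maximum-weight matching. The sketch below is an illustration of that baseline, not our parallel b-Suitor code: it takes the heaviest remaining edge whose endpoints are both free.

```python
# Greedy 1/2-approximation for maximum-weight matching: process edges in
# decreasing weight order, adding an edge whenever both endpoints are free.

def greedy_matching(edges):
    """edges: list of (weight, u, v) tuples. Returns the matched edge list."""
    matched = set()    # vertices already covered by the matching
    matching = []
    for w, u, v in sorted(edges, reverse=True):
        if u not in matched and v not in matched:
            matching.append((u, v))
            matched.update((u, v))
    return matching

edges = [(5, 'a', 'b'), (4, 'b', 'c'), (3, 'c', 'd')]
print(greedy_matching(edges))  # [('a', 'b'), ('c', 'd')], total weight 8
```

The sorting step makes this O(m log m); the parallel algorithms discussed above obtain the same guarantee through local, proposal-based computations instead of a global sort.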

Representative Presentation:
   Parallel Graph Algorithms on Emerging Architectures
   SIAM Annual Meeting, 2013.

Representative Publication:
   Mahantesh Halappanavar, Alex Pothen, Fredrik Manne, Ariful Azad, Johannes Langguth and Arif Khan, Codesign Lessons Learned from Implementing Graph Matching Algorithms on Multithreaded Architectures, IEEE Computer, pp. 46-55, August 2015.

Bioinformatics and Computational Proteomics

Flow Cytometry (FC) is a platform for profiling, at the single-cell level, the changes in the phenotype of cells due to various stimuli; the phenotype is determined by the abundances of multiple protein complexes, and is measured by fluorescence. The technology was developed in the 1960s, and is routinely used in clinical settings to measure the response of the immune system to leukemias, Alzheimer's disease, vaccines, etc. FC technology has been evolving in recent years to become high-content and high-throughput: several tens of proteins can be measured simultaneously, and plates containing several hundred samples can be processed rapidly. Data analytics now lags behind the ability to generate large volumes of FC data.

We have worked on various aspects of the data analysis pipeline for FC data. Spectral unmixing computes the correct abundance of each protein by correcting for the overlaps of signals from multiple proteins; this requires nonnegative matrix factorizations, Bregman divergences, and nonlinear optimization. Variance stabilization decouples the variance of a cell population from the magnitude of its abundance. We have developed a combinatorial dissimilarity measure to compare FC samples, which are collections of cell populations represented as clusters in a multi-dimensional space of protein abundances. We have used this dissimilarity measure to register cell populations across large numbers of samples under varying experimental or physiological conditions. We have also developed the concept of metaclusters to summarize cell populations from similar samples, and the concept of templates to summarize samples with similar cell populations. We have developed dynamic classification algorithms based on templates to classify samples into classes based on the protein expression patterns of their cell populations. We are continuing to develop algorithms and software for the FC pipeline that will scale to large numbers of samples.
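To illustrate the unmixing step, the sketch below recovers nonnegative protein abundances x from an observed fluorescence vector y = M x, where M is a known spillover (overlap) matrix. It uses Lee-Seung-style multiplicative updates as a simplified stand-in for the nonnegative matrix factorization and optimization methods mentioned above; the matrix, data, and function names are all invented for the example.

```python
import numpy as np

# Illustrative spectral unmixing: recover nonnegative abundances x from an
# observed signal y = M @ x, where M mixes the true signals (spillover).
# Multiplicative updates keep x nonnegative at every step.

def unmix(M, y, iters=5000, eps=1e-12):
    x = np.ones(M.shape[1])
    MtM, Mty = M.T @ M, M.T @ y
    for _ in range(iters):
        x *= Mty / (MtM @ x + eps)   # Lee-Seung update for nonneg. least squares
    return x

M = np.array([[1.0, 0.1],            # invented 2-protein spillover matrix
              [0.2, 1.0]])
y = M @ np.array([1.0, 2.0])         # synthetic "observed" fluorescence
print(unmix(M, y))                   # close to the true abundances [1.0, 2.0]
```

Real FC panels involve tens of proteins and noisy measurements, which is why the production pipeline uses more sophisticated divergences and optimization methods than this toy update rule.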

Representative Presentation:
   Ariful Azad, Bartek Rajwa, and Alex Pothen, Classifying immunophenotypes for Acute Myeloid Leukemia via templates,
CYTO 2014 (Conference of the International Society for the Advancement of Cytometry)

Representative Publication:
   Ariful Azad, Arif Khan, Bartek Rajwa, Saumyadipta Pyne, and Alex Pothen, Classifying immunophenotypes with templates from Flow Cytometry, Proceedings of the ACM Conference on Bioinformatics and Computational Biology (BCB), 10 pages, 2013.

Augmented Matrices, Computational Surgery, and Contingency Analysis of the Power Grid

Computational surgery requires interactive visualization of solid finite element models of organs and of their deformations as they are being cut. An example is astigmatism surgery of the eye, in which the cornea is cut to restore the normal curvature of the eye in orthogonal directions. The computational challenge is to provide 10-100 updates of the solution of a system of equations involving the stiffness matrix of a large mesh, consisting of hundreds of thousands of nodes and elements, to enable the visualization.
A similar problem arises in contingency analysis of the electric power grid. Grid operators are required by law to compute what happens when a single generator or transmission line goes down, so that they know what corrective action to take. The harder problem is to compute what happens when some larger number k of generators or lines go down: the number of scenarios to consider then grows rapidly with k, even for a grid consisting of a few thousand nodes.
Both problems can be modeled by an initial state of the organ or the grid, described by a large mesh or network, which is then changed by removing or adding a few nodes or edges. This change happens at each cutting step as we remesh around the cut, or in the power grid as elements go down. All of the changes can be modeled by adding rows and columns to a 2-by-2 block matrix in such a way that the (1,1) block remains unchanged. Since the changes affect only a relatively small number of mesh elements or network nodes, we solve the augmented system of equations implicitly (i.e., the matrix is not formed, but the solution is computed using only matrix-vector operations). Our work shows that the augmented matrix approach is capable of supporting surgical simulations that could be used in a haptic or graphic simulator. We also reduce the time to compute a solution to the contingency analysis problem by two or more orders of magnitude. Our work pays careful attention to sparsity in the matrices, the solution vectors, and the right-hand-side vectors.
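A minimal dense sketch of such an implicit solve follows; the variable names and the use of NumPy are ours, and the actual codes exploit sparsity throughout. Writing the augmented system as [[K, B], [B.T, C]] [x; y] = [f; g], where K is the unchanged (1,1) block and B, C describe the small update, only solves with K and small dense algebra on the Schur complement are needed, so K is never refactored as the mesh or grid changes.

```python
import numpy as np

def solve_augmented(solve_K, B, C, f, g):
    """Solve [[K, B], [B.T, C]] [x; y] = [f; g] using only solves with K."""
    z = solve_K(f)                        # K z = f
    KinvB = solve_K(B)                    # one solve with K per added column
    S = C - B.T @ KinvB                   # small dense Schur complement
    y = np.linalg.solve(S, g - B.T @ z)   # border unknowns (the update)
    x = z - KinvB @ y                     # corrected interior solution
    return x, y

# Toy check: K plays the role of the unchanged stiffness or grid matrix,
# and the single border column models one small change to the system.
K = np.diag([2.0, 3.0, 4.0])
B = np.array([[1.0], [0.0], [1.0]])
C = np.array([[5.0]])
f, g = np.ones(3), np.array([1.0])
x, y = solve_augmented(lambda r: np.linalg.solve(K, r), B, C, f, g)
```

Because only applications of K's inverse appear, a factorization of K computed once can be reused across every cutting step or contingency scenario.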

Representative Presentation:
   Alex Pothen and Yu-Hong Yeung, Augmented Systems for Fast Updates: Computational Surgery and Power Grids,
The Sparse Days in St. Girons III Conference, France, July 2015.

Representative Publication:
   Alex Pothen, Jessica Crouch, and Yu-Hong Yeung, Interactively Cutting and Constraining Meshes Using Augmented Matrices,
ACM Transactions on Graphics (ACM TOG), 16 pages, 2015. (Accepted subject to revisions)

Combinatorial Scientific Computing and Petascale Simulations Institute

The CSCAPES Institute was a multi-institutional project funded by the Advanced Scientific Computing Research (ASCR) Office of the U.S. Department of Energy from 2006 to 2012. The major goal of the project was to develop parallel algorithms and software for problems in computational science and engineering (CSE) that can be modeled by graphs and hypergraphs. We developed graph and hypergraph partitioning tools to enable load balancing; parallel algorithms to compute sparse Jacobians and Hessians in Automatic Differentiation via graph coloring; approximation algorithms to compute matchings on parallel computers; and ordering algorithms to improve the memory-system performance of sparse graph and matrix computations. Our software tools were used by research teams in Accelerator Physics, Computational Fluid Dynamics, and Systems Biology. Thirty researchers from Purdue, Sandia National Labs, Argonne National Lab, Ohio State, Colorado State, and Old Dominion University participated in this work.

Project Website:
   CSCAPES Institute