Assefaw Gebremedhin, Papers on Sparse Derivative Computation



Title : ColPack: Software for Graph Coloring and Related Problems in Scientific Computing
Authors: A.H. Gebremedhin, D. Nguyen, M.M.A. Patwary and A. Pothen
Status: ACM Transactions on Mathematical Software, In Press, 2013.

Abstract

We present a suite of fast and effective algorithms, encapsulated in a software package called ColPack, for a variety of graph coloring and related problems. Many of the coloring problems model partitioning needs arising in compression-based computation of Jacobian and Hessian matrices using automatic differentiation. Several of the coloring problems also find important applications in many areas outside derivative computation, including frequency assignment in wireless networks, scheduling, facility location, and concurrency discovery and data movement operations in parallel and distributed computing. The presentation in this article includes a high-level description of the various coloring algorithms within a common design framework, a detailed treatment of the theory and efficient implementation of known as well as new vertex ordering techniques upon which the coloring algorithms rely, a discussion of the package's software design, and an illustration of its usage. The article also includes an extensive experimental study of the major algorithms in the package using real-world as well as synthetically generated graphs.

Download paper in PDF

Title : Exploiting Sparsity in Automatic Differentiation on Multicore Architectures
Authors: B. Letschert, K. Kulshreshtha, A. Walther, D. Nguyen, A.H. Gebremedhin and A. Pothen
Status: In S. Forth et al. (Eds.), Recent Advances in Algorithmic Differentiation, Lecture Notes in Computational Science and Engineering 87, DOI 10.1007/978-3-642-30023-3_14, 2012, Springer.

Abstract

We discuss the design, implementation and performance of algorithms suitable for the efficient computation of sparse Jacobian and Hessian matrices using Automatic Differentiation via operator overloading on multicore architectures. The procedure for exploiting sparsity (for runtime and memory efficiency) in serial computation involves a number of steps. Using nonlinear optimization problems as test cases, we show that the algorithms involved in the various steps can be adapted to multithreaded computations.

Download paper in PDF

Title : Implementation of Partial Separability in a Source to Source Transformation AD Tool
Authors: S.H.K. Narayanan, B. Norris, P. Hovland and A. Gebremedhin
Status: In S. Forth et al. (Eds.), Recent Advances in Algorithmic Differentiation, Lecture Notes in Computational Science and Engineering 87, DOI 10.1007/978-3-642-30023-3_31, 2012, Springer.

Abstract

A significant number of large optimization problems exhibit structure known as partial separability, for example, least squares problems, where elemental functions are gathered into groups that are then squared. The sparsity of the Jacobian of a partially separable function can be exploited by computing the smaller Jacobians of the elemental functions and then assembling them into the full Jacobian. We implemented partial separability support in ADIC2 by using pragmas to identify partially separable function values, applying source transformations to subdivide the elemental gradient computations, and using the ColPack coloring toolkit to compress the sparse elemental Jacobians. We present experimental results for an elastic-plastic torsion optimization problem from the MINPACK-2 test suite.

Download paper in PDF

Title : Sparse Jacobian Computation using ADIC2 and ColPack
Authors: S.H.K. Narayanan, B. Norris, P. Hovland, D. Nguyen and A. Gebremedhin
Status: Procedia Computer Science, 4:2115 -- 2123, 2011. Proceedings of the International Conference on Computational Science, ICCS 2011.

Abstract

Many scientific applications benefit from the accurate and efficient computation of derivatives. Automatically generating these derivative computations from an application's source code offers a competitive alternative to other approaches, such as less accurate numerical approximations or labor-intensive analytical implementations. ADIC2 is a source transformation tool for generating code for computing the derivatives (e.g., Jacobian or Hessian) of a function given the C or C++ implementation of that function. Often the Jacobian or Hessian is sparse and presents the opportunity to greatly reduce storage and computational requirements in the automatically generated derivative computation. ColPack is a tool that compresses structurally independent columns of the Jacobian and Hessian matrices through graph coloring approaches. In this paper, we describe the integration of ColPack coloring capabilities into ADIC2, enabling accurate and efficient sparse Jacobian computations. We present performance results for a case study of a simulated moving bed chromatography application. Overall, the computation of the Jacobian by integrating ADIC2 and ColPack is approximately 40% faster than a comparable implementation that integrates ADOL-C and ColPack when the Jacobian is computed multiple times.

Download paper in PDF

Title: Distributed-memory Parallel Algorithms for Distance-2 Coloring and Related Problems in Derivative Computation
Authors: D. Bozdag, U. Catalyurek, A. Gebremedhin, F. Manne, E. Boman and F. Ozgunner
Status: SIAM Journal on Scientific Computing, Vol 32, Issue 4, pp 2418--2446, 2010.

Abstract

The distance-2 graph coloring problem aims at partitioning the vertex set of a graph into the fewest sets consisting of vertices pairwise at distance greater than two from each other. Its applications include derivative computation in numerical optimization and channel assignment in radio networks. We present efficient, distributed-memory, parallel heuristic algorithms for this NP-hard problem as well as for two related problems used in the computation of Jacobians and Hessians. Parallel speedup is achieved through graph partitioning, speculative (iterative) coloring, and a BSP-like organization of parallel computation. Results from experiments conducted on a PC cluster employing up to 96 processors and using large-size real-world as well as synthetically generated test graphs show that the algorithms are scalable. In terms of quality of solution, the algorithms perform remarkably well---the number of colors used by the parallel algorithms was observed to be very close to the number used by the sequential counterparts, which in turn are quite often near optimal. Moreover, the experimental results show that the parallel distance-2 coloring algorithm compares favorably with the alternative approach of solving the distance-2 coloring problem on a graph $\G$ by first constructing the square graph $\G^2$ and then applying a parallel distance-1 coloring algorithm on $\G^2$. Implementations of the algorithms are made available via the Zoltan load-balancing library.

Download paper in PDF

Title : Efficient Computationof Sparse Hessians Using Coloring and Automatic Differentiation
Authors: A. Gebremedhin, A. Pothen, A. Tarafdar, and A. Walther
Status: INFORMS Journal on Computing, Vol 21, No 2, pp 209-223, 2009.

Abstract

The computation of a sparse Hessian matrix H using automatic differentiation (AD) can be made efficient using the following four-step procedure. The coloring variant used in the second step depends on whether the recovery in the fourth step is direct or indirect : a direct method uses star coloring and an indirect method uses acyclic coloring . In an earlier work, we had designed and implemented effective heuristic algorithms for these two NP-hard coloring problems. Recently, we integrated part of the developed software with the AD tool ADOL-C, which has recently acquired a sparsity detection capability. In this paper, we provide a detailed description and analysis of the recovery algorithms, and experimentally demonstrate the efficacy of the coloring techniques in the overall process of computing the Hessian of a given function using ADOL-C as an example of an AD tool. We also present new analytical results on star and acyclic coloring of chordal graphs. The experimental results show that sparsity exploitation via coloring yields enormous savings in runtime and makes the computation of Hessians of very large size feasible. The results also show that evaluating a Hessian via an indirect method is often faster than a direct evaluation. This speedup is achieved without compromising numerical accuracy.

Download paper in PDF

Title : Exploiting Sparsity in Jacobian Computation via Coloring and Automatic Differentiation:
A Case Study in a Simulated Moving Bed Process
Authors: A. Gebremedhin, A. Pothen and A. Walther
Status: In C. Bischof et al. (Eds): Proc. of AD2008, Lecture Notes in Computational Science and Engineering 64, pp 339-349, 2008, Springer.

Abstract

Using a model from a chromatographic separation process in chemical engineering, we demonstrate that large, sparse Jacobians of fairly complex structures can be computed accurately and efficiently by using automatic differentiation (AD) in combination with a four-step procedure involving matrix compression and de-compression. For the detection of sparsity pattern (step 1), we employ a new operator overloading-based implementation of a technique that relies on propagation of index domains. To obtain the seed matrix to be used for compression (step 2), we use a distance-2 coloring of the bipartite graph representation of the Jacobian. The compressed Jacobian is computed using the vector forward mode of AD (step 3). A simple routine is used to directly recover the entries of the Jacobian from the compressed representation (step 4). Experimental results using ADOL-C show that the runtimes of each of these steps is in complete agreement with theoretical analysis, and the total runtime is found to be only about a hundred times the time needed for evaluating the function itself. The alternative approach of computing the Jacobian without exploiting sparsity is infeasible.

Download paper in PDF

Title : New Acyclic and Star Coloring Algorithms with Applications to Hessian Computation
Authors: A. Gebremedhin, A. Tarafdar, F. Manne, and A. Pothen
Status: SIAM Journal on Scientific Computing, Vol 29, No 3, pp 1042--1072, 2007.

Abstract

Acyclic and star coloring problems are specialized vertex coloring problems that arise in the efficient computation of Hessians using automatic differentiation or finite differencing, when both sparsity and symmetry are exploited. We present an algorithmic paradigm for finding heuristic solutions for these two NP-hard problems. The underlying common technique is the exploitation of the structure of two-colored induced subgraphs. For a graph G on n vertices and m edges, the time complexity of our star coloring algorithm is O(n d_2) , where d_k , a generalization of vertex degree, denotes the average number of distinct paths of length at most k edges starting at a vertex in G . The time complexity of our acyclic coloring algorithm is larger by a multiplicative factor involving the inverse of Ackermann's function. The space complexity of both algorithms is O(m) . To the best of our knowledge, our work is the first practical algorithm for the acyclic coloring problem. For the star coloring problem, our algorithm uses fewer colors and is considerably faster than a previously known O(n d_3) time algorithm. Computational results from experiments on various large-size test graphs demonstrate that the algorithms are fast and produce highly effective solutions. The use of these algorithms in Hessian computation is expected to reduce overall runtime drastically.

Download paper in PDF

Title : What Color Is Your Jacobian? Graph Coloring for Computing Derivatives
Authors : A. Gebremedhin, F. Manne, and A. Pothen
Status: SIAM Review, Vol 47, No 4, pp 629--705, 2005.

Abstract

Graph coloring has been employed since the 1980s to efficiently compute sparse Jacobian and Hessian matrices using either finite differences or automatic differentiation. Several coloring problems occur in this context, depending on whether the matrix is a Jacobian or a Hessian, and the specifics of the computational techniques employed. We consider eight variant vertex coloring problems here. This article begins with a gentle introduction to the problem of computing a sparse Jacobian, followed by an overview of the historical development of the research area. Then we present a unifying framework for the graph models of the variant matrix estimation problems. The framework is based upon the viewpoint that a partition of a matrix into structurally orthogonal groups of columns corresponds to distance-2 coloring an appropriate graph representation. The unified framework helps integrate earlier work and leads to fresh insights; enables the design of more efficient algorithms for many problems; leads to new algorithms for others; and eases the task of building graph models for new problems. We report computational results on two of the coloring problems to support our claims. Most of the methods for these problems treat a column or a row of a matrix as an atomic entity, and partition the columns or rows (or both). A brief review of methods that do not fit these criteria is provided. We also discuss results in discrete mathematics and theoretical computer science that intersect with the topics considered here.

Download paper in PDF
Go to
  • Research Areas
  • Coloring and Automatic Differentiation
  • Parallel Algorithms
  • Parallel Computation Models