Journal publications

Paul G. Constantine, David F. Gleich, and Gianluca Iaccarino. A factorization of the spectral Galerkin system for parameterized matrix equations: derivation and applications. SIAM Journal on Scientific Computing, 33(5):2995-3009, 2011. [ bib | DOI | local ]

Recent work has explored solver strategies for the linear system of equations arising from a spectral Galerkin approximation of the solution of PDEs with parameterized (or stochastic) inputs. We consider the related problem of a matrix equation whose matrix and right-hand side depend on a set of parameters (e.g., a PDE with stochastic inputs semidiscretized in space) and examine the linear system arising from a similar Galerkin approximation of the solution. We derive a useful factorization of this system of equations, which yields bounds on the eigenvalues, clues to preconditioning, and a flexible implementation method for a wide array of problems. We complement this analysis with (i) a numerical study of preconditioners on a standard elliptic PDE test problem and (ii) a fluids application using existing CFD codes; the MATLAB codes used in the numerical studies are available online.
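
The following sketch (Python with NumPy, using placeholder names such as galerkin_system, A_of_s, and b_of_s) illustrates the kind of spectral Galerkin system the paper analyzes, assembled here by Gauss-Legendre quadrature for a single parameter uniform on [-1, 1]. It is a minimal illustration of the setting, not the paper's factorization or solver strategy.

    import numpy as np

    # Minimal sketch (not the paper's factorization): assemble the spectral
    # Galerkin system for x(s) ~ sum_k x_k P_k(s), where A(s) x(s) = b(s)
    # depends on one parameter s, uniform on [-1, 1], and P_k are normalized
    # Legendre polynomials.  A_of_s and b_of_s are placeholder callables.

    def galerkin_system(A_of_s, b_of_s, n, degree, nquad=20):
        nodes, weights = np.polynomial.legendre.leggauss(nquad)
        weights = weights / 2.0          # weights of the uniform density on [-1, 1]
        # Normalized Legendre polynomials evaluated at the quadrature nodes.
        P = np.zeros((degree + 1, nquad))
        for k in range(degree + 1):
            c = np.zeros(k + 1)
            c[k] = 1.0
            P[k] = np.polynomial.legendre.legval(nodes, c) * np.sqrt(2 * k + 1)
        K = np.zeros(((degree + 1) * n, (degree + 1) * n))
        f = np.zeros((degree + 1) * n)
        for q in range(nquad):
            A, b, w = A_of_s(nodes[q]), b_of_s(nodes[q]), weights[q]
            for j in range(degree + 1):
                f[j*n:(j+1)*n] += w * P[j, q] * b
                for k in range(degree + 1):
                    K[j*n:(j+1)*n, k*n:(k+1)*n] += w * P[j, q] * P[k, q] * A
        return K, f

    # Example: A(s) = A0 + s*A1, a 2x2 system with a cubic polynomial expansion.
    A0 = np.array([[2.0, 0.5], [0.5, 2.0]])
    A1 = 0.3 * np.eye(2)
    K, f = galerkin_system(lambda s: A0 + s * A1, lambda s: np.ones(2), n=2, degree=3)
    x = np.linalg.solve(K, f)   # stacked Galerkin coefficients [x_0; x_1; ...; x_3]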

David F. Gleich, Ying Wang, Xiangrui Meng, Farnaz Ronaghi, Margot Gerritsen, and Amin Saberi. Some computational tools for digital archive and metadata maintenance. BIT Numerical Mathematics, 51:127-154, 2011. [ bib | DOI | local ]

Computational tools are a mainstay of current search and recommendation technology. But modern digital archives are astonishingly diverse collections of older digitized material and newer born-digital content, and finding interesting material in these archives is still challenging: the material often lacks the annotation, or metadata, that would let people locate the most interesting items. We describe four computational tools we developed to aid in the processing and maintenance of large digital archives. The first is an improvement to a graph layout algorithm for graphs with hundreds of thousands of nodes. The second is a new algorithm for matching databases with links among the objects, also known as a network alignment problem. The third is an optimization heuristic to disambiguate a set of geographic references in a book. And the fourth is a technique to automatically generate a title from a description.

Paul G. Constantine and David F. Gleich. Random alpha PageRank. Internet Mathematics, 6(2):189-236, September 2010. [ bib | DOI | local | http ]

We suggest a revision to the PageRank random surfer model that considers the influence of a population of random surfers on the PageRank vector. In the revised model, each member of the population has its own teleportation parameter chosen from a probability distribution, and consequently, the ranking vector is random. We propose three algorithms for computing the statistics of the random ranking vector based respectively on (i) random sampling, (ii) paths along the links of the underlying graph, and (iii) quadrature formulas. We find that the expectation of the random ranking vector produces similar rankings to its deterministic analogue, but the standard deviation gives uncorrelated information (under a Kendall-tau metric) with myriad potential uses. We examine applications of this model to web spam.
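
As a rough illustration of algorithm (i), random sampling, the following Python sketch draws teleportation parameters from a Beta distribution, computes a PageRank vector for each draw, and reports the elementwise mean and standard deviation. The function names and the particular Beta parameters are illustrative, not the paper's choices.

    import numpy as np

    # Sketch of the random-sampling approach: draw teleportation parameters
    # from a Beta distribution, compute a PageRank vector per draw, and report
    # the elementwise mean and standard deviation.  P is column-stochastic;
    # the Beta parameters below are illustrative (mean alpha = 17/20 = 0.85).

    def pagerank(P, alpha, v, tol=1e-10, maxit=1000):
        x = v.copy()
        for _ in range(maxit):
            x_new = alpha * (P @ x) + (1 - alpha) * v
            if np.linalg.norm(x_new - x, 1) < tol:
                return x_new
            x = x_new
        return x

    def random_alpha_pagerank(P, v, nsamples=500, a=17.0, b=3.0, rng=None):
        rng = rng or np.random.default_rng(0)
        alphas = rng.beta(a, b, size=nsamples)
        X = np.column_stack([pagerank(P, al, v) for al in alphas])
        return X.mean(axis=1), X.std(axis=1)

    # Tiny three-page example with a uniform teleportation vector.
    P = np.array([[0.0, 0.5, 1.0],
                  [0.5, 0.0, 0.0],
                  [0.5, 0.5, 0.0]])
    v = np.full(3, 1.0 / 3.0)
    mean_x, std_x = random_alpha_pagerank(P, v)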

David F. Gleich, Andrew P. Gray, Chen Greif, and Tracy Lau. An inner-outer iteration for PageRank. SIAM Journal on Scientific Computing, 32(1):349-371, February 2010. [ bib | DOI | local ]

We present a new iterative scheme for PageRank computation. The algorithm is applied to the linear system formulation of the problem, using inner-outer stationary iterations. It is simple, can be easily implemented and parallelized, and requires minimal storage overhead. Our convergence analysis shows that the algorithm is effective for a crude inner tolerance and is not sensitive to the choice of the parameters involved. The same idea can be used as a preconditioning technique for nonstationary schemes. Numerical examples featuring matrices of dimensions exceeding 100,000,000 in sequential and parallel environments demonstrate the merits of our technique. Our code is available online for viewing and testing, along with several large scale examples.
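
The sketch below follows the inner-outer scheme the abstract describes: the PageRank linear system (I - alpha*P) x = (1 - alpha) v is attacked by outer iterations with a smaller damping parameter beta, each of which is solved by a crude inner stationary iteration. The parameter defaults and stopping rules here are illustrative, not the paper's recommended settings.

    import numpy as np

    # Sketch of the inner-outer scheme: solve (I - alpha*P) x = (1 - alpha) v
    # via outer iterations
    #     (I - beta*P) x_{k+1} = (alpha - beta) P x_k + (1 - alpha) v,   0 < beta < alpha,
    # where each inner system is handled by a simple stationary iteration run
    # only to a crude tolerance.  P must be column-stochastic.

    def inner_outer_pagerank(P, v, alpha=0.85, beta=0.5,
                             inner_tol=1e-2, outer_tol=1e-10, max_outer=1000):
        x = v.copy()
        for _ in range(max_outer):
            f = (alpha - beta) * (P @ x) + (1 - alpha) * v
            # Residual of the original system: alpha*P*x + (1 - alpha)*v - x.
            if np.linalg.norm(f + beta * (P @ x) - x, 1) < outer_tol:
                break
            # Inner stationary iteration y <- beta*P*y + f, stopped crudely.
            y = x.copy()
            while np.linalg.norm(f + beta * (P @ y) - y, 1) >= inner_tol:
                y = beta * (P @ y) + f
            x = y
        return x

    # Usage: x = inner_outer_pagerank(P, v) for any column-stochastic P and
    # teleportation distribution v.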

Paul G. Constantine, David F. Gleich, and Gianluca Iaccarino. Spectral methods for parameterized matrix equations. SIAM Journal on Matrix Analysis and Applications, 31(5):2681-2699, 2010. [ bib | DOI | local ]

We apply polynomial approximation methods—known in the numerical PDEs context as spectral methods—to approximate the vector-valued function that satisfies a linear system of equations where the matrix and the right-hand side depend on a parameter. We derive both an interpolatory pseudospectral method and a residual-minimizing Galerkin method, and we show how each can be interpreted as solving a truncated infinite system of equations; the difference between the two methods lies in where the truncation occurs. Using classical theory, we derive asymptotic error estimates related to the region of analyticity of the solution, and we present a practical residual error estimate. We verify the results with two numerical examples.
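
A minimal sketch of the interpolatory pseudospectral idea from the abstract, assuming a single parameter uniform on [-1, 1] and a normalized Legendre basis: solve the system at the Gauss-Legendre points and project the solutions onto the basis by quadrature. The function and variable names (pseudospectral_coeffs, A_of_s, b_of_s) are hypothetical.

    import numpy as np

    # Interpolatory pseudospectral sketch: solve A(s_i) x(s_i) = b(s_i) at the
    # Gauss-Legendre points s_i and compute coefficients in the normalized
    # Legendre basis by quadrature, x_k = sum_i w_i x(s_i) P_k(s_i).

    def pseudospectral_coeffs(A_of_s, b_of_s, degree):
        nodes, weights = np.polynomial.legendre.leggauss(degree + 1)
        weights = weights / 2.0               # uniform density on [-1, 1]
        sols = np.column_stack([np.linalg.solve(A_of_s(s), b_of_s(s)) for s in nodes])
        coeffs = []
        for k in range(degree + 1):
            c = np.zeros(k + 1)
            c[k] = 1.0
            Pk = np.polynomial.legendre.legval(nodes, c) * np.sqrt(2 * k + 1)
            coeffs.append(sols @ (weights * Pk))
        return np.column_stack(coeffs)        # column k holds the coefficient vector x_k

    # Example: a 2x2 system whose matrix depends linearly on the parameter.
    A0 = np.array([[2.0, 0.5], [0.5, 2.0]])
    A1 = 0.3 * np.eye(2)
    X = pseudospectral_coeffs(lambda s: A0 + s * A1, lambda s: np.ones(2), degree=5)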

Qiqi Wang, David F. Gleich, Amin Saberi, Nasrollah Etemadi, and Parviz Moin. A Monte Carlo method for solving unsteady adjoint equations. Journal of Computational Physics, 227(12):6184-6205, June 2008. [ bib | DOI | local ]

Traditionally, solving the adjoint equation for unsteady problems involves solving a large, structured linear system. This paper presents a variation on this technique that uses a Monte Carlo linear solver. The Monte Carlo solver yields a forward-time algorithm for solving unsteady adjoint equations. When applied to computing the adjoint associated with Burgers' equation, the Monte Carlo approach is faster for a large class of problems while preserving sufficient accuracy.
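
For context, the sketch below shows a generic Ulam-von Neumann-style Monte Carlo linear solver for a fixed-point system x = Hx + f, the broad solver class the abstract refers to; it is not the paper's forward-time adjoint formulation, and all names are illustrative.

    import numpy as np

    # Generic Monte Carlo linear solver: estimate the Neumann series
    # x = f + H f + H^2 f + ... for x = H x + f using random walks whose
    # transition probabilities are |H[i, j]|, so every row of |H| must sum
    # to less than one.  A walk continues from a state with probability
    # equal to that row sum and terminates otherwise.

    def monte_carlo_solve(H, f, nwalks=5000, rng=None):
        rng = rng or np.random.default_rng(0)
        n = len(f)
        absH = np.abs(H)
        p_move = absH.sum(axis=1)             # probability of continuing a walk
        x = np.zeros(n)
        for i in range(n):
            total = 0.0
            for _ in range(nwalks):
                state, weight = i, 1.0
                total += weight * f[state]    # collision estimator, step 0
                while rng.random() < p_move[state]:
                    probs = absH[state] / p_move[state]
                    nxt = rng.choice(n, p=probs)
                    weight *= np.sign(H[state, nxt])
                    state = nxt
                    total += weight * f[state]
            x[i] = total / nwalks
        return x

    # Tiny example; compare with the direct solve of (I - H) x = f.
    H = np.array([[0.1, 0.3],
                  [0.2, -0.4]])
    f = np.array([1.0, 2.0])
    x_mc = monte_carlo_solve(H, f)
    x_exact = np.linalg.solve(np.eye(2) - H, f)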

David F. Gleich and Marzia Polito. Approximating personalized PageRank with minimal use of webgraph data. Internet Mathematics, 3(3):257-294, December 2007. [ bib | DOI | local ]

In this paper, we consider the problem of calculating fast and accurate approximations to the personalized PageRank score of a webpage. We focus on techniques to improve speed by limiting the amount of web graph data we need to access. Our algorithms provide both the approximation to the personalized PageRank score as well as guidance in using only the necessary information—and therefore sensibly reduce not only the computational cost of the algorithm but also the memory and memory bandwidth requirements. We report experiments with these algorithms on web graphs of up to 118 million pages and prove a theoretical approximation bound for all. Finally, we propose a local, personalized web-search system for a future client system using our algorithms.
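
One simple way to limit graph-data access, sketched below under assumptions of our own (a hypothetical out_links callback, a bounded breadth-first expansion, and boundary links redirected to the seed), is to run the personalized PageRank iteration only on a neighborhood of the seed page. This illustrates the general idea of working with limited web-graph data; it is not necessarily the paper's algorithms.

    import numpy as np
    from collections import deque

    # Sketch under illustrative assumptions: out_links(page) is a hypothetical
    # callback returning a page's out-links, the neighborhood of the seed is
    # expanded breadth-first up to a page budget, and any link leaving the
    # explored set is treated as a link back to the seed.

    def local_personalized_pagerank(out_links, seed, alpha=0.85,
                                    page_budget=1000, tol=1e-8, maxit=200):
        # Breadth-first expansion limited by the page budget.
        explored, queue = {seed}, deque([seed])
        while queue and len(explored) < page_budget:
            u = queue.popleft()
            for w in out_links(u):
                if w not in explored and len(explored) < page_budget:
                    explored.add(w)
                    queue.append(w)
        pages = list(explored)
        index = {p: i for i, p in enumerate(pages)}
        n = len(pages)
        # Personalized PageRank power iteration restricted to the explored pages.
        v = np.zeros(n)
        v[index[seed]] = 1.0
        x = v.copy()
        for _ in range(maxit):
            x_new = (1 - alpha) * v
            for p in pages:
                outs = out_links(p)
                if not outs:                          # dangling page: rank returns to seed
                    x_new[index[seed]] += alpha * x[index[p]]
                    continue
                share = alpha * x[index[p]] / len(outs)
                for q in outs:
                    x_new[index.get(q, index[seed])] += share   # boundary links go to seed
            if np.abs(x_new - x).sum() < tol:
                break
            x = x_new
        return dict(zip(pages, x_new))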