## CS Theory/Math Seminar

Oganizers: Elena Grigorescu and Yi Wu

 Friday Jan 31 Abram Magner Purdue U HAAS 101, 11am- 12. Title: Expected external profile of Patricia tries. Abstract: We consider PATRICIA tries on n random binary strings generated by a memoryless source with parameter p >=1/2. For both the symmetric (p=1/2) and asymmetric cases, we derive asymptotic expansions for the expected value of the external profile at level k=k(n), defined to be the number of leaves at level k. For the purposes of this talk, we focus on the range in which the expected profile grows polynomially, in which analytic techniques, including Mellin transforms, analytic depoissonization, and the saddle point method, can be applied. We will see how the path compression property in PATRICIA leads to interesting challenges not seen in the analysis of tries or digital search trees. This is joint work with Charles Knessl (Dept. Math. Stat. & Comp. Sci., University of Illinois at Chicago) and Wojciech Szpankowski. Friday Feb 7 Sebastian Cioaba U of Delaware HAAS 101, 11am-12. Title: Spectral graph theory Abstract: While they do not determine the graph in general, the eigenvalues of a graph capture many important structural properties. In this talk, I will describe some classical and more recent connections between eigenvalues and combinatorial properties of graphs. Friday Feb 28 Peng Zhang Purdue U HAAS 101, 11am-12. Title: Optimal query complexity for estimating the trace of a matrix Abstract: Given an implicit matrix A with oracle access x^TA x for any x\in \R^n, we study the query complexity of randomized algorithms for estimating the trace of the matrix. This problem has many applications in quantum physics, streaming algorithms, machine learning, and pattern matching. Two metrics are commonly used for evaluating the estimators: i) variance; ii) a high probability multiplicative-approximation guarantee. This talk will be focused on three issues: (1)In the broad class of linear nonadaptive estimators (which subsumes all the existing known estimators), we propose an estimator that is provably the minimum variance unbiased estimator. (2)We show that any estimator requires \Omega(1/\eps) queries to have a guarantee of variance at most \eps. (3)We show that any estimator requires \Omega(\frac{1}{\eps^2}\log \frac{1}{\delta}) to achieve a (1\pm\eps)-multiplicative approximation guarantee with probability at least 1-\delta. Joint work with Karl Wimmer and Yi Wu. Tue Mar 4 Kyle Kloster Purdue U HAAS 101, 3-4 pm (Note unusual day/time) Title:A sub-linear method for computing columns of functions of sparse matrices Abstract: Functions such as the exponential, the resolvent, and $p^{\text{th}}$ roots are used to analyze sparse matrices from large social networks. We present a fast algorithm, related to the Gauss-Seidel linear solver, for approximating a column of these matrix functions. Assuming a power-law degree distribution on the underlying graph, we can show that the method is sub-linear for these functions, and as a corollary we have that such columns must be local. For real-world networks with over 1 billion edges, it runs in less than 10 seconds on a standard desktop computer. Joint work with David Gleich. Wed Mar 26 Kenneth Clarkson IBM Almaden CS COLLOQUIUM. Room 3102A/B, 1:30-2:30 pm. Title: Sketching Algorithms for Numerical Linear Algebra Abstract: Many tasks in machine learning and statistics ultimately end up being problems involving matrices: whether you're matching lenders and loans in the microfinance space, or finding the key players in the bitcoin market, or inferring where tweets came from, you'll want to have a toolkit for low-rank matrix approximation, least-squares and robust regression, eigenvector analysis, CUR and non-negative matrix factorizations, and other matrix computations. *Sketching* is a way to compress matrices that preserves key matrix properties; it can be used to speed up many matrix computations. Sketching takes a given matrix A and produces a sketch matrix B that has fewer rows and/or columns than A. For a good sketch B, if we solve a problem with input B, the solution will also be pretty good for input A. For some problems, sketches can also be used to get faster ways to find high-precision solutions to the original problem. In other cases, sketches can be used to summarize the data by identifying the most important rows or columns. I will describe some sketching techniques, including recent fast ones, and some theoretical results for applying sketching to robust regression. I'll touch on experimental results and efforts to implement sketching on clusters for large-scale application. Bio: Ken Clarkson is Senior Manager of the CS theory group at IBM Almaden in San Jose, and earlier was a researcher at Bell Labs, and even earlier a student at Stanford and Pomona College. He has worked mainly in geometric algorithms, and introduced a general framework for applying randomization to geometric algorithms that is related to, but different from, the VC dimension, with applications to linear progamming, nearest neighbor searching, and building geometric structures incrementally. Other work includes algorithms for search in metric spaces, for determinants with low relative error, for a version of non-negative matrix factorization, and for asymptotically optimal meshing. More recently he has worked on geometric algorithms related to machine learning, and randomized algorithms for numerical linear algebra. Sat May 3 Many speakers Midwest Midwest theory day @ Purdue