53rd Midwest Theory Day, Fall 2006, Purdue University
Abstracts
Speaker: Rahul Shah, 11:00 - 11:30am
Title: Compressed Text Indexing and Range Searching
Abstract: We introduce two transformations {\tt Text2Points} and
{\tt Points2Text} that, respectively, convert text to points in
space and vice-versa. With these transformations, data
structural problems in pattern matching and geometric range searching can
be linked. We show strong connections between space versus query time
trade-offs in these fields. Thus, the results in range searching can be
applied to compressed indexing and vice versa. In particular, we
show that for a given equivalent space, pattern matching queries can
be done using 2-D range searching and vice-versa with query times
within a factor of $O(\log n)$ of each other. This two-way
connection enables us not only to design new data structures for
compressed text indexing, but also to derive new lower bounds.
For compressed text indexing, we propose alternative data
structures based on our {\tt Text2Points} transform and 4-sided
orthogonal query structures in 2-D. Currently, all proposed
compressed text indexes are based on the Burrows-Wheeler transform (BWT)
or its inverse~\cite{GroVit05,FerMan05,GGV03,Sad03,FMMN04}. We
observe that our {\tt Text2Points} transform is related to BWT on
blocked text, and hence we also call it \emph{geometric} BWT. With
this variant, we solve some well-known open problems in this area of
compressed text indexing. In particular, we present the first
external memory results for compressed text indexing. We give the
first compressed data structures for position-restricted pattern
matching~\cite{MakNav06,HSV06-tech}. We also show lower bounds for
these problems and for the problem of text indexing in general.
These are the first known lower bounds (hardness results) in this
area.
(Joint work with Jeff Vitter, Wing-Kai Hon, Yu-Feng Chien)
Speaker: Christine Cheng, 11:30am - 12noon
Title: On computing the distinguishing numbers of planar graphs and beyond: a counting approach
Abstract: A vertex k-labeling of graph G is distinguishing if the only automorphism that preserves
the labels of G is the identity map. The distinguishing number of G, D(G), is the smallest integer k for which G has
a distinguishing k-labeling. First introduced by Albertson and Collins in 1996, their work has since been extended
in many directions by researchers for graphs and groups.
In this talk, we describe techniques for counting the number of
inequivalent distinguishing k-labelings of a graph. Along the way, we prove that the distinguishing number
of a planar graph can be computed in time polynomial in the size of the graph.
This is joint work with V. Arvind and Nikhil Devanur.
Speaker: Aravind Srinivasan, 1pm - 2pm
Title: Approximation Algorithms via Dependent Randomized Rounding
Abstract: Randomized rounding is a very useful technique in
the design of approximation algorithms. In recent years, it
has been understood that carefully-chosen dependencies can
lead to significantly improved randomized-rounding-based
approximation algorithms in various domains. I will present
an introduction to this area, along with various applications
in combinatorial optimization.
Speaker: Chandra Chekuri, 2.30pm - 3pm
Title: Maximizing a Submodular Set Function subject to
a Matroid Constraint
Abstract: Let f be a non-decreasing submodular set function on
a ground set N and let (N, I) be a matroid where
I is the set of independent sets of the matroid.
We consider the problem of finding an independent
set S in I to maximize f(S). A greedy algorithm
achieves a ratio of 1/2 for this problem. On the other
hand a simple special case is the maximum coverage problem
for which no 1-1/e approximation is possible unless P=NP.
We improve the ratio of 1/2 to 1-1/e when f is a sum of
weighted rank functions of matroids. This class of
submodular functions is natural and captures several
interesting problems including set system coverage
type ones. We also apply this framework to obtain a 1-1/e
approximation for the generalized assignment problem
and some non-trivial variants.
Our main tools are the pipage rounding technique of
Ageev and Sviridenko and a probabilistic lemma on
submodular functions. The talk will be self contained
and expository.
Joint work with Gruia Calinescu, Martin Pal and Jan Vondrak.
Speaker: Xiaoyang Gu, 3.00pm - 3.30pm
Title: Points on Computable Curves
Abstract: The “analyst’s traveling salesman theorem” of geometric measure theory
characterizes those subsets of Euclidean space that are contained in curves of
finite length. This result, proven for the plane by Jones (1990) and extended to
higher-dimensional Euclidean spaces by Okikiolu (1992), says that a bounded set
K is contained in some curve of finite length if and only if a certain “square
beta sum”, involving the “width of K” in each element of an infinite system of
overlapping “tiles” of descending size, is finite. In this paper we
characterize those points of Euclidean space that lie on computable curves of
finite length. We do this by formulating and proving a computable extension of
the analyst’s traveling salesman theorem. Our extension, the computable
analyst’s traveling salesman theorem, says that a point in Euclidean space lies
on some computable curve of finite length if and only if it is “permitted” by
some computable “Jones constriction”. A Jones constriction here is an explicit
assignment of a rational cylinder to each of the above-mentioned tiles in such a
way that, when the radius of the cylinder corresponding to a tile is used in
place of the “width of K” in each tile, the square beta sum is finite. A point
is permitted by a Jones constriction if it is contained in the cylinder assigned
to each tile containing the point. The main part of our proof is the
construction of a computable curve of finite length traversing all the points
permitted by a given Jones constriction. Our construction uses the main ideas of
Jones’s “farthest insertion” construction, but takes a very different form,
because, having no direct access to the points permitted by the Jones
constriction, our algorithm must work exclusively with the constriction itself.
Speaker: Eric Schwabe, 3.30pm - 4.00pm
Title: Optimizing Large Data Transfers in Parity-Declustered Data Layouts
Abstract: Disk arrays allow faster access to users’ data by distributing the data across several disks and allowing parallel access.
The use of parity-declustered layouts allows disk arrays to achieve fault-tolerance that satisfies a general trade-off between
failure recovery time and the amount of space dedicated to redundant information. It has been shown that in most cases,
two performance conditions typically considered for parity-declustered data layouts cannot be jointly achieved,
limiting these layouts’ ability to take advantage of available parallelism for large data transfers.
We present improved performance guarantees for large transfers by showing how to approximately achieve both of
these performance conditions for all array configurations.
(Joint work with Ian Sutherland.)
Speaker: Dan Cranston, 4.30pm - 5.00pm
Title: List-coloring the Square of a Subcubic Graph
Abstract: The {\em square} $G2$ of a graph $G$ is the graph with the same vertex set as $G$ and
with two vertices adjacent if their distance in $G$ is at most 2.
Wegner conjectured that for a planar graph $G$ with maximum degree $\Delta(G)=3$ we have $\chi(G2)\leq 7$.
Kostochka and Woodall conjectured that for every graph, the list-chromatic number of $G2$ equals the chromatic
number of $G2$, that is $\chi_l(G2)=\chi(G2)$ for all $G$. If both conjectures are true, together they imply that
every planar graph $G$ with $\Delta(G)=3$ satisfies $\chi_l(G2)\leq 7$. We prove that every graph (not necessarily planar)
with $\Delta(G)=3$ other than the Petersen graph satisfies $\chi_l(G2)\leq 8$ (and this is best possible).
In addition, we show that if $G$ is a planar graph with $\Delta(G)=3$ and girth $g(G)\geq 7$, then $\chi_l(G2)\leq 7$.
Dvo\u{r}\'{a}k, \u{S}krekovski, and Tancer showed that if $G$ is a planar graph with $\Delta(G) = 3$ and
girth $g(G) \geq 10$, then $\chi_l(G2)\leq 6$. We improve the girth bound to show that:
if $G$ is a planar graph with $\Delta(G)=3$ and $g(G) \geq 9$, then $\chi_l(G2) \leq 6$.
All of our proofs can be easily translated into linear-time coloring algorithms.
Speaker: Matt Goto, 5.00pm - 5.30pm
Title: Finding Highest-Scoring Forbidden-Pairs Paths With Variable Vertex Weights
Abstract: In the de novo peptide sequencing problem, output data from a mass spectrometer is used to determine the peptide whose
fragmentation yielded that output spectrum. Candidate peptides are determined by finding
forbidden-pairs paths in a spectrum graph constructed from the mass spectrometer data, assigning
scores to vertices and/or edges in order to favor higher-scoring peptides. Chen et al gave an algorithm
to find the highest-scoring forbidden-pairs path in such a graph. However, in some scoring models, vertex
scores may vary depending on which incident edges are used in the path, ruling out the use of this algorithm. We give an
algorithm to solve the highest-scoring forbidden-pairs paths problem when vertex scores can vary in this way that runs in
Theta(n2 d3) time on a graph with n forbidden pairs and a vertex degree of d.
(Joint work with Eric Schwabe.)
Speaker: Tomek Czajka, 5.30pm - 6.00pm
Title: Improved Random Graph Isomorphism
Abstract: Canonical labeling of a graph consists of assigning a unique
label to each vertex such that the labels are invariant under
isomorphism. Such a labeling can be used to solve the graph
isomorphism problem. We give a simple, linear time, high
probability algorithm for the canonical labeling of a $G(n,p)$
random graph for $p \in \left[\w{\ln^4 n / n\ln\ln n}, 1 -
\w{\ln^4 n / n\ln\ln n}\right]$. Our result covers a gap in the
range of $p$ in which no algorithm was known to work with high
probability. Together with a previous result by Bollobas, the
random graph isomorphism problem can be solved efficiently for $p
\in [\T{\ln n/n}, 1-\T{\ln n/n}]$.
(Joint work with Gopal Pandurangan.)