1. A convex optimization approach for identification of human tissue-specific interactomes

    Analysis of organism-specific interactomes has yielded novel insights into cellular function and coordination, understanding of pathology, and identification of markers and drug targets. Genes, however, can exhibit varying levels of cell-type specificity in their expression, and their coordinated expression manifests in tissue-specific function and pathology. Tissue-specific/selective interaction mechanisms have significant applications in drug discovery, as they are more likely to reveal drug targets. Furthermore, tissue-specific transcription factors (tsTFs) are significantly implicated in human disease, including cancers. Finally, disease genes and protein complexes tend to be differentially expressed in the tissues in which their defects cause pathology. These observations motivate the construction of refined tissue-specific interactomes from organism-specific interactomes.

    We present a novel technique for constructing human tissue-specific interactomes. Using a variety of validation tests (ESEA, GO Enrichment, Disease-Gene Subnetwork Compactness), we show that our proposed approach significantly outperforms state-of-the-art techniques. Finally, using case studies of Alzheimer's and Parkinson's diseases, we show that tissue-specific interactomes derived from our study can be used to construct pathways implicated in pathology, and we demonstrate the use of these pathways in identifying novel targets.


  2. Triangular Alignment (TAME): A Tensor-based Approach for Higher-order Network Alignment

    Network alignment is an important tool with extensive applications in comparative interactomics. Traditional approaches to alignment of biomolecular networks aim to maximize edge conservation (as a measure of topological similarity), as well as prior information on the underlying similarity of aligned entities (e.g., known similarity of nodes). We propose a novel formulation of the network alignment problem that extends topological similarity to higher-order structures, and provide a new objective function that maximizes the number of aligned substructures. This objective function corresponds to an integer programming problem, which does not scale to real-world networks. Consequently, we approximate this objective with a surrogate function whose maximization results in a tensor eigenvalue problem.

    Based on this formulation, we present an algorithm called Triangular AlignMEnt (TAME), which attempts to maximize the number of aligned triangles across networks. We focus on alignment of triangles because of their enrichment in different classes of biological networks; however, our formulation and resulting algorithms can be applied to general motifs. Using a case study on the synthetic NAPABench dataset, we show that TAME is capable of producing alignments with up to 99% accuracy in terms of aligned nodes. We further evaluate our method by aligning yeast and human interactomes. Our results indicate that TAME outperforms state-of-the-art alignment methods in terms of both the biological and the topological quality of the alignments.
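
The quantity that this objective rewards, the number of triangles conserved under an alignment, can be illustrated with a minimal sketch. This is not the TAME algorithm itself (which relaxes the integer program to a tensor eigenvalue problem); it only evaluates a given alignment. The function names `triangles` and `conserved_triangles`, the edge-list input format, and the toy graphs are assumptions made for illustration.

```python
def triangles(edges):
    """Return all triangles of an undirected graph as frozensets of three nodes."""
    adjacency = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops; they cannot form triangles
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    found = set()
    for u, v in edges:
        if u == v:
            continue
        # Any common neighbor w of u and v closes the triangle (u, v, w).
        for w in adjacency[u] & adjacency[v]:
            found.add(frozenset((u, v, w)))
    return found


def conserved_triangles(edges_g, edges_h, alignment):
    """Count triangles of G whose aligned images form a triangle in H.

    `alignment` maps nodes of G to nodes of H (hypothetical input format).
    """
    triangles_h = triangles(edges_h)
    count = 0
    for tri in triangles(edges_g):
        if not all(node in alignment for node in tri):
            continue  # triangle touches an unaligned node
        image = frozenset(alignment[node] for node in tri)
        if len(image) == 3 and image in triangles_h:
            count += 1
    return count


# Toy example: two small graphs with two triangles each.
G = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("b", "d")]
H = [(1, 2), (2, 3), (1, 3), (3, 4), (2, 4)]
good = {"a": 1, "b": 2, "c": 3, "d": 4}
bad = {"a": 2, "b": 1, "c": 4, "d": 3}
print(conserved_triangles(G, H, good), conserved_triangles(G, H, bad))
```

An alignment method in this spirit would search over alignments to make this count large; exhaustive search is infeasible for real interactomes, which is why a relaxation is needed.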


  3. A Critical Survey of Deconvolution Methods for Separating Cell-Types in Complex Tissues

    Identifying properties and concentrations of components from an observed mixture, known as deconvolution, is a fundamental problem in signal processing. It has diverse applications in fields ranging from hyperspectral imaging to denoising readings from biomedical sensors. This paper focuses on in-silico deconvolution of signals associated with complex tissues into their constituent cell-type specific components, along with a quantitative characterization of the cell-types. Deconvolving mixed tissues/cell-types is useful in the removal of contaminants (e.g., surrounding cells) from tumor biopsies, as well as in monitoring changes in the cell population in response to treatment or infection. In these contexts, the observed signal from the mixture of cell-types is assumed to be a convolution, using a linear instantaneous (LI) mixing process, of the expression levels of genes in constituent cell-types. The goal is to use known signals corresponding to individual cell-types along with a model of the mixing process to cast the deconvolution problem as a suitable optimization problem.
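
As a rough illustration of the linear mixing model, suppose the mixture expression vector m is approximately G c, where G holds one reference expression profile per cell-type (columns) and c holds the unknown cell-type proportions. The sketch below estimates c by least squares with a non-negativity constraint, solved with plain projected gradient descent. The non-negativity constraint, the solver, the function names (`mix`, `deconvolve_nonnegative`), and the numbers are all assumptions chosen for illustration, not the specific methods evaluated in the paper.

```python
def mix(signatures, proportions):
    """Linear mixing: each gene's mixture level is a weighted sum over cell-types."""
    return [sum(g * c for g, c in zip(row, proportions)) for row in signatures]


def deconvolve_nonnegative(signatures, mixture, iterations=500):
    """Estimate proportions c >= 0 minimizing 0.5 * ||G c - m||^2.

    Uses projected gradient descent. The step size is 1 / ||G||_F^2, which is
    a safe bound because the squared Frobenius norm is at least the largest
    eigenvalue of G^T G.
    """
    n_types = len(signatures[0])
    step = 1.0 / sum(g * g for row in signatures for g in row)
    c = [0.0] * n_types
    for _ in range(iterations):
        residual = [p - m for p, m in zip(mix(signatures, c), mixture)]
        # Gradient of the squared loss: G^T (G c - m).
        gradient = [
            sum(row[k] * r for row, r in zip(signatures, residual))
            for k in range(n_types)
        ]
        # Gradient step followed by projection onto the non-negative orthant.
        c = [max(0.0, ck - step * gk) for ck, gk in zip(c, gradient)]
    return c


# Hypothetical reference profiles: 4 genes (rows) x 2 cell-types (columns).
G = [[10.0, 1.0], [8.0, 2.0], [1.0, 9.0], [2.0, 7.0]]
true_c = [0.3, 0.7]
m = mix(G, true_c)  # noise-free synthetic mixture
estimate = deconvolve_nonnegative(G, m)
print([round(x, 4) for x in estimate])
```

In practice one would likely use a numerical library and a dedicated non-negative least squares solver; the pure-Python loop is only meant to make the optimization view of deconvolution concrete.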

    In this paper, we present a survey and in-depth analysis of models, methods, and assumptions underlying deconvolution techniques. We investigate the choice of loss functions for evaluating estimation error, constraints on solutions, preprocessing and data filtering, feature selection, and regularization to enhance the quality of solutions, along with the impact of these choices on the performance of commonly used regression-based methods for deconvolution. We assess different combinations of these factors and use detailed statistical measures to evaluate their effectiveness. Some of these combinations have been proposed in the literature, whereas others represent novel algorithmic choices for deconvolution. We identify shortcomings of current methods and avenues for further investigation. For many of the identified shortcomings, such as normalization issues and data filtering, we provide new solutions. We summarize our findings in a prescriptive step-by-step process, which can be applied to a wide range of deconvolution problems.
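
To make the role of the loss function concrete, the sketch below compares three common regression losses on the same residual vector. The particular losses (squared, absolute, Huber), the threshold `delta`, and the residual values are assumptions for illustration; the abstract does not name the losses studied.

```python
def squared_loss(residuals):
    """Sum of squared residuals; heavily penalizes large errors."""
    return sum(r * r for r in residuals)


def absolute_loss(residuals):
    """Sum of absolute residuals; grows only linearly with error size."""
    return sum(abs(r) for r in residuals)


def huber_loss(residuals, delta=1.0):
    """Quadratic for |r| <= delta, linear beyond it."""
    total = 0.0
    for r in residuals:
        a = abs(r)
        if a <= delta:
            total += 0.5 * a * a
        else:
            total += delta * (a - 0.5 * delta)
    return total


# Two small residuals and one outlier (hypothetical values).
residuals = [0.5, -0.5, 10.0]
print(squared_loss(residuals), absolute_loss(residuals), huber_loss(residuals))
```

The single outlier dominates the squared loss far more than the other two, which is one reason the choice of loss can change which proportions a regression-based method prefers.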
