# netalign : Network Alignment codes

### Mohsen Bayati, David F. Gleich, Amin Saberi, Ying Wang

Supporting codes for the manuscript: Message Passing Algorithms for Sparse Network Alignment.

This package contains datasets and programs to solve network alignment problems written in the Matlab language.

These codes are research prototypes and may not work for you. No promises.

## Overview

The package is organized by directory

data
datasets for the experiments
experiments
script files that use and evaluate the algorithms
matlab
code for all the main algorithms
private_data
a historically named directory for our data files

## Usage

With a recent version of Matlab (2008b used for development), the best way to use the package is to open Matlab, navigate to the matlab directory, and then execute the following commands to solve the network alignment problem for Figure 1.

% load A, B, and L for figure 1

% create S, w, and a variant of L from graph A, B, and L
>> [S,w,li,lj] = netalign_setup(A,B,L);

% use S, w, L, and alpha=0, beta=1 and the bp algorithm
>> x = netalign_bp(S,w,0,1,li,lj);

% x is just a heuristic, so we need to round it
>> [ma mb mi overlap weight] = mwmround(x,S,w,li,lj);
% [ma mb] give pairs of matched vertices
% mi is the binary matching indicator for L (as li, lj)
% overlap and weight are the overlap and weight objectives

If you want to get more information about how to use the tools, see the matlab/demo.m script.

## Algorithms

All of the algorithms and codes are contained in the matlab directory.

AlgorithmCodeSource
IsoRankfull_isorank.mSingh et al. 2007
SpaIsoRankisorank.m
NetAlginBPnetalignbp.m
Exact enumerationnetalign_exact.m
NetAlignMRnetalignmr.mKlau 2009
LPnetalign_lp_prob.mKlau 2009
Lagrangean LPnetalign_llp.mKlau 2009

## Data

DatasetFileSource
lcsh2wiki-smallprivate_data/lcsh2wiki-small.mat
lcsh2wiki-fullprivate_data/lcsh2wiki_full.mat
dmela-scere[ available with IsoRank, see below ]Singh et al. 2008
musm-homo[ available with Natalie, see below ]Klau 2009

#### lcsh2wiki

These data sets only come with S, w, li, and lj. The original datasets are considerably larger (hundreds of megabytes). Please contact us if you want them. None of the experiments require them.

#### dmela-scere

These datasets are distributed with IsoRank in the file http://groups.csail.mit.edu/cb/mna/packages/multiway_kpartite.tgz .

Use the python program experiments/bioinfo/convert_isorank_data.py to generate dmela-scere.smat dmela.smat and scere.smat which are L, A, and B, respectively. The program readSMAT.m will load these files into Matlab.

See http://groups.csail.mit.edu/cb/mna/ http://groups.csail.mit.edu/cb/mna/ for more about IsoRank.

After converting the data to smat with the above script, execute

A = readSMAT('dmela.smat');
[S,w,li,lj] = netalign_setup(A,B,L);
save '../../data/dmela-scere.mat' A B L S w li lj

to save the data for use with the experiments.

#### musm-homo

These datasets are distributed with Natalie in the file https://www.mi.fu-berlin.de/wiki/pub/LiSA/Natalie/natalie-0.9.tgz .

Use the perl script experiments/bioinfo/parse_natalie.pl in the to extract the files and then load_natalie.m to read the data in Matlab.

In the experiments/bioinfo directory, execute

load_natalie
[S,w,li,lj] = netalign_setup(A,B,L);
save '../../data/natalie_graphs.mat' A B L S w li lj

to save the data for use with the experiments.

## Experiments

• The synthetic experiments require the gaimc package : gaimc, but it is distributed in the experiments/misc/gaimc directory
• For linear programming, we used Clp and a matlab interface, see Clp
• Before running the power law experiments, please compile RandomPowerLaw.cpp to a.out (g++ RandomPowerLaw.cpp ought to work)
ExperimentDescriptionFigure
misc/figure_2.mGenerate figure 2Figure 2
bioinfo/bioinfo.mPlaying with musm-homo and dmela-scere datasets
evaluation/evaluate_all_problems.mGenerate results for figures 7 and 8 and table 4, plot with evaluation_plots.m and evaluation_table.mFigures 7, 8, Table 4
grid_test/grid_test.mSynthetic grid experiments, plot with plot_grid_test.mFigure 6
lcsh2rameau/rameau2lcsh.mRameau and LCSH experiments, make table with rameau2lcsh_table.mTable 4
powerlaw/powerlaw.mSynthetic power law experiments, plot with plot_powerlaw.mFigure 6
rounding/round_test.mVariations on rounding schemes

## References

• IsoRank http://isorank.csail.mit.edu
• Natalie https://www.mi.fu-berlin.de/w/LiSA/Natalie