INDEXING and MAPPING of PROTEINS by SAMMON'S PROJECTION ALGORITHM

By: Izydor Apostol (Baxter Inc.), Wojciech Szpankowski

We developed a modified Sammon algorithm to display relationships between proteins based on their amino acid composition. In the first stage of the method, a 19-dimensional compositional space of representative proteins is mapped into a 2-dimensional space (2-D) using the original Sammon projection, creating a contour map. In the second stage, this contour map is used as a reference for new proteins projected into 2-D. Data analysis showed that proteins belonging to the same structural class formed characteristic and distinct clusters, which could potentially be useful in predicting protein structural classes. However, we also observed significant overlap between the clusters, which may explain the limited success of previous protein folding predictions based solely on amino acid composition. Regardless, the modified Sammon projection can generate a unique index for each individually projected protein, related to its amino acid composition, which may be a useful tool in the exploratory classification of proteins.

The method and results are described in detail in our paper: I. Apostol and W. Szpankowski, Indexing and Mapping of Proteins Using a Modified Nonlinear Sammon's Projection, Journal of Computational Chemistry, June 1999.
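To make the two stages concrete, here is a minimal Python sketch. It is not the program in sammon.zip and not necessarily the exact procedure of the paper. It minimizes the standard Sammon stress with a simple adaptive gradient descent (rather than Sammon's original second-order update), and it reads the second stage as "hold the reference map fixed and optimize only the new point", which is an assumption. The function names, the synthetic data, and the choice to obtain 19 dimensions by dropping the 20th amino acid fraction are all hypothetical.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero for coincident points

def pairwise_dist(A, B):
    """Euclidean distances between every row of A and every row of B."""
    diff = A[:, None, :] - B[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def map_stress(D_hi, Y):
    """Sammon stress of a 2-D layout Y against high-dimensional distances D_hi."""
    iu = np.triu_indices(len(Y), k=1)
    d_hi, d_lo = D_hi[iu], pairwise_dist(Y, Y)[iu]
    return ((d_hi - d_lo) ** 2 / (d_hi + EPS)).sum() / d_hi.sum()

def map_grad(D_hi, Y):
    """Gradient of map_stress with respect to every row of Y."""
    c = D_hi[np.triu_indices(len(Y), k=1)].sum()
    D_lo = pairwise_dist(Y, Y)
    w = (D_hi - D_lo) / (D_hi * D_lo + EPS)
    np.fill_diagonal(w, 0.0)
    diff = Y[:, None, :] - Y[None, :, :]
    return (-2.0 / c) * (w[:, :, None] * diff).sum(axis=1)

def point_stress(d_hi, Y_ref, y):
    """Stress of one new point y against a fixed reference map Y_ref."""
    d_lo = np.sqrt(((Y_ref - y) ** 2).sum(axis=1))
    return ((d_hi - d_lo) ** 2 / (d_hi + EPS)).sum() / d_hi.sum()

def point_grad(d_hi, Y_ref, y):
    """Gradient of point_stress with respect to y."""
    d_lo = np.sqrt(((Y_ref - y) ** 2).sum(axis=1))
    w = (d_hi - d_lo) / (d_hi * d_lo + EPS)
    return (-2.0 / d_hi.sum()) * (w[:, None] * (y - Y_ref)).sum(axis=0)

def descend(y, stress_fn, grad_fn, n_iter, step=1.0):
    """Gradient descent that only accepts steps which lower the stress."""
    s = stress_fn(y)
    for _ in range(n_iter):
        y_try = y - step * grad_fn(y)
        s_try = stress_fn(y_try)
        if s_try < s:
            y, s, step = y_try, s_try, step * 2.0  # accept and grow the step
        else:
            step *= 0.5                            # reject and shrink the step
    return y

def sammon_map(X, Y0, n_iter=300):
    """Stage 1: lay out all reference points in 2-D, starting from Y0."""
    D_hi = pairwise_dist(X, X)
    return descend(Y0, lambda Y: map_stress(D_hi, Y),
                   lambda Y: map_grad(D_hi, Y), n_iter)

def project_new(x, X_ref, Y_ref, n_iter=300):
    """Stage 2 (assumed reading): place one new point, reference map fixed."""
    d_hi = pairwise_dist(x[None, :], X_ref)[0]
    y0 = Y_ref[np.argmin(d_hi)] + 1e-3  # start next to the nearest reference
    return descend(y0, lambda y: point_stress(d_hi, Y_ref, y),
                   lambda y: point_grad(d_hi, Y_ref, y), n_iter)

# Synthetic stand-in for composition vectors, NOT real protein data:
# 20 fractions summing to 1, with the last one dropped to give 19 dimensions.
rng = np.random.default_rng(0)
X_ref = rng.dirichlet(np.ones(20), size=30)[:, :19]
Y0 = rng.normal(scale=0.1, size=(30, 2))
Y_ref = sammon_map(X_ref, Y0)             # stage 1: reference map
x_new = rng.dirichlet(np.ones(20))[:19]
y_new = project_new(x_new, X_ref, Y_ref)  # stage 2: 2-D position of a new point
```

Because the reference map is held fixed in the second stage, the 2-D position of a new point depends only on its own composition and the reference set, which is consistent with using that position as a per-protein index. The step-size rule here is chosen for simplicity and is not tuned for convergence speed.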

In addition, the program we used to produce our results is available for download as sammon.zip. Unzip the archive before using it.