- CRIS: The Computational Research Infrastructure for Science (CRIS)
is a system whose primary tenets are to provide an easy-to-use,
scalable, and collaborative scientific workflow and data management
cyberinfrastructure (CI) for scientists lacking extensive
computational expertise. CRIS currently has a community of users in
Agronomy, Biochemistry, Bioinformatics, and Biology at Purdue
University.
- Ionomics Atlas: Ionomics Atlas provides an intuitive, Google
Maps-based graphical user interface that allows users to access,
analyze, interpret, and find correlations among three kinds of
information about Arabidopsis thaliana plant populations: ionomic,
genetic, and environmental. We are currently adding statistical tools
to the system to give biologists more control over tailored datasets
of accessions.
- Bionet: Conducting research on an Interactive Visualization and Data
Mining (IVDM) platform, namely JSysnet. We demonstrate how Bionet is
used to interactively analyze intermolecular correlations using
various statistical methods and to perform interactive comparative
and correlative analysis of molecular expression data.
- Local Search Recommendation System: Creating a recommendation system
for local search based on item-set similarity between businesses. The
recommender relies on query logs to identify similar businesses.
- Pervasive Computing and Social Networking: Enhancing the user-level
experience in pervasive smart spaces by utilizing social networking
information. The system aims to capture and provide recommendations
based on the proximity of users who share the same interests.
- Tag Recommendation System for BibSonomy: Creating a k-partite graph
technique for recommending tags during the bookmarking of resources,
whether they are internet bookmarks or bibliographic entries that
represent the literature.
- Spam Detection in Social Networking: Using machine learning
techniques to identify spam in social bookmarking systems such as
“BibSonomy”. We were motivated to identify the features that define
spam bookmarks. Perl was used for parsing the dataset files and for
evaluating the system.
- An Automated Method for Arabic Text Document Filtering: Automatic
filtering of documents by learning the user's topic-document
associations. We use NLP and information retrieval techniques to
create a feature vector that represents a document. We then use a
classifier such as a Support Vector Machine to learn this model and
provide topic judgments for unseen documents.
- Pervasive Open Spaces: Open Spaces is a pervasive smart-space
environment that harnesses the power of scalable systems in terms of
available resources. Users requesting resources in the Open Space are
bound neither by their current location nor by their current cluster
(dome).
- Optimized Methodology for Arabic Cross-Document Named Entity
Normalization: Utilizing a machine learning approach based on an SVM
classifier coupled with preprocessing rules for cross-document named
entity normalization. The process involves disambiguating different
entities with common name mentions and normalizing identical entities
with different name mentions.
- Bionoculars: Bionoculars is a system that automatically extracts
interactions between entities such as proteins, chemicals, and
diseases from biomedical text, namely biomedical journal articles and
abstracts, with a preliminary focus on protein-protein interactions.
- Machine Assisted Translation: The machine-assisted human translation
(MAHT) system aims to assist translators by providing auto-completion
suggestions for the document being translated. MAHT uses machine
translation capabilities and probabilistic models based on the
document at hand to provide accurate translations to the user.
- Information Retrieval on Genomics Data: Classifying documents that
contain experimental evidence allowing the assignment of Gene Ontology
codes. This work is part of the TREC 2004 Genomics workshop. All
scripts for parsing the dataset were developed in Perl.