Implementation of Privacy-Preserving Data Mining Protocols

Purdue has an NSF-funded project on Privacy-Preserving Distributed Data Mining. The project has produced numerous algorithms for generating data mining results without disclosing the data used; however, few have been implemented. Implementation turns up new challenges and provides an opportunity to validate the theoretical work, learn the tradeoffs between various cryptographic techniques, and experiment with performance. We have an implementation of one privacy-preserving distributed data mining algorithm as an extension to Weka, an open-source data mining toolkit. Using this implementation and the published papers on other algorithms as a guide, you will implement and evaluate additional privacy-preserving data mining algorithms, refining their details as you go.

In performing this project, you will learn the details of data mining algorithms, learn tools and techniques for cryptography, and develop skills in programming distributed applications.

This is a paid research position, for undergraduates only. Funding is from a National Science Foundation Research Experiences for Undergraduates supplemental award.

Background

This project requires an understanding of data mining algorithms and cryptography, along with Java programming skills. While few students will have background in all of these, some background is expected. For example, having taken CS 426, CS 490D, and CS 381 would be ideal, but two of the three, reasonable programming skills, and a desire to learn should be sufficient.

Expectations

What you will produce will include:

Process

You are expected to meet at least weekly with one of the personnel on the project.

If interested, please contact Professor Chris Clifton or Mikhail Atallah.
