Projects for CS590D

Themes for Projects: These are just a sampler. You can work in groups of 2-3, depending on the complexity/magnitude of the project.
  1. Enhancing the expressiveness of query languages such as SQL by embedding "mining" functions in them.

    This would entail adding user-defined functions or operators that call a data mining routine, either written by you or available as a standard off-the-shelf component. It also involves a performance evaluation of your implementation. Medium to heavy programming.

  2. A complete case study of a particular application as a domain for data mining. This would involve feasibility analysis, data cleaning, choosing the right tool or software, conducting the data mining, and reporting and visualizing the results. Light programming.

  3. Research into the design or improvement of a specific data mining algorithm, or delving into the parallelization of an existing algorithm. You would need to generate datasets for performance evaluation and run comparative experiments. Medium programming.

  4. Any ideas for theoretical or analytic work that lies at the intersection of subareas or subdisciplines of data mining and its constituent topics.

  5. Integration of data mining systems into various kinds of "host" environments: tight coupling of inductive functionality into existing applications, preferably using a commercial RDBMS or OLAP tool. Medium to heavy programming.

  6. "Closing the loop": data mining is never a single step but an iterative process involving several modules. Novel ideas and suggestions for building a closed-loop data mining system are also welcome. This would involve interfacing tools with one another. Medium programming.
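
As a rough illustration of theme 1, the sketch below registers a toy "mining" function with SQLite through Python's standard sqlite3 module and calls it from SQL to compute the support of an itemset. The table `baskets`, the function `contains_all`, the helper names, and the comma-separated item encoding are all assumptions made for this example only; a real project would target the RDBMS and mining routine of your choice.

```python
import sqlite3


def contains_all(basket: str, itemset: str) -> int:
    """Return 1 if every item in `itemset` appears in `basket`, else 0.

    Both arguments are comma-separated strings (an assumption of this sketch).
    """
    basket_items = {item.strip() for item in basket.split(",")}
    wanted = {item.strip() for item in itemset.split(",")}
    return int(wanted <= basket_items)


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical `baskets` table."""
    conn = sqlite3.connect(":memory:")
    # Expose the Python function to SQL as a two-argument scalar function.
    conn.create_function("contains_all", 2, contains_all)
    conn.execute("CREATE TABLE baskets (id INTEGER PRIMARY KEY, items TEXT)")
    conn.executemany(
        "INSERT INTO baskets (items) VALUES (?)",
        [("bread,milk",), ("bread,butter",), ("milk,bread,eggs",), ("eggs",)],
    )
    return conn


def support(conn: sqlite3.Connection, itemset: str) -> float:
    """Fraction of baskets that contain every item in `itemset`."""
    row = conn.execute(
        "SELECT AVG(contains_all(items, ?)) FROM baskets", (itemset,)
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    conn = build_demo_db()
    print(support(conn, "bread,milk"))
```

The same pattern extends to aggregates (via `create_aggregate`) if the mining routine needs to see a whole group of rows rather than one row at a time; measuring how such functions scale with table size would be part of the performance evaluation the theme asks for.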

