Project 2: Classification

Start date: 20 February. Due: 5 March, at the beginning of class.

Your task for this project is to extend the ID3 classifier (provided in the Weka package) to support postpruning.
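As a rough illustration of what postpruning involves, the sketch below applies reduced-error pruning to a toy decision tree. It is not Weka code and not a required method: the `PruningSketch` class, its `Node` type, and the demo data are all hypothetical, and reduced-error pruning is only one possible choice of technique. The idea is to walk the tree bottom-up and replace a subtree with a leaf whenever the leaf makes no more errors than the subtree on a held-out validation set.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy tree over nominal attributes. Hypothetical; NOT Weka's Id3 class. */
final class PruningSketch {

    /** A node is a leaf when it has no children. */
    static final class Node {
        int attribute = -1;                           // attribute index tested here (-1 for a leaf)
        Map<String, Node> children = new HashMap<>(); // attribute value -> subtree
        String majorityClass;                         // most common training label at this node

        Node(int attribute, String majorityClass) {
            this.attribute = attribute;
            this.majorityClass = majorityClass;
        }

        boolean isLeaf() { return children.isEmpty(); }
    }

    /** Predict a label; an unseen attribute value falls back to the node's majority class. */
    static String classify(Node node, String[] instance) {
        while (!node.isLeaf()) {
            Node next = node.children.get(instance[node.attribute]);
            if (next == null) break;
            node = next;
        }
        return node.majorityClass;
    }

    /** Reduced-error pruning, bottom-up, using only the validation rows that reach each node. */
    static void prune(Node node, List<String[]> validation, int classIndex) {
        if (node.isLeaf()) return;
        for (Map.Entry<String, Node> e : node.children.entrySet()) {
            List<String[]> subset = new ArrayList<>();
            for (String[] row : validation) {
                if (e.getKey().equals(row[node.attribute])) subset.add(row);
            }
            prune(e.getValue(), subset, classIndex);
        }
        int subtreeErrors = 0;
        int leafErrors = 0;
        for (String[] row : validation) {
            if (!classify(node, row).equals(row[classIndex])) subtreeErrors++;
            if (!node.majorityClass.equals(row[classIndex])) leafErrors++;
        }
        // Ties favor the smaller tree; a node no validation row reaches is also pruned.
        if (leafErrors <= subtreeErrors) {
            node.children.clear();
            node.attribute = -1;
        }
    }

    /** Hand-built demo tree: attribute 0 = outlook, attribute 1 = windy. */
    static Node demoTree() {
        Node root = new Node(0, "play");
        Node rainy = new Node(1, "stay");
        rainy.children.put("yes", new Node(-1, "stay"));
        rainy.children.put("no", new Node(-1, "play"));
        root.children.put("sunny", new Node(-1, "play"));
        root.children.put("rainy", rainy);
        return root;
    }

    /** Made-up validation rows: outlook, windy, class label (index 2). */
    static List<String[]> demoValidation() {
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[] {"sunny", "no", "play"});
        rows.add(new String[] {"sunny", "yes", "play"});
        rows.add(new String[] {"rainy", "yes", "stay"});
        rows.add(new String[] {"rainy", "no", "stay"});
        rows.add(new String[] {"rainy", "no", "stay"});
        return rows;
    }

    public static void main(String[] args) {
        Node root = demoTree();
        String[] query = {"rainy", "no"};
        System.out.println("before pruning: " + classify(root, query));
        prune(root, demoValidation(), 2);
        System.out.println("after pruning:  " + classify(root, query));
    }
}
```

In this made-up example the "windy" split under "rainy" misclassifies two of the three validation rows that reach it, so it collapses into a single leaf, while the root split is kept because removing it would add errors. A real solution would need to hold out part of the training data for validation (or use an error estimate that does not need one) and operate on the tree that the Weka ID3 classifier actually builds.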

Use the Iris and Adult datasets from the UCI Machine Learning Repository for this task. You are welcome to try other datasets, but the results you turn in should be based on these two.

Project Report

The project report should contain the following:

  1. Description of the method used (e.g., cost-based pruning).
  2. Documentation on how to use your class (which should probably inherit from weka.classifiers.trees.Id3).
  3. Sample run and results.
  4. Commentary: Does it work well (e.g., accuracy, efficiency)? What do you think are the advantages/disadvantages? If you were to do it again, what would you do differently?

Also turn in your code (obviously).

Scoring

Scoring will be based on:

Turning in the project

Electronic submission is preferred. Please use the turnin command (on mentor.ics.purdue.edu, turnin -c cs490d -p proj2 directoryname). If that doesn't work, you can tar/zip your files and email them to clifton_nospam@cs_nojunk.purdue.edu. PDF is the safest format for capturing non-text content. Hard copy is acceptable; please hand it in at the beginning of class.

