**Begin now**. We estimate that this assignment will take
2 weeks, so if you don't start now, you may have to work during Spring break. Don't expect much response from
the instructors in the last eight hours before it is due, either.

*Late Policy:* Late work will be penalized 10% per day (24-hour
period). This penalty will apply except in case of documented
emergency (e.g., medical emergency), or by prior arrangement.

In this assignment, you will develop different algorithms to make recommendations for movies. You are free to choose any programming language you like, such as C/C++ or Java; however, Matlab is highly recommended. For more detailed information on Matlab, please see the following tutorial: http://www.math.mtu.edu/~msgocken/intro/intro.html. You can access Matlab from the computers in the lab by running: /p/matlab/bin/matlab

The training data: a set of movie ratings by 200 users (userid: 1-200) on 1000 movies (movieid: 1-1000). The data is stored in a 200 row x 1000 column table. Each row represents one user. Each column represents one movie. A rating is a value in the range of 1 to 5, where 1 is "least favored" and 5 is "most favored". Please NOTE that a value of 0 means that the user has not explicitly rated the movie.

Please download the training data here: train.txt.
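If you work in Python rather than Matlab, the table described above could be loaded along the following lines. This is only a sketch: it assumes that train.txt stores the 200 x 1000 table as whitespace-separated numbers, one user per line, which the handout does not state, and the function names `load_ratings` and `user_means` are made up for illustration.

```python
import numpy as np

def load_ratings(path="train.txt"):
    """Load the rating table; a value of 0 means 'not rated'."""
    # Assumption: whitespace-separated values, one user per row.
    ratings = np.atleast_2d(np.loadtxt(path))
    rated = ratings > 0  # True where the user gave an explicit rating
    return ratings, rated

def user_means(ratings, rated):
    """Mean of each user's explicit ratings, ignoring the 0 placeholders."""
    counts = rated.sum(axis=1)
    sums = (ratings * rated).sum(axis=1)
    # Users with no ratings get a mean of 0 instead of a division error.
    return np.divide(sums, counts, out=np.zeros(len(counts)), where=counts > 0)
```

Keeping the boolean mask `rated` next to the ratings makes it harder to accidentally average the 0 placeholders into a user's mean.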

For more detailed information you can refer to Breese, J. S., Heckerman, D., & Kadie, C. (1998). Empirical Analysis of Predictive Algorithms for Collaborative Filtering. (pdf)
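As a starting point, a user-based (memory-based) predictor in the spirit of the paper above might look like the sketch below. It weights other users by Pearson correlation over co-rated movies and adds the weighted deviations from their means to the active user's mean. The function names are hypothetical, and choices such as using each user's mean over all of their rated movies are assumptions, not requirements of the assignment.

```python
import numpy as np

def pearson_weights(ratings, rated, a):
    """Pearson correlation between user `a` and every user over co-rated movies."""
    means = np.array([r[m].mean() if m.any() else 0.0
                      for r, m in zip(ratings, rated)])
    centered = (ratings - means[:, None]) * rated   # 0 where unrated
    common = rated & rated[a]                       # movies both users rated
    num = (centered * centered[a] * common).sum(axis=1)
    den = np.sqrt(((centered ** 2) * common).sum(axis=1) *
                  ((centered[a] ** 2) * common).sum(axis=1))
    w = np.divide(num, den, out=np.zeros_like(num), where=den > 0)
    w[a] = 0.0  # the active user does not vote for themselves
    return w, means

def predict(ratings, rated, a, j):
    """Predict user `a`'s rating of movie `j` from correlated users who rated `j`."""
    w, means = pearson_weights(ratings, rated, a)
    voters = rated[:, j] & (w != 0)
    norm = np.abs(w[voters]).sum()
    if norm == 0:
        return means[a]  # no usable neighbours: fall back to the user's mean
    deviations = ratings[voters, j] - means[voters]
    return means[a] + (w[voters] * deviations).sum() / norm
```

Here `ratings` is a float array and `rated` is the boolean mask `ratings > 0`. The prediction is not clipped to the 1 to 5 range; whether to clip is a design decision worth discussing in your report.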

For more detailed information you can refer to Sarwar, B., Karypis, G., Konstan, J., & Riedl, J. (2001). Item-Based Collaborative Filtering Recommendation Algorithms. (pdf)
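An item-based predictor along the lines of the paper above could be sketched as follows. This version uses plain cosine similarity between movie columns, with unrated entries treated as 0, and a weighted sum over the `k` most similar movies the user has rated. The names, the default `k=20`, and the fallbacks are illustrative assumptions; the paper also discusses other similarity measures that you may prefer.

```python
import numpy as np

def item_cosine(ratings):
    """Cosine similarity between movie columns; unrated entries (0) add nothing."""
    norms = np.linalg.norm(ratings, axis=0)
    norms[norms == 0] = 1.0         # avoid dividing by zero for unrated movies
    unit = ratings / norms
    sim = unit.T @ unit
    np.fill_diagonal(sim, 0.0)      # a movie is not its own neighbour
    return sim

def predict_item_based(ratings, sim, u, j, k=20):
    """Weighted average of user `u`'s ratings on the movies most similar to `j`."""
    rated_by_u = np.flatnonzero(ratings[u] > 0)
    if rated_by_u.size == 0:
        # Assumed fallback: mean of all explicit ratings in the table.
        return float(ratings[ratings > 0].mean())
    # Keep the k most similar movies among those the user has rated.
    order = np.argsort(sim[j, rated_by_u])[::-1][:k]
    top = rated_by_u[order]
    weights = sim[j, top]
    total = np.abs(weights).sum()
    if total == 0:
        return float(ratings[u, rated_by_u].mean())
    return float((weights * ratings[u, top]).sum() / total)
```

The similarity matrix only needs to be computed once per training set, which matters when you compare the running time of this approach with the user-based one.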

You can try different extensions of the memory-based or model-based algorithm (e.g., algorithms in the above paper).

You can also try different model-based methods. Some references: Hofmann, T., & Puzicha, J. (1999). Latent Class Models for Collaborative Filtering. In Proceedings of the International Joint Conference on Artificial Intelligence. (pdf)

Pennock, D. M., Horvitz, E., Lawrence, S., & Giles, C. L. (2000). Collaborative Filtering by Personality Diagnosis: A Hybrid Memory- and Model-Based Approach. In Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence. (pdf)

Si, L., & Jin, R. (2003). Flexible Mixture Model for Collaborative Filtering. In Proceedings of the International Conference on Machine Learning. (pdf)

Please provide the following information:

The accuracy of the algorithms. Do you think the values are reasonable? How can you justify the results by analyzing the advantages and disadvantages of the algorithms? How long does each algorithm take to complete the prediction? Discuss the efficiency of the algorithms. You will need to turn in your code and a report on your evaluation (2-4 pages).
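One possible way to measure accuracy and running time is sketched below. The handout does not prescribe a metric or a test split, so the choices here are assumptions: mean absolute error (MAE), a random hold-out of 20% of the explicit ratings, and a hypothetical `predict_fn(train, u, j)` interface. The baseline that predicts each user's mean rating is included only as a reference point for your own algorithms.

```python
import time
import numpy as np

def holdout_split(ratings, fraction=0.2, seed=0):
    """Hide a random fraction of the explicit ratings to use as test cases."""
    rng = np.random.default_rng(seed)
    users, movies = np.nonzero(ratings)
    n_test = int(round(fraction * users.size))
    picked = rng.choice(users.size, size=n_test, replace=False)
    train = ratings.copy()
    train[users[picked], movies[picked]] = 0   # mark as 'not rated'
    return train, list(zip(users[picked], movies[picked]))

def evaluate(predict_fn, ratings, fraction=0.2, seed=0):
    """Return (MAE, seconds spent predicting) for predict_fn(train, u, j)."""
    train, test_pairs = holdout_split(ratings, fraction, seed)
    start = time.perf_counter()
    errors = [abs(predict_fn(train, u, j) - ratings[u, j]) for u, j in test_pairs]
    elapsed = time.perf_counter() - start
    return float(np.mean(errors)), elapsed

def user_mean_baseline(train, u, j):
    """Baseline: predict the user's mean rating (3.0 if they have none left)."""
    row = train[u]
    return row[row > 0].mean() if (row > 0).any() else 3.0
```

Calling `evaluate(user_mean_baseline, ratings)` gives a baseline MAE and timing. An algorithm that cannot beat this baseline deserves a closer look in your discussion.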

SSH to a Purdue CS machine and run the following command to turn in your project.

turnin -v -c cs547 -p collab name_of_directory

where name_of_directory is the directory that you want to submit.