CS240 Lab08

Movie Rating Statistics

Overview:

In this lab, you'll continue practicing 2D arrays, pointers, malloc, etc., and you'll get a brief introduction to reading from a file. We'd like you to write a small program to process a raw data file from a movie rating website. The results should be shown in tables printed to the screen.

Lab Exercise:

A movie rating website has saved its data in a file containing movie rating information from its registered users. To develop this program, we can use a sample data file which contains only 20 ratings. You can download the data here: data.txt. In this file, each row is a rating of a movie by a user. The first number in a row is the user ID, the second number is the movie ID, and the last number is the rating, which takes a value from 1 to 5. For example, the first line of the data is:

1 2 4

which means that user 1 gave movie 2 a score of 4.
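
A record in this format can be read with a single fscanf call. The sketch below is only an illustration and is not the skeleton code: the helper names read_rating and count_ratings are hypothetical, and it assumes every line holds exactly three integers.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not part of the skeleton code): read one
 * "user movie score" record from fp.
 * Returns 1 if all three integers were read, 0 otherwise
 * (end of file or a malformed line). */
int read_rating(FILE *fp, int *user, int *movie, int *score)
{
    assert(fp != NULL);

    /* fscanf returns the number of items it converted. */
    return fscanf(fp, "%d %d %d", user, movie, score) == 3;
}

/* Count the records in an open file, then rewind it so the file
 * can be read a second time to fill the rating table. */
int count_ratings(FILE *fp)
{
    int user, movie, score;
    int n = 0;

    while (read_rating(fp, &user, &movie, &score)) {
        n++;
    }
    rewind(fp);
    return n;
}
```

Checking the return value of fscanf, rather than testing for end of file separately, stops the loop cleanly on both end of file and a malformed line.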

Now the website asks us to write a program to show the ratings in a table. Specifically, each row of the table shows one user's ratings of all movies, and each column shows all users' ratings of one movie. If a user didn't rate a movie, the score should be shown as 0. The following table shows the desired result for the given data file:

The Movie Rating Table:
U\M 1 2 3 4 5 6
1 0 4 0 0 0 0
2 0 5 0 3 0 0
3 0 0 3 0 0 3
4 0 5 0 4 0 0
5 0 1 4 4 2 0
6 2 3 0 0 0 2
7 5 0 5 3 2 2
8 0 0 0 4 0 0

The first row shows the movie IDs and the first column shows the user IDs.
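
One possible way to print such a table is sketched below. It assumes the table is stored as one contiguous block, row by row, and that IDs start at 1 and are contiguous. The function name print_rating_table and the single-space column separator are assumptions, so check the spacing against the desired output before relying on it.

```c
#include <assert.h>
#include <stdio.h>

/* Print a users x movies rating table stored as one contiguous
 * block, row by row. The single-space column separator is an
 * assumption; match the desired output provided with the lab. */
void print_rating_table(FILE *out, const int *table, int users, int movies)
{
    int u, m;

    assert(out != NULL && table != NULL);

    /* Header row: the corner label followed by the movie IDs. */
    fprintf(out, "U\\M");
    for (m = 1; m <= movies; m++) {
        fprintf(out, " %d", m);
    }
    fprintf(out, "\n");

    /* One row per user: the user ID, then that user's ratings. */
    for (u = 1; u <= users; u++) {
        fprintf(out, "%d", u);
        for (m = 1; m <= movies; m++) {
            /* Pointer arithmetic instead of table[u-1][m-1]. */
            fprintf(out, " %d", *(table + (u - 1) * movies + (m - 1)));
        }
        fprintf(out, "\n");
    }
}
```

Passing stdout as the first argument prints the table to the screen.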

In addition, they ask us to compute some statistics for them. Namely, they want to know the number of ratings and the average score of each movie. When calculating the average score, we should ignore all 0 scores (a 0 denotes that a user did not rate the movie). The statistical results should also be shown in tables. Here is the desired result for the given data set:

The Movie Rating Count:

Movie 1 2 3 4 5 6
Count 2 5 3 5 2 3

The Movie Rating Score Stat.:

Movie 1 2 3 4 5 6
Score 3.5 3.6 4.0 3.6 2.0 2.3

Note that the average score is printed with one digit after the decimal point. You can see the entire desired output here.

Finally, we have an additional requirement: you cannot use the array notation [] when you use arrays (i.e. you can only operate on pointers). This is the same requirement as in the last lab.

Tips and Hints:

1. In this lab, you need to read the ratings from a file, so you can use the function fscanf(fp,"%d" ...) to read the IDs and ratings. This code is provided to you in the skeleton code.

2. You can use a 2D array to represent the rating table. Each row represents the ratings of all movies by a user, and each column represents the ratings of a movie by all users (just the same as the result table itself). However, the size of the table is not provided by the data file, so you have to work it out from the file yourself. To do so, you may use arrays to store the user IDs and movie IDs separately, and use the maximum of each as the number of rows and columns of the rating table. You can assume that the user IDs and movie IDs are contiguous.

3. Since you cannot use [] with arrays, you may first define a pointer equal to NULL and then use the function malloc() to allocate the proper amount of memory for it. To access an element of the array, you may use, for a 1D array for instance, *(pointer + i).

4. Don't forget to free the memory you have allocated at the end of your program.
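
Tips 2 to 4 can be combined as in the sketch below. It is one possible layout, not the required one: the helper names max_id and build_table are hypothetical, the table is allocated as a single contiguous block rather than as an array of row pointers, and IDs are assumed to be positive.

```c
#include <assert.h>
#include <stdlib.h>

/* Largest value among n ints, read through a pointer (no []). */
int max_id(const int *ids, int n)
{
    int i;
    int best;

    assert(ids != NULL && n > 0);
    best = *ids;
    for (i = 1; i < n; i++) {
        if (*(ids + i) > best) {
            best = *(ids + i);
        }
    }
    return best;
}

/* Build a users x movies table from n ratings held in three
 * parallel arrays. The table is one contiguous block stored row by
 * row, so the cell for (user, movie) is at offset
 * (user - 1) * movies + (movie - 1). Assumes all IDs are >= 1.
 * Returns NULL if malloc fails; the caller must free() the result. */
int *build_table(const int *users, const int *movies, const int *scores,
                 int n, int *n_users, int *n_movies)
{
    int *table = NULL;
    size_t cells, k;
    int i;

    *n_users = max_id(users, n);
    *n_movies = max_id(movies, n);
    cells = (size_t)(*n_users) * (size_t)(*n_movies);

    table = malloc(cells * sizeof *table);
    if (table == NULL) {
        return NULL;
    }

    /* malloc does not initialize memory: mark every cell "not rated". */
    for (k = 0; k < cells; k++) {
        *(table + k) = 0;
    }

    /* Store each rating in its (user, movie) cell. */
    for (i = 0; i < n; i++) {
        int row = *(users + i) - 1;
        int col = *(movies + i) - 1;
        *(table + row * (*n_movies) + col) = *(scores + i);
    }
    return table;
}
```

With a single block, one call to free(table) at the end of the program releases the whole table. If you allocate each row separately instead, every row needs its own free() before the array of row pointers is freed.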

Turnin:

When you are satisfied that your program works correctly (or you run out of time), please do the following.

  1. Create a directory named lab08
  2. Copy all program files (*.c and *.h) and your Makefile to the directory created in step 1.
  3. Go to the parent directory of lab08 and type the following command into the terminal:
    turnin -c cs240=XXXX -p lab08 lab08

    where XXXX represents your lab section number.

    The lab sections are as follows:

    Section  Time                  TA
    0201     Thursday 15:30-17:20  Dan Zhang
    0301     Friday 09:30-11:20    Suli Xi
    0401     Friday 13:30-15:20    Youhan Fang
    0501     Thursday 09:30-11:20  J. C. Chin

    Make sure turnin reports that your project was submitted for grading. You can check the files you have submitted by running the following command:

    turnin -c cs240=XXXX -v -p lab08
  4. Submit early and often, so that you are not caught trying to submit after turnin has been turned off.

Grading Breakdown:

1 point  A working Makefile is provided.
7 points The rating table is correct.
3 points The number of ratings is calculated correctly.
3 points The average scores are calculated correctly.
3 points The results are displayed correctly and clearly.
3 points The memory is freed correctly.