CS240 Lab08
Movie Rating Statistics
Overview:
In this lab, you'll continue practicing 2D arrays, pointers, malloc, and so on, and you'll get a first look at reading from a file. We'd like you to write a small program that processes a raw data file from a movie rating website. The results should be printed to the screen as tables.
Lab Exercise:
A movie rating website has saved its data in a file containing movie rating information from its registered users. To develop this program, we can use a sample data file which contains only 20 ratings. You can download the data here: data.txt. In this file, each row is one rating of a movie by a user. The first number in a row is the user ID, the second number is the movie ID, and the last number is the rating, which takes a value from 1 to 5. For example, the first line of the data is:
1 2 4
which means user 1 gave movie 2 a score of 4. Now the website asks us to write a program to show the ratings in a table. Specifically, each row of the table shows the ratings of all movies by one user, and each column shows the ratings of one movie by all users. If a user didn't rate a movie, the score should be shown as 0. The following table shows the desired result for the given data file:
The Movie Rating Table:
| U\M | 1 | 2 | 3 | 4 | 5 | 6 |
| 1 | 0 | 4 | 0 | 0 | 0 | 0 |
| 2 | 0 | 5 | 0 | 3 | 0 | 0 |
| 3 | 0 | 0 | 3 | 0 | 0 | 3 |
| 4 | 0 | 5 | 0 | 4 | 0 | 0 |
| 5 | 0 | 1 | 4 | 4 | 2 | 0 |
| 6 | 2 | 3 | 0 | 0 | 0 | 2 |
| 7 | 5 | 0 | 5 | 3 | 2 | 2 |
| 8 | 0 | 0 | 0 | 4 | 0 | 0 |
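As a rough sketch of how such a table could be filled and printed under the pointer-only rule stated later in this handout, assume the table is stored as one flat block of n_users * n_movies ints in row-major order, with every cell already set to 0. The helper names (cell, set_rating, print_rating_table) are made up for illustration and are not taken from the skeleton code.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: address of the cell for (user, movie), both 1-based,
   in a flat row-major table that has n_movies columns. */
static int *cell(int *table, int n_movies, int user, int movie)
{
    return table + (user - 1) * n_movies + (movie - 1);
}

/* Store one rating. Cells that are never written keep their initial 0,
   which the table uses to mean "not rated". */
void set_rating(int *table, int n_movies, int user, int movie, int score)
{
    assert(score >= 1 && score <= 5);
    *cell(table, n_movies, user, movie) = score;
}

/* Print the table: one row per user, one column per movie. */
void print_rating_table(const int *table, int n_users, int n_movies)
{
    printf("The Movie Rating Table:\n| U\\M |");
    for (int m = 1; m <= n_movies; m++)
        printf(" %d |", m);
    printf("\n");

    for (int u = 0; u < n_users; u++) {
        printf("| %d |", u + 1);
        for (int m = 0; m < n_movies; m++)
            printf(" %d |", *(table + u * n_movies + m));
        printf("\n");
    }
}
```

For the first line of the sample data, the call would be set_rating(table, 6, 1, 2, 4), which writes 4 into row 1, column 2.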
In addition, they ask us to compute some statistics for them. Namely, they want to know the number of ratings and the average score of each movie. When calculating the average score, we should ignore all 0 scores (a 0 means the user did not rate the movie). The statistics should also be shown in tables. Here is the desired result for the given data set:
The Movie Rating Count:
| Movie | 1 | 2 | 3 | 4 | 5 | 6 |
| Count | 2 | 5 | 3 | 5 | 2 | 3 |
The Movie Rating Score Stat.:
| Movie | 1 | 2 | 3 | 4 | 5 | 6 |
| Score | 3.5 | 3.6 | 4.0 | 3.6 | 2.0 | 2.3 |
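One possible way to compute these two statistics is sketched below. It assumes the same flat, row-major table of n_users * n_movies ints with 0 meaning "not rated", and it uses only pointer arithmetic. The function names are hypothetical, and returning 0.0 for a movie with no ratings is an assumption, since the handout does not say what to show in that case.

```c
#include <assert.h>
#include <stdio.h>

/* Number of non-zero ratings in one column; movie is a 0-based column index. */
int rating_count(const int *table, int n_users, int n_movies, int movie)
{
    assert(movie >= 0 && movie < n_movies);
    int count = 0;
    for (int u = 0; u < n_users; u++)
        if (*(table + u * n_movies + movie) != 0)
            count++;
    return count;
}

/* Average of the non-zero ratings in one column.
   Assumption: a movie nobody rated is reported as 0.0. */
double rating_average(const int *table, int n_users, int n_movies, int movie)
{
    int count = rating_count(table, n_users, n_movies, movie);
    if (count == 0)
        return 0.0;

    int sum = 0;
    for (int u = 0; u < n_users; u++)
        sum += *(table + u * n_movies + movie); /* zeros add nothing */
    return (double)sum / count;
}

/* Print both statistics tables, with scores to one decimal place. */
void print_rating_stats(const int *table, int n_users, int n_movies)
{
    printf("The Movie Rating Count:\n| Movie |");
    for (int m = 0; m < n_movies; m++)
        printf(" %d |", m + 1);
    printf("\n| Count |");
    for (int m = 0; m < n_movies; m++)
        printf(" %d |", rating_count(table, n_users, n_movies, m));

    printf("\nThe Movie Rating Score Stat.:\n| Movie |");
    for (int m = 0; m < n_movies; m++)
        printf(" %d |", m + 1);
    printf("\n| Score |");
    for (int m = 0; m < n_movies; m++)
        printf(" %.1f |", rating_average(table, n_users, n_movies, m));
    printf("\n");
}
```

Dividing by the count of non-zero cells rather than by n_users is what keeps unrated entries out of the average: movie 1 in the sample has ratings 2 and 5, so its score is 7 / 2 = 3.5.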
Finally, we have an additional requirement: you cannot use the array notation [] when you use arrays (i.e. you can only operate on pointers). This is the same requirement as in the last lab.
Tips and Hints:
1. In this lab, you need to read the ratings from a file. So you can use the function fscanf(fp,"%d" ...) to read the IDs and ratings. This code is provided to you in the skeleton code.
2. You can use a 2D array to represent the rating table. Each row represents the ratings of all movies by one user, and each column represents the ratings of one movie by all users (just the same as the result table itself). The size of the table is not given in the data file, so you must first work it out from the file yourself. To do so, you may use arrays to store the user IDs and movie IDs separately, and take the maximum of each as the size of the rating table. You can assume that the user IDs and movie IDs you are given are contiguous.
3. Since you cannot use [] with arrays, you may first define a pointer equal to NULL and then use the function malloc() to allocate the proper amount of memory for it. To access an element of the array, you may use, for a 1D array for instance, *(pointer + i).
4. Don't forget to free the memory you have allocated at the end of your program.
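The hints above can be put together roughly as follows. This is only a sketch under some assumptions: instead of storing the IDs in separate arrays as hint 2 suggests, it reads the file twice (once to find the largest user ID and movie ID, once to fill the table), and it stores the table as one flat malloc'ed block rather than an array of row pointers. The function names scan_dimensions and load_rating_table are hypothetical and do not come from the skeleton code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* First pass: find the largest user ID and movie ID in the file.
   Returns the number of complete "user movie score" rows read. */
int scan_dimensions(FILE *fp, int *n_users, int *n_movies)
{
    assert(fp != NULL && n_users != NULL && n_movies != NULL);
    int user, movie, score, rows = 0;
    *n_users = 0;
    *n_movies = 0;
    while (fscanf(fp, "%d %d %d", &user, &movie, &score) == 3) {
        if (user > *n_users)
            *n_users = user;
        if (movie > *n_movies)
            *n_movies = movie;
        rows++;
    }
    return rows;
}

/* Second pass: allocate a zeroed n_users x n_movies table and fill it.
   Returns NULL if the file cannot be opened, holds no valid ratings,
   or memory runs out. The caller must free() the result. */
int *load_rating_table(const char *path, int *n_users, int *n_movies)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return NULL;

    int *table = NULL;
    if (scan_dimensions(fp, n_users, n_movies) > 0
            && *n_users > 0 && *n_movies > 0) {
        size_t cols = (size_t)(*n_movies);
        size_t cells = (size_t)(*n_users) * cols;
        table = malloc(cells * sizeof *table);
        if (table != NULL) {
            for (size_t i = 0; i < cells; i++)
                *(table + i) = 0;           /* 0 means "not rated" */

            int user, movie, score;
            rewind(fp);                     /* read the file again from the start */
            while (fscanf(fp, "%d %d %d", &user, &movie, &score) == 3)
                if (user >= 1 && movie >= 1) /* skip IDs that would fall outside the table */
                    *(table + (size_t)(user - 1) * cols + (size_t)(movie - 1)) = score;
        }
    }
    fclose(fp);
    return table;
}
```

A caller would check the returned pointer against NULL, use the table, and finish with free(table), which covers hint 4. Reading the file twice trades a little extra I/O for not having to guess how many ratings to reserve space for in advance.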
When you are satisfied that your program works correctly (or you run out of time), please do the following.
turnin -c cs240=XXXX -p lab08 lab08
where XXXX represents your lab section number.
The lab sections are as follows:
| Section | Time | TA |
| 0201 | Thursday 15:30-17:20 | Dan Zhang |
| 0301 | Friday 09:30-11:20 | Suli Xi |
| 0401 | Friday 13:30-15:20 | Youhan Fang |
| 0501 | Thursday 09:30-11:20 | J. C. Chin |
Make sure turnin reports that your project was submitted for grading. You can check the files you have submitted by running the following command:
turnin -c cs240=XXXX -v -p lab08
| 1 point | A working Makefile is provided. |
| 7 points | The rating table is correct. |
| 3 points | The number of ratings is calculated correctly. |
| 3 points | The average scores are calculated correctly. |
| 3 points | The results are displayed correctly and clearly. |
| 3 points | The memory is freed correctly. |