CS 44800: Introduction to Relational Database Management Systems

Assignment 1: Relational Model and Algebra

Due 11:59pmEDT Tuesday, 7 September 2021

Update 9/2 12:23 - 1: fixed attribute names in relations.

Please turn in a PDF through Gradescope. You'll need to access Gradescope through Brightspace the first time (to register you for the course in Gradescope.) Gradescope is pretty self-explanatory, ITaP provides these instructions. Make sure that you mark the start/end of each question in Gradescope. Assignments will be graded based on what you mark as the start/end of each question. Please typeset your answers. (You won't be doing handwritten responses for a Google Project work request, why would you do so for class?)

Note: Exercises with book numbers are adapted from Silberschatz, Korth & Sudarshan; while you are encouraged to look there for further information (and may need to for some details) please answer the question as asked below (they may not be exactly the same.)

Question 1: Query Execution

Given the following relational database instance:

Courses

CourseID InstructorID Name
CS448 7 Introduction to Relational Database Systems
CS580 14 Algorithm Design, Analysis, And Implementation

Enrollment

Student_id CourseID Grade SectionNum GroupID
1 CS448 A 2 3
4 CS448 A 1 2
5 CS448 B 1 1
6 CS448 A 1 1
9 CS448 B 2 3
10 CS448 A 2 4
11 CS448 C 2 4
12 CS448 A 2 3
13 CS448 A 1 1
2 CS580 A 1 1
3 CS580 A 1 1
4 CS580 B 1 2
6 CS580 A 1 2
8 CS580 A 1 2
10 CS580 A 1 1
12 CS580 B 2 3
15 CS580 A 2 3

People

ID Name Nationality
1 Jon Snow USA
2 Jame Bond UK
3 Winston Churchill UK
4 Luke Skywalker USA
5 Jackie Chan China
6 Richard White USA
7 Bruce Lee USA
8 Hugo Lafayette France
9 Ben Kenobi USA
10 Harry Porter UK
11 Son Goku Japan
12 Wonder Woman UK
13 Sun Tzu China
14 Tony Stark USA
15 Leia Organa USA

For each of the following queries, show the output when run on the above tables. Some may not be legal queries on the above relations, if not, explain why.

In addition to giving the results, provide an English description, SQL, and relational algebra (whichever two are not given) for each query.

  1. SELECT P.Name, C.Name, E.Num
    FROM People as P, Courses as C, (SELECT CourseID, count(*) as Num FROM Enrollment GROUP BY CourseID) as E
    WHERE P.ID in (select C.InstructorID ) and C.InstructorID=P.ID and C.ID=E.CourseID;
  2. For each instructor, list the instructors’ name, ID and nationality.
  3. ΠCourseID,Grade σStudent_id=5 ( Enrollment ⋈ Courses )
  4. ΠCourseID,Grade σStudent_id=5 ( Enrollment ) ⋈ Courses
  5. ΠCourseID,Grade σStudent_id=5 ( Enrollment × Courses )
  6. List each country’s name with the number of students from that country.
  7. SELECT Enrollment.GroupID, count(*)
    FROM Enrollment
    WHERE Enrollment.CourseID = ‘CS448’
    GROUP BY Enrollment.GroupID having count(*)>2;
  8. The names of all people who are taking a course with Richard White. Hint: You will likely need to use the rename operator ρ for the relational algebra version.

While you are encouraged to put these tables into a relational database (e.g., sqlite or mysql) and run the queries to check your answers, you should be able to figure figure out the anwers on your own. You may see similar questions on an exam (although with smaller tables), and you will not have the ability to use a machine to generate answers.

For Questions 2 through 6, use the following relations (schema):

Student
(sid: integer, sname: string)
Course
(cid: string, iid: integer, cname: string)
Instructor
(iid: integer, iname: string)
Grades
(sid: integer, cid: string, grade: string)

Question 2: Relational Algebra

Write relational algebra for the following queries. If you need to make any particular assumptions, please list them.

  1. Find the name of the students who have registered in the course with cid CS44800.
  2. Find the ids of the courses taught by at least two different instructors.
  3. Find the ids of the students who never received a grade F.
  4. Give a list of all people (Instroctors and Students), with their ID and name.

Question 3: Relational Model Attribute Domains

In the above schema, we list data types (domain) for each attribute. Do you need to know the domain to write the queries in the preceding question? Explain why or why not.

Question 4: Union Semantics

Union (X⋃Y) requires that the left (X) and right (Y) arguments have the same attributes with the same domains.

  1. Course ⋃ Grades is not legal relational algebra, briefly explain why.
  2. Is there a way that I could do a union that includes some of what is in Course and some of what is in Grades? Would it make sense? (Note: You may want to look back at 2.4 and see if you are making any assumptions that you may not have stated.)

Question 5: Relational Model Keys

Each of the relations above has a primary key (underlined). For Course and Grades, the primary key consists of two attributes.

  1. For the Student and Course relations, give another key, or explain why there is no other key.
  2. For the Instructor and Grades relations, give another candidate key, or explain why there is no other candidate key.
  3. If you did not know the keys, or the keys were different, would your answers to Question 2 have changed? Explain why or why not.

Question 6: Θ-Join

Suppose we wanted to find courses that do not have an instructor. Would the query Course ⋈Course.iid≠Instructor.iid Instructor do what we want? Explain why or why not.

Question 7: Null Values

Why do we need null values in a relational database? Create an example use case where nulls are appropriate, and a second example where there are nulls, but there would be a better way to set up the database so you didn't need to have nulls.

Question 8: Relational Operations

Why is intersection (⋂) not a basic relational operator? Explain with an example.