CS448: Introduction to Relational Database Systems Implementation


Spring 2006

Time and Place: 
                    MWF 9:30-10:20am  
                    Room: CS-G066

Instructor:
                   Walid G. Aref
                   Office: CS-124
                   Office Hours: By email appointments

                   Phone: Ext. 41997
                   E-mail: aref@cs.purdue.edu

 

Teaching Assistant:

                   Chris Mayfield
   
                Office Hours: By email appointments

                   E-mail: cmayfiel@cs.purdue.edu

PSOs:   
        Tuesday, 3:30 PM - 5:20 PM, REC 108
        Friday, 1:30 PM - 3:20 PM, REC 108

Course Description:
CS448 provides an introduction to relational database management system (DBMS) implementation (under-the-hood stuff). You will learn what it takes to build a database management system. There is a significant component of Java development. There will also be hands-on exercises using the Oracle DBMS.

The list of topics covered in this course includes:

1. Introduction to databases and systems: Concepts, architectures, and

fundamentals of the relational data model.

 

2. Review of relational algebra and SQL

 

3. Relational calculus

 

4. Disk storage, file organization, and indexing

 

5.  Tree-based Indexing, ISAM, B+-trees

 

6. Hash-based indexing, static and dynamic hashing, extendible hashing, linear hashing

 

7. External sorting techniques

 

8. Join processing techniques, block nested loops, indexed nested loops, sort merge join, hash join

 

9. Evaluation of other query operators, techniques for processing selects, projects, duplicate elimination, aggregate functions, group-by

 

10. Query evaluation pipelines, evaluation techniques, left-deep, right-deep, and bushy tree query evaluation pipelines

 

11. Query optimization, query operator cost estimation, selectivity estimation, plan enumeration, plans with interesting orders

 

12. Transaction management, ACID properties

 

13. Concurrency control techniques, two-phase locking protocol, deadlock detection and prevention, index/tree locking protocols, multiple-granularity locking, optimistic concurrency techniques

 

14. Recovery techniques, logging, WAL protocol, checkpointing, crash recovery.

 

15. Parallel and Distributed database systems, architectures, query processing techniques, replication, and concurrency control

 

16. Highlights of new directions

Most information regarding the course will be made available on
the web page:
http://www.cs.purdue.edu/homes/aref/Spring2006CS448/coursehome.html.
 

Text:
Required: Database Management Systems, (NOTE: Third
Edition), by Raghu Ramakrishnan and J. Gehrke. McGraw Hill, 2003, ISBN 0-07-246563-8.

References:

·       Oracle 9i: The Complete Reference, 10th Edition, by Kevin Loney & George Koch, Osborne McGraw-Hill, Inc., 2002, ISBN: 0-07-222521-1, or

·       ORACLE 9i Programming: A Primer, Rajshekhar Sunderraman, 2003.

Workload:
There will be around five project assignments. In each assignment, you will be asked to implement one component of a simple database management system. There will be several SQL programming assignments using Oracle. There will be around four homeworks, a midterm, and a final examinations. The final exam will be cumulative, but the emphasis will be on the material covered after the midterm examination.

The Project:
The projects are an important part of the course, and will involve a significant amount of Java programming. The first projects will be SQL programming assignments using Oracle. The remaining projects will be performed in teams of two. The purpose is for each team to build parts of a working single-user relational database management system. You will start almost from scratch -- a few basic components may be provided to you. By the end of the course, you will have built a simple DBMS.

In most of the assignments, you will be given Java class definitions with templates. You will need to actually implement the functions. Implementing the various interfaces involves several hundred lines of code. At the end of the project, you will understand the basic relational DBMS concepts because you have implemented them. Each of your assignments builds upon the code written in the previous assignments.

Unless extremely necessary, you should not change project partners for project assignments. You need to have the instructor’s prior approval to switch partners.

Grading:
The final grade will be based upon the following:
Programming and Homework Assignments 50%
Mid-term Exam 20%
Final 30%
(Extra-credit points will be given in class)

 

Course Policy:
Late submission will result in a 10% penalty for each day late. After three days, late submissions will not be accepted.

Students are strongly advised that any act of cheating will
result in a score of 0 for the entire assignment and repeat
offences will be reported to the Office of the Dean of Students
and will result in an automatic F grade. You are encouraged
to discuss problems and ideas but the final solution or code must be
your own.