CS542 -- Project Titles


 

1. Implement Centralized Two Phase Locking (2PL)

You will implement the centralized 2PL (Global 2PL) approach to Concurrency Control and be able to detect/resolve deadlocks.

You may use Centralized Control for global decisions.

You will generate at least three sites; you will maintain a lock table and a wait-for (dependency graph) at one site!

This means that all transactions arriving at different sites are sent to the centralized site. Queues for requesting /releasing locks will be maintained at this site. Processing must take place at the site where transaction is submitted. Locks will be released after waiting on the database at Centralized site. Updates for other sites will be sent (we assume fully replicated database) and they may arrive in different order. Your program must ensure that all updates at all sites are posted in the same order.

You may implement the algorithm of section 11.3.1 on page 318 of our text book (Ozsu) or a variation of it.
 
 

2. Local vs Global Transactions in Optimistic CC

More details are available in the note I handed out in class and a paper Resilient Concurrency Control in Distributed Database Systems present in the Chapter 10 of Prof. Bhargava's book (on reserve in Math library).

You will implement the optimistic approach to concurrency control and maintain the datastructures.
 


You will do local and global validation for each transaction.

You will send to a site, a mix of local transactions (that require and update at only this site) and global transaction
(that require an update on all copes).

You will generate at least three sites and maintain a list of logical objects and directory of copies as in Site Recovery in Replicated Distributed Database Systems description (Chapter 11 of Prof. Bhargava's book. Its on reserve in Math library).

You may assume that no failures occur in the system.

One of the problems you will have to figure out is how to keep the list of committed transactions against which a global transaction must validate fixed at each site without delaying the local transaction.
 
 

3. Replicated Copy Control: Correct Read-From Precedence

Transactions are validated and committed in the concurrency control server, while the database on different sites is posted by the I/O server (or access manager). There is an interval between the time of commit of a transaction
Ti, and the time of update by it on the database. During this interval, a new transaction can read an old value.

For concurrency control, the new transaction is serialized after the commitment of Ti, while for accessing the database it is before Ti.

The research and implementation question is:

a ) How to eliminate this interval?,

and

b ) How to establish the correct read-from relation for each pair of transactions for all servers on a site?

You should design a policy (different than total blocking of the new transaction), and implement it.

You will need to have at least two servers for each site. You need to generate at least two sites.

You will need datastructures that contain:

a) transaction id and time stamp of commit,
b) transaction id and time stamp of reading an object,
c) transaction id and time stamp of writing an object,
d) read-from relation for each transaction w.r.t. all other transactions (conflict graph).

You will implement a protocol that will produce the same read-from relation for all transactions on each sites.

You may be able to measure the delays caused by your policy.
 
 

4. Adaptability

More information is available in the paper ``A Model for Adaptable Transaction Processing'', in IEEE Transaction on Knowledge & Data Engineering, December 1989 (Given as handout in the class).

You will set up a datastructure to capture the read/write set and time stamps for transactions that arrive on the site. This information will be used by three types of concurrency control;
 


You will output the serializable history as accepted by each of the above protocols. In addition you should be able to switch from one protocol to another. Certain conditions must be satisfied that are given in the technical report and need further study.

For distributed concurrency control you may want to use the decentralized protocol of Rosenkrantz, Stearn and Lewis (ACM Transactions on Database Systems, June 1988).

You will implement the

a) would-wait protocol,
b) wait-die protocol,

and should be able to switch among them.

If you like, you may choose to adapt from distributed control to centralized control protocol or vice-versa. (This can be done for concurrency control as well as commit protocols.)
(Group project is possible.)
 
 

5. Semantics

The objective of this project is to identify semantics useful for concurrency control and/or recovery. The semantics will be provided by the user as a part of each transaction (e.g., breakpoints suggested by Lynch in ACM Transactions in Database Systems, December 1983, types of locks, commutativity of operation, user defined dependencies among transactions, types of objects and operations, etc.).

The concurrency control protocol will use this semantics to serialize transactions and eliminate partial executions.

You will need to implement a concurrency control method and change the transaction generator to add the semantics.
 
 

6. Three Phase Commit Protocol

More details are available in the Ph.D. thesis of Dale Skeen Chapter 5, Section 3, on page 102 of his paper (available from Prof. Bhargava).

A mininum of three sites should be generated. Each site will contain datastructures:


Your program will initiate rounds of messages to terminate or commit a transaction and update the data structures. After two or three rounds, a decision will be made for the transaction. You will initiate transactions at each site. You will also simulate failures of different types.

You do not have to update the database.
 
 

7a. Site Failure in Partially Replicated Database

More details are available in the paper "Site Recovery in Replicated Distributed Database Systems", 6th IEEE Distributed Computing Systems Conference, May 1986 (also Chapter 11 of Prof. Bhargava's book on reserve in Math library).

You will implement the protocol ``Read one copy, write all available copies.''

A minimum of three sites should be generated. Each site will contain datastructures for:


Transactions will arrive at each site. A copy will be read from this site or another site that is up and contains a copy. All copies at up sites will be update. The up/down status is determined by session vector.

You will simulate the up/down status of sites. You may assume initially that at least one site is always up. You will use a control transaction to announce the failure. You will recover a site by updating the datastructures correctly,and will announce the recovery via another control transaction.

Initially you do not have to worry about recovering the database.

(Note that reading or writing of database will be simulated.
The datastructures will be updated in reality.)

7b. Network Partitions (Two Partitions)

Several protocols are given in Chapter 1 of Prof. Bhargava's book (on reserver in Math library) to deal with this problem.

You would need to implement an optimistic or a pessimistic approach.

For the pessimistic approach, you may read the technical report "Maintaining Mutual Consistency of Replicated Files in Partitioned Distributed Database System", which is available from Prof. Bhargava.

You will need to make several datastructures:

You will generate at least three sites. You will maintain a set of logical files and a directory of
physical copies (it could be the same as site name vector). You will simulate two partitions.

(Note: You will not do any actual reading or writing of files for any transaction.)

You will implement the following procedures:


These procedures will allow you to do all the necessary work to deal with transaction processing during network partitions.
 
 

8. Fail Locks and Database Recovery (Multiple Failures)

(See notes for site failure in partially replicated database.)

Instead of recovering the site, you will concentrate on ensuring that the database is fully recovered at the failed site. The operational sites will maintain fail-locks (for each failed site) on copies that are updated. The recovering site will collect fail-lock table from each site and mark the corresponding data-items as unavailable. It will allow processing of transactions on other data items. If any site is down, you will maintain the fail-lock table on the
recovered site if any other site is down.

The unavailable data-items will become available by one of the two approaches:


We may be able to measure by simulating the time it takes to make all unavailable copies up-to-date.