CS 603 Project Topics

CS 603 Project Topics (update in progress)

Following is a list of possible project topics for CS 603 (Spring 1997). The list is in no particular order and it is not meant to be exhaustive; it provides only a guideline. Independent project ideas are encouraged. The references listed are starting points and further references are available after focusing on a specific topic.

Distributed O.S.

Design and implement user-level distributed O.S. functionality under UNIX. Subtopics include load sharing, distributed file system, distributed software shared memory, single system image user interface (shell), etc.
Extend the Xinu O.S. to support various cooperative resource sharing/management tasks. Steps toward a single-system image, process migration and load balancing are possible subtopics. One basic task is to provide a dynamic run-time environment (or design a version of Xinu which does not lump together O.S. and applications into a single executable image) which supports loading of processes, protection, etc.
- D. Comer. Operating system design: the XINU approach.

Load Balancing

Design, implement, and compare user level load balancing algorithms on the Xinu cluster. In the absence of process migration, one may study load placement strategies (i.e., once a process has been allocated to a particular host, it cannot be moved) using actual applications. How to measure load, what processes can be remotely executed are some of the subquestions that need to be addressed.
Design, implement, and compare user level load balancing algorithms on the Xinu cluster, however, using emulation of application behavior in place of executing actual applications. Process migration, thus, can also be emulated since application processes are imaginary. This allows for the a more intimate control of the test environment, however, one needs to be careful in modeling application behavior.
Model and analyze load balancing algorithms. In particular, use recent convergence analysis techniques (e.g., estimation of conductance and its use in rapidly mixing Markov chains) to analyze the convergence rate of load balancing strategies. Another approach is to use on-line analysis using the competitive ratio to reason about the behavior of various algorithms under different model assumptions.
- Azar et al. On-line load balancing. IEEE FOCS '92, pp. 218-225.
- M. Mihail. Conductance and convergence of Markov chains: a combinatorial treatment of expanders. IEEE FOCS '89, pp. 526-531.
- Phillips et al. Online load balancing and network flow. ACM STOC '93, pp. 402-411.
Confer also noncooperative method section.

Noncooperative Distributed Resource Management

Use a selfish or greedy application model to design, implement, and analyze distributed resource allocation problems. For example, how to perform distributed scheduling and load balancing in a heterogenous application environment where certain applications have higher intrinsic priorities than others. Use of pricing, competitive arbitration, and construction of computational markets is one approach. The overall goal is to induce a desirable global system state in spite of the greedy/selfish behavior of individual applications. These ideas can be extended to any form of distributed resource.

Cocchi et al. Pricing in computer networks: motivation, formulation, and example. IEEE/ACM Trans. Networking, vol. 1, no. 6, pp. 614-627, 1993.

Ferguson et al. Microeconomic algorithms for load balancing in distributed computer systems. IEEE ICDCS '88, pp. 491-499.

Korilis et al. The designer's perspective to noncooperative networks. IEEE INFOCOM '95, pp. 562-570.

K. Park. Self-organized multi-class QoS provision for ABR traffic in ATM networks. IEEE IPCCC '96, pp. 446-453.
Waldspurger et al. Spawn: a distributed computational economy. IEEE Trans. Software Eng., vol. 18, no. 2, pp. 103-117, 1992.

Fault-Tolerance Issues

Extend various concurrency control and distributed resource contention/consensus algorithms to operate correctly under Byzantine (i.e., malicious) faults. A general strategy for centralized algorithms has been discussed and is a suitable candidate for implementation and testing.
Apply hierarchical fault-tolerance methods to distributed application domains such as distributed name service (and other server selection problems), caching, and collaborative environments to tolerance k-sparse faults. Of utmost importance is the identification of application domain features which justify the use of the hierarchical method over a myriad of engineering solutions. After selecting a suitable domain, the remainder of the task will involve modeling and algorithm design.