Overview
- Take homework 2 (GCA) and build and test it on a larger scale. Explore how increasing the number of shards and servers impacts performance. Possibly compare it with a map-reduce implementation and explore the differences in performance.
Web 2.0
- Implement adaptive tracing in Mace. The idea would be to apply the AjaxScope concept of sharing the logging load across nodes, so that the partial logs can be pieced together to understand the performance of a distributed system. Drill-down logging may not be applicable, but distributed logging should be.
- Implement MashupOS features for Firefox. Honestly, that's almost certainly too much work, so you would want to take a piece of it, and figure out how to incorporate it.
- Use the Swift toolkit to build systems. Evaluate their performance, and examine the generated code for exploitable problems or for how much cruft it adds. Ask yourself: is this similar to the code a careful hand optimization would have produced?
- The Google Web Toolkit is similarly available. I'm not sure what specific project suggestions to make, but you could imagine projects based on it.
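To make the adaptive tracing idea above a bit more concrete, here is a minimal sketch of one way the logging load could be shared: each node hashes (node, log point) pairs and records only the points that fall within its share, and the per-node assignments are then unioned to see what the system covers as a whole. This is only an illustration under my own assumptions. The names assigned_points and merge_coverage and the share parameter are hypothetical, and nothing here is Mace or AjaxScope code. Python is used for brevity.

```python
import hashlib


def assigned_points(node_id, points, share):
    """Return the subset of log points this node should record.

    Each (node, point) pair is hashed to a fraction in [0, 1); the node
    records the point when that fraction is below `share`. The choice is
    deterministic, so nodes need no coordination to pick their subsets.
    (Hypothetical helper, not part of any real tracing library.)
    """
    chosen = set()
    for point in points:
        digest = hashlib.sha256(f"{node_id}:{point}".encode()).digest()
        fraction = int.from_bytes(digest[:8], "big") / 2**64
        if fraction < share:
            chosen.add(point)
    return chosen


def merge_coverage(assignments):
    """Union the per-node assignments to see which points are logged anywhere."""
    covered = set()
    for node_points in assignments.values():
        covered |= node_points
    return covered
```

With a small share, each node pays only a fraction of the logging cost. Whether the union actually covers every point depends on the share and the number of nodes, which is one of the things a project could measure.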
Data Processing
- Perform an in-depth evaluation of map-reduce (a la Hadoop). In particular, consider the question of how map-reduce scales as resources are added.
- Implement map-reduce in Mace, and compare performance to Hadoop.
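As a reminder of the programming model behind both items above, here is a minimal single-process sketch of map-reduce: a map phase that emits key/value pairs, a grouping step standing in for the shuffle, and a reduce phase per key. It is a toy under stated assumptions, not Hadoop's API and not a Mace design. The names map_reduce, word_count_mapper, and sum_reducer are made up for this example.

```python
from collections import defaultdict


def map_reduce(inputs, mapper, reducer):
    """Run a toy, in-memory map-reduce over `inputs`.

    mapper(record) yields (key, value) pairs; reducer(key, values) returns
    the final value for that key. Grouping by key plays the role of the
    shuffle that a distributed implementation performs over the network.
    """
    groups = defaultdict(list)
    for record in inputs:
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}


def word_count_mapper(line):
    # Emit (word, 1) for every whitespace-separated word, case-folded.
    for word in line.split():
        yield word.lower(), 1


def sum_reducer(key, values):
    # Total the counts emitted for one word.
    return sum(values)


counts = map_reduce(["a b a", "B c"], word_count_mapper, sum_reducer)
# counts == {"a": 2, "b": 2, "c": 1}
```

A distributed version splits the inputs across mappers and partitions the keys across reducers, and that is where the scaling questions above come from.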
Programming models for distributed systems
- Implement "continuations" in Mace -- i.e. generic continuation objects passed into downcalls with matching upcalls. The goal of this project would be to support a more thread-like programming style within Mace services.
- Implement a high-performance distributed system in Haskell, and compare its performance to a more traditional systems implementation.
- Implement and evaluate a version of the Chubby lock service, and consider its strategies for dealing with failing nodes. Then consider how you would implement a service for fine-grained locking. Do you reach the same conclusions?
- Model check your implementation of homework 2. What techniques are needed to analyze the implementation deeply enough to establish its liveness and safety properties?
- Build the GCA (or any distributed system) using SEDA and/or Capriccio. Compare and contrast (and evaluate) the differences in programming techniques, performance, ease of use, etc.
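For the "continuations" item at the top of this list, here is a minimal sketch of what passing a continuation into a downcall and receiving it back in the matching upcall might look like. It is written in Python for brevity and is not Mace syntax. The Transport class, its request and run methods, and the upper-casing "reply" are all hypothetical stand-ins for a lower-layer service.

```python
from collections import deque


class Transport:
    """Hypothetical lower-layer service that answers requests asynchronously."""

    def __init__(self):
        self.pending = deque()

    def request(self, payload, continuation):
        # Downcall: returns immediately; the continuation rides along
        # with the request instead of being stored in service state.
        self.pending.append((payload, continuation))

    def run(self):
        # Event loop: each completed request triggers the matching upcall
        # by invoking the continuation with the (fake) reply.
        while self.pending:
            payload, continuation = self.pending.popleft()
            continuation(payload.upper())


def lookup_then_store(transport, key, results):
    """Two dependent requests written in order, thread-style."""

    def on_lookup(reply):
        # Second step runs only once the first reply has arrived.
        transport.request(f"store:{reply}", on_store)

    def on_store(reply):
        results.append(reply)

    transport.request(f"lookup:{key}", on_lookup)
```

The point of the exercise is that the two steps of lookup_then_store read top to bottom, even though each one runs from the event loop rather than from a blocked thread.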
Updated: August 24, 2008
Copyright 2008, Charles Killian