Project Information

Class Project Requirements and Deadlines

Team Formation and Project Idea (1 page max, hardcopy submitted in class)

The names of project team members (2-3 persons per team).
The project title (either from the list below or, if you propose a new project, similar in style).
A brief description of the project (either from the list below or, if you propose a new project, similar in style).

Project Proposal (2-3 pages max, hardcopy submitted in class)

Project title
A paragraph (or bullet points) on what you will do to carry out the project: (e.g., we will implement a distributed data collection backend for the mobile phone data collection topic from the Project list; our backend will have the following characteristics: XXX, YYY, ZZZ). You should list the 2 or 3 hardest challenges you expect to face (see end of this page) and tell us which book chapters/lectures you hope to draw on in developing your solution. You should also accompany these actions with an estimated date of completion (time schedule for the project). You can evolve this plan later if needed; the one you file is an initial concept.
A paragraph (or bullet points again) explaining how you will demonstrate the project (e.g., on completion, we will have a visual demo and a poster. The demo will show....) This can evolve over time too.
In your team, who will do what? How often will you meet? How many hours per week will you work on the effort? How many class hours (and/or job hours) are you undertaking this semester? Can you really spend that number of hours per week? If you plan to have a team larger than two, it is most important you know exactly who will do what and also who is in charge of the overall effort.

Interim Report (2-3 pages max, hardcopy submitted in class)

A paragraph (or bullet points) on what you are doing to carry out the project.
Describe in bullet form: 1) what has been achieved thus far, 2) what is in progress and how it is going, and 3) what is not done at all.
You should submit the old proposal alongside this interim report and in the interim report justify any changes made to the previous proposal.

Final Report (10 pages maximum, hardcopy submitted in-person to instructor during demo)

A paragraph (or bullet points) on what activities you set out for this project.
Describe in bullet form: 1) what activities were achieved and 2) what was left unfinished.
One or more pictures with text explaining the architecture of your system and how it works.
Evaluation section: describe what you did to evaluate your system (e.g., we built a tool to trigger many client requests to test the scalability, or we built a script to trigger failures to test the fault tolerance of our system, etc.) and provide any results from experiments you performed.
A demo that shows the full capabilities of the system. During the demo you'll want to explain how you departed from the original plan if what you end up doing isn't the same.
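
As one illustration of the evaluation item above, a tool that triggers many client requests might look like the following minimal Python sketch. All names here (fake_request, run_load, summarize) are hypothetical, and fake_request is only a placeholder that sleeps briefly; you would replace it with a real call into your own system.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_request(i):
    """Hypothetical stand-in for one client request to your system."""
    time.sleep(0.001)  # pretend the server took about 1 ms
    return i


def run_load(request_fn, num_requests, concurrency):
    """Issue num_requests calls using `concurrency` worker threads.

    Returns (per-request latencies in seconds, total wall-clock seconds).
    """
    def timed(i):
        start = time.perf_counter()
        request_fn(i)
        return time.perf_counter() - start

    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(timed, range(num_requests)))
    wall = time.perf_counter() - wall_start
    return latencies, wall


def summarize(latencies, wall):
    """Reduce raw latencies to throughput and two latency percentiles."""
    ordered = sorted(latencies)
    n = len(ordered)
    return {
        "requests": n,
        "throughput_rps": n / wall,
        "p50_s": ordered[n // 2],
        "p99_s": ordered[min(n - 1, int(n * 0.99))],
    }


if __name__ == "__main__":
    lat, wall = run_load(fake_request, num_requests=200, concurrency=10)
    print(summarize(lat, wall))
```

Repeating the run at several concurrency levels and plotting throughput and latency against the number of clients is one simple way to present scalability results in the report.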

Suggested Project List: You can certainly suggest a project of your own, but it should be similar in spirit to the ones on the list, and you should tell us which chapters of the textbook you hope to draw on in developing your solution. Class projects must not come from completely different courses or areas, but you are welcome to bring ideas from other research areas into your class project.

Build a high-performance Raft implementation, changing the design as needed. Possible extensions include batching, handling commutative operations in parallel, and dealing with a node that fails, comes back online after a while, and has to come up to speed.
Design and build a Byzantine-fault tolerant Raft implementation.
Build a DAPP: Use a blockchain to build something other than a crypto-currency.
Smartphone-based social network: Develop a privacy-preserving smartphone-based social network. Look at the Musubi paper out of Monica Lam's group at Stanford for inspiration.
E-voting system backend: Develop a distributed vote collection backend for an Internet voting system. Users log in throughout the election duration (e.g., 24 hours) and cast votes. Challenges include: Each vote must be counted once and only once. Vote data should be replicated for safety/posterity. Vote collection replica machines must agree on all votes cast, avoid duplicate or inconsistent votes, and agree on when the election terminates.
Mobile Phone Data Collection: Mobile phones today have many built-in sensors, giving scientists the opportunity to collect rich sensor data to study human behavior and lifestyle patterns. The project is to develop a global sensor data collection system that collects large amounts of sensor data from mobile phones and enables scientists to query and visualize this data. Challenges include: replicating data for availability and durability, figuring out which kinds of sensor data you would like to collect, and dealing with the intermittent connectivity of mobile phones.
Design and implement a bulletin board replicated across N nodes. Users can read postings on the bulletin board and write to the board. Issues to consider: Can users read/write to any bulletin node? Will it guarantee some kind of ordering of messages? [Guaranteeing ordering of messages makes this harder.]
Design and implement a simplified distributed banking system.
Design and implement a MUD game (players "roaming" in a "landscape" -- please no killing/blood) using multiple servers: replicated and/or partitioned game space.
Design and implement a file distribution service. Features might include: read-only, peer-to-peer?, lookup protocol, and parallelized I/O.
Design and implement a stateless file system: file server, client-side library, caching mechanisms.
Design and implement a secure "anonymous" storage service: key management, replicated and fragmented data (Do a Google search for Freenet and FreeHaven to get an idea about what these systems try to do. These are amongst the first systems trying to provide this service.)
Design and implement a server cluster for high availability (that is, focus on the "seamless" failover issue) [in the spirit of "build-an-amazon"]
Design and implement a server cluster for performance (that is, focus on load balancing) [in the spirit of "build-an-amazon"]
Online health monitoring system: Assume we have medical monitors installed in patients' homes and we want data from the monitors to be sent to the doctor's office and stored for long-term querying and use. Challenges include: the system is real-time -- data may need to reach the doctor's office and be seen by the doctor quickly, so that a response is generated on the fly and sent back as soon as possible.
Design and implement a distributed robust course registration system for the department's PMS.
Design and implement a uoa (distributed) social network for course comparison.
Re-implement one of the systems described in the papers discussed in class. For example, implement the Chord Distributed Hash Table routing algorithm. Features: Basic routing lookup, what happens when a node fails, replicate keys for higher availability or fault tolerance, etc. [Straightforward in the sense that you don't need to address the classic DS issues listed below from scratch as in other projects where you yourself are designing the system. Emphasis would be on evaluation and measurements here.]
Implement and measure a distributed algorithm of your choice, described in the book or in the literature, that we have not discussed in class.
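
To give a feel for the Chord item above (basic lookup and what happens when a node fails), here is a deliberately simplified Python sketch of successor lookup on an identifier ring. It is not the full Chord protocol: real Chord routes through per-node finger tables, whereas this sketch assumes a single sorted list of all node IDs is available. The names M, ring_id, and successor, and the 16-bit identifier space, are assumptions made for illustration.

```python
import hashlib
from bisect import bisect_left

M = 16  # number of identifier bits (illustrative assumption)


def ring_id(name: str) -> int:
    """Hash a node name or key onto the ring [0, 2**M)."""
    digest = hashlib.sha1(name.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** M)


def successor(node_ids, key_id):
    """Return the node responsible for key_id.

    node_ids must be a non-empty sorted list of node identifiers.
    The responsible node is the first one whose ID is >= key_id,
    wrapping around to the smallest ID at the end of the ring.
    """
    idx = bisect_left(node_ids, key_id)
    return node_ids[idx % len(node_ids)]


if __name__ == "__main__":
    nodes = sorted(ring_id(f"node-{i}") for i in range(5))
    key = ring_id("some-file.txt")
    owner = successor(nodes, key)
    print("key", key, "is stored on node", owner)

    # Simulate the owner failing: the key moves to the next node on the ring.
    survivors = [n for n in nodes if n != owner]
    print("after failure, key moves to node", successor(survivors, key))
```

A project along these lines would replace the global sorted list with per-node routing state and then measure things like lookup hop counts and recovery time after failures.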

More than one team can independently undertake the same project.

In your project, you should aim to tackle two or more distributed system issues/challenges we have discussed in class. DS issues include:

naming (how you find data/objects/servers/etc.)
communication (how will the components communicate amongst each other? RPC, raw sockets?)
replication (do you want to replicate data and/or components of your system, and why? The two primary reasons to replicate are reliability and performance)
consistency (if you share/replicate state, how is this state maintained so that system runs correctly?)
fault tolerance (in your system's design, what happens when a component fails, a network packet is lost, or a network partition prevents components from communicating?)
security (what are the threats to your system's correctness/performance? You may need to think about what happens if one of the components in your system starts behaving maliciously. E.g., in an overlay network, what happens if a node does not route a message to the next-hop neighbor? Can you do anything to mitigate such misbehavior?)
synchronization (do your components need to be synchronized in time, physically or logically?)
performance (what is your goal in terms of scalability -- e.g., number of clients supported, amount of data processed per time unit, total amount of data managed, stored, etc.)
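
As a small example of the logical-time option mentioned under synchronization, the following Python sketch shows a Lamport logical clock. The class and method names are hypothetical, and the sketch only covers the clock update rules, not message transport.

```python
class LamportClock:
    """Minimal Lamport logical clock for one process."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Advance the clock for a local event and return the new time."""
        self.time += 1
        return self.time

    def send(self):
        """Advance the clock and return the timestamp to attach to a message."""
        return self.tick()

    def receive(self, msg_time):
        """Merge the timestamp of an incoming message into the local clock."""
        self.time = max(self.time, msg_time) + 1
        return self.time


if __name__ == "__main__":
    a, b = LamportClock(), LamportClock()
    a.tick()                   # local event at process A
    stamp = a.send()           # A sends a message to B
    print(b.receive(stamp))    # B's clock jumps past the message timestamp
```

With these rules, if one event causally precedes another, its timestamp is smaller; the converse does not hold, which is one reason some designs use vector clocks instead.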

The idea is to think/reason about the above DS issues for your project and learn something from the experience. In your final report, you will note down what your design addresses and why, as well as what your design does NOT address and why. Legitimate reasons for NOT addressing something include: the application does not need it, or you assume away some aspects (e.g., your system is designed to target local-area-sized networks and hence you might assume network partition is unlikely).