There will be up to 5 points in the final grade for each winner. This contest consists of two stages: stage 1, design of your method, and stage 2, prototype implementation and demonstration. You can earn up to 2 points in stage 1 and up to 3 points in stage 2.
(New!!!) Considering the difficulty of the implementation, I will give 2 extra points to everyone on the champion team.
Important Deadlines and Dates
- 11:45pm on 10/23/09: Report for stage 1 due.
- 11:45pm on 11/13/09: Your implementation for stage 2 due.
- 11/19/09: demos and challenges. Time TBD.
Virtual cloud computing is emerging as a promising solution to Information Technology (IT) management
to both ease the provisioning and administration of complex hardware and software systems and reduce the
operational costs. Several industry and university leaders have recently presented possible implementations.
These include IBM (Blue Cloud), Google, Amazon (Elastic Compute Cloud – EC2), Microsoft (Azure),
and NCSU (Virtual Computing Lab – VCL).
As more Virtual Machines (VMs) come into use, particularly because VMs are easy to clone and snapshot, a large number of VM images is unavoidable. Managing and storing these VM images will become an issue.
Mirage is a VM image management system developed to address this issue. Mirage treats VM images as structured data, stored in a centrally managed repository. The following quote from the Mirage paper by Reimer et al. explains the general idea behind Mirage:
"A new storage format, the Mirage image format (MIF), exposes the rich semantic information currently buried in disk-image files. Disk-image files contain an implicit mapping from file name to file content (and file metadata). To access this mapping, one must have the complete image— for some tasks, the image must be started. By contrast, MIF decouples this mapping into a manifest that maps file names to content descriptors (and metadata) and a store that holds content."
One outcome of this design is that when one adds a VM image to Mirage and later checks it out, the physical (byte-level) layout of the checked-out VM image is not necessarily the same as that of the image originally checked in, though logically it is still the same image (i.e., the same files in the same directory structure). This creates a security problem: there is a need to sign a VM image and later allow users who check out the image to verify the signature. If the physical layout changes, simply signing a VM image as a file will not work.
Develop a method to sign a VM image that will recognize images checked out of Mirage as valid. In other words, if all files and directory structures in two separate images are exactly the same, both images should be considered valid and the signature should be valid for both of them. Otherwise, the signature should be invalid.
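To make the idea of a layout-independent signature concrete, here is a minimal sketch, not a contest solution. It assumes the image's file system has already been mounted read-only at a directory `root`, and it hashes a sorted list of per-entry records (relative path, entry type, permission bits, owner IDs, and file content or symlink target). The function names, the choice of metadata fields, and the use of an HMAC as a stand-in for a real public-key signature are all assumptions made for illustration. Anything outside the file tree, such as the partition table or boot sector, is not covered.

```python
import hashlib
import hmac
import os
import stat


def _file_hash(path, chunk_size=1 << 20):
    """SHA-256 of a regular file's bytes, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.digest()


def tree_digest(root):
    """Digest of the logical tree under root, independent of on-disk layout."""
    records = []
    for dirpath, dirnames, filenames in os.walk(root):  # symlinks are not followed
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            st = os.lstat(full)
            if stat.S_ISLNK(st.st_mode):
                kind, payload = b"l", os.fsencode(os.readlink(full))
            elif stat.S_ISDIR(st.st_mode):
                kind, payload = b"d", b""
            elif stat.S_ISREG(st.st_mode):
                kind, payload = b"f", _file_hash(full)
            else:
                kind, payload = b"o", b""  # devices, FIFOs, sockets: type only
            # Assumption: mode, uid, and gid are the metadata worth protecting.
            meta = "%o:%d:%d" % (stat.S_IMODE(st.st_mode), st.st_uid, st.st_gid)
            rel = os.fsencode(os.path.relpath(full, root))
            records.append((rel, kind, meta.encode(), payload))
    h = hashlib.sha256()
    for record in sorted(records):  # sorting removes any dependence on walk order
        for field in record:
            h.update(len(field).to_bytes(8, "big"))  # length prefix avoids ambiguity
            h.update(field)
    return h.hexdigest()


def sign_tree(root, key):
    """HMAC over the tree digest (stand-in for a public-key signature)."""
    return hmac.new(key, tree_digest(root).encode(), hashlib.sha256).hexdigest()


def verify_tree(root, key, signature):
    """Recompute the HMAC and compare in constant time."""
    return hmac.compare_digest(sign_tree(root, key), signature)
```

Two trees that contain the same files, directories, and metadata produce the same digest regardless of the order in which entries were created, while changing any file's content, name, or permissions changes it. Which attributes belong in the record is a design decision: every attribute left out is something an offense team can alter without detection.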
Submit a report describing your method. Your report should have margins of at least 1 inch, a font size of at least 10 pt, and at most 3 pages. Include at least 1 figure to illustrate your method.
Winning reports will be posted on the course website.
Implement a tool using any solution you pick. You can pick any of the posted methods, or develop one by yourself. In the latter case, you can reuse anything you have learned from the winning reports. You will be given access to VM images checked out of Mirage to test your programs.
- You have to work individually for stage 1, though discussion is encouraged.
- The first 10 individuals who submit a working solution will get 2 points each. A submission link will be created by the TA. Note that even if you are not among the first 10 students who submit a solution, you could still win if some of the earlier solutions are incorrect.
- You will work in teams in stage 2, where each team has up to 3 students.
- (Updated!) There are multiple types of vmdk files. In this contest, let's only consider raw vmdk files with ext3 file systems.
- Each team in stage 2 will play both offense and defense.
- In terms of defense, each team needs to implement a signing and verification tool that signs any given VM image and verifies the same image checked out from Mirage.
- The objective of offense is to create fake images that will be considered valid by other teams' tools. A reference VM image will be given to each team for offense purposes.
- Each team needs to modify the reference image in arbitrary ways to make fake images. For each fake image, the team needs to clearly explain the difference between the fake one and the original one.
- Each team is allowed to submit up to 10 fake images.
- A signing tool is considered valid if:
  - It signs and verifies images checked in to and out of Mirage successfully, and
  - It rejects the fake images given by the offense teams or the TA.
- The contest in stage 2 will be run using a point system.
  - After the submission deadline, we will rank the teams based on the timestamps of their submissions in ascending order. Each team will get base points equal to 3 x the number of teams behind it. For example, if there are 10 teams in the contest, the first team will get 27 points, the second team will get 24 points, etc.
  - All teams will have to sign an image and then verify the signature with another version of the image checked out of Mirage. Teams that fail to provide a valid signature for both images will drop out of the contest automatically.
  - For the remaining teams, each team can challenge any other team that it has not already challenged. The two teams then take turns testing their fake images against each other. If a team fails to recognize a fake image provided by the other team, it loses 5 points to that team. That is, the losing team will have 5 fewer points, and the winning team will have 5 more points.
  - If a challenged team refuses to accept the challenge, it loses 25 points to the challenging team.
- Each individual student in the top 5 teams will get 3 extra points in their final grade in CSC/ECE 574.
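The point system above can be sketched in a few lines of code. The function names and the dictionary of scores are made up for illustration; the numbers (3 points per team behind, 5 points per missed fake image, 25 points for a refused challenge) come from the rules.

```python
def base_points(num_teams):
    """Base points by submission rank: 3 x the number of teams behind each team."""
    return [3 * (num_teams - rank) for rank in range(1, num_teams + 1)]


def missed_fake(scores, loser, winner, points=5):
    """The loser accepted a fake image from the winner: 5 points change hands."""
    scores[loser] -= points
    scores[winner] += points


def refused_challenge(scores, refuser, challenger, points=25):
    """The refuser declined a challenge: 25 points go to the challenger."""
    scores[refuser] -= points
    scores[challenger] += points


print(base_points(10))  # [27, 24, 21, 18, 15, 12, 9, 6, 3, 0]
```

Note that the last team to submit starts with 0 base points, so an early submission is worth as much as several won challenges.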
You are provided with 5 VM images: the original image and four checked out from Mirage. Each image has two files: a file named "esx3-CentOS-v0.vmdk", which contains the meta-information, and a file named "esx3-CentOS-v0-flat.vmdk", which has the actual data (ext3 raw format). Decompress and untar each .tgz file (e.g., "tar xvfz Original.tgz"), and you will get a directory with those two files. Focus on esx3-CentOS-v0-flat.vmdk.
Note that if you start the VM in VMware, the vmdk file will change and will no longer be valid. Make sure you work on the files you downloaded from the course website.
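One simple way to notice an accidentally modified image is to record a checksum of the flat file right after extracting it and to compare against that value before each experiment. The sketch below is only a suggestion; the helper name `sha256_file` is made up, and the path in the usage comment assumes the archive unpacks into a directory named after it.

```python
import hashlib


def sha256_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


# Hypothetical usage: record once after untarring, compare later.
# baseline = sha256_file("Original/esx3-CentOS-v0-flat.vmdk")
# assert sha256_file("Original/esx3-CentOS-v0-flat.vmdk") == baseline
```

If the two values differ, download and extract the image again rather than testing against the changed copy.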
Additional Information for On-site Contest
- For the on-site competition, please prepare up to 10 fake images. Name these images <team name>-n-flat.vmdk. For example, team anatara's first image should be anatara-1-flat.vmdk. For each of your fake images, you have to clearly justify why it is fake in a written document. Name this file <team name>-justification.txt (e.g., anatara-justification.txt). Please use a plain text file.
- Each team should bring a laptop with its program installed on it. We will run your program on your own laptop.
- Darrell Reimer, Arun Thomas, Glenn Ammons, Todd Mummert, Bowen Alpern, and Vasanth Bala. Opening Black Boxes: Using Semantic Information to Combat Virtual Machine Image Sprawl. In Proceedings of the 2008 ACM International Conference on Virtual Execution Environments (VEE 2008).
Resources on VM image formats and Suggestions
- A script for checking whether two images are functionally equivalent. (Wrap protected) Note that this does not consider security threats. You can use it as a starting point, but don't count on it for security.
- Try to reuse existing file system tools (e.g., use fdisk to parse the partition table, mount the image to facilitate parsing).
- VMDK file specification (wrap protected)
- You may consider using the VMware Virtual Disk Development Kit, but you will have to parse the ext3 file system yourself. Don't try this if you are not sure what you are dealing with.
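As a companion to the fdisk suggestion above, the following sketch reads the classic MBR partition table directly from a raw image. It assumes the flat file starts with a standard MBR (boot signature 0x55AA at byte 510, four 16-byte entries starting at byte 446) and 512-byte sectors. Whether the contest images are laid out this way is an assumption you should confirm with fdisk. The function name and the returned dictionary keys are made up for illustration.

```python
import struct

SECTOR_SIZE = 512  # assumption: 512-byte sectors


def read_partitions(image_path):
    """Return the non-empty primary partition entries from an MBR."""
    with open(image_path, "rb") as f:
        mbr = f.read(SECTOR_SIZE)
    if len(mbr) < SECTOR_SIZE or mbr[510:512] != b"\x55\xaa":
        raise ValueError("no MBR boot signature found")
    partitions = []
    for i in range(4):
        entry = mbr[446 + 16 * i : 446 + 16 * (i + 1)]
        # Entry layout: status, CHS start, type, CHS end, LBA start, sector count
        status, _chs_start, ptype, _chs_end, lba_start, num_sectors = struct.unpack(
            "<B3sB3sII", entry
        )
        if ptype == 0:  # type 0 marks an unused slot
            continue
        partitions.append(
            {
                "index": i + 1,
                "bootable": status == 0x80,
                "type": ptype,  # 0x83 is the usual type for Linux file systems
                "offset_bytes": lba_start * SECTOR_SIZE,
                "size_bytes": num_sectors * SECTOR_SIZE,
            }
        )
    return partitions
```

The `offset_bytes` value is what a loop mount needs in order to mount a partition inside the image, so the existing file system tools can handle the ext3 parsing.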
To be posted.