Why Do You Need Snapshots for Distributed Experiments?
Jul 23rd
We recently demonstrated GENI-VIOLIN, a suspend/resume feature for GENI experiments built on live snapshots, at the GENI conference (GEC8). The project is a collaboration with Purdue University, and it leverages their existing work, VNSNAP, which is built on top of VIOLIN.
One of the most common questions we get is: why do you really need it? There are three big reasons.
- Fault Tolerance: If you take periodic “live” snapshots of your experiment and a hardware failure happens, you can simply roll back to the latest consistent snapshot and resume the experiment. This is especially critical for long-running jobs, since you don’t want to re-run the entire experiment and waste more resources and time.
- Debugging: How often do you see a distributed experiment fail with no way of recreating the problem? The difficulty arises partly because a distributed system is a lot more non-deterministic than a single system. What if you could go back in time and step through the experiment, like debugging a single-processor application? That’s what snapshots are for: you can go back to an earlier snapshot, replay the experiment from there, and hope to debug the problem.
- Management: This one is a favorite with data center operators. No matter how many resources you provision, there will always be a time when you want to run a higher-priority job quickly and run into a resource shortage. You don’t want to stop existing experiments or jobs, because you would lose all the work they have done. What if you could simply suspend them for some time, run the higher-priority job, and resume them later? That’s what GENI-VIOLIN provides.
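To make the fault-tolerance argument concrete, here is a minimal single-process sketch in Python. It is not how GENI-VIOLIN works internally; the `Experiment` class, the `run_with_snapshots` function, and the use of `copy.deepcopy` as a stand-in for a real snapshot are all assumptions made for illustration. It only shows how much work a periodic snapshot saves after one failure.

```python
import copy


class Experiment:
    """Toy long-running job (hypothetical): sums the integers 1..total_steps."""

    def __init__(self, total_steps):
        self.total_steps = total_steps
        self.step = 0
        self.accumulator = 0

    def advance(self):
        self.step += 1
        self.accumulator += self.step

    def done(self):
        return self.step >= self.total_steps


def run_with_snapshots(total_steps, snapshot_every, fail_at_step):
    """Run the job, inject one failure, and recover from the latest snapshot.

    Returns (final_result, steps_executed), where steps_executed counts
    every step that was actually computed, including the re-done ones.
    """
    exp = Experiment(total_steps)
    snapshot = copy.deepcopy(exp)  # stand-in for a real snapshot, taken at step 0
    failed_once = False
    steps_executed = 0

    while not exp.done():
        if not failed_once and exp.step == fail_at_step:
            # Simulated hardware failure: discard live state, roll back.
            failed_once = True
            exp = copy.deepcopy(snapshot)
            continue
        exp.advance()
        steps_executed += 1
        if exp.step % snapshot_every == 0:
            snapshot = copy.deepcopy(exp)  # periodic snapshot

    return exp.accumulator, steps_executed


result, steps = run_with_snapshots(total_steps=100, snapshot_every=10, fail_at_step=57)
print(result, steps)
```

With a snapshot every 10 steps and a failure at step 57, the job resumes from step 50 and redoes only 7 steps. With no snapshots at all, it would have to restart from step 0.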
I want to point out that taking a snapshot of a single virtual machine is pretty easy, and is available commercially right now. The difficulty in taking snapshots of distributed experiments lies in dealing with network state, such as packets that are still in flight between machines when the snapshot is taken. GENI-VIOLIN provides a solution that can generate consistent snapshots for distributed experiments. To learn more about GENI-VIOLIN, see a video of our GEC8 demo below.
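For readers wondering what "consistent" means here, the following Python sketch illustrates the textbook notion of a consistent global snapshot. It is not the GENI-VIOLIN or VNSNAP algorithm. The `NodeSnapshot` type and the message IDs are hypothetical. The idea is that a set of per-machine snapshots is inconsistent if some machine remembers receiving a message that no machine remembers sending. Messages that were sent but not yet received are fine, as long as they are saved or retransmitted.

```python
from dataclasses import dataclass, field


@dataclass
class NodeSnapshot:
    """State captured on one machine (hypothetical model for illustration)."""
    name: str
    sent: set = field(default_factory=set)      # IDs of messages sent before the snapshot
    received: set = field(default_factory=set)  # IDs of messages received before the snapshot


def orphan_messages(snapshots):
    """Messages recorded as received but never recorded as sent."""
    all_sent = set().union(*(s.sent for s in snapshots))
    all_received = set().union(*(s.received for s in snapshots))
    return all_received - all_sent


def in_flight_messages(snapshots):
    """Messages sent but not yet received: network state that must be preserved."""
    all_sent = set().union(*(s.sent for s in snapshots))
    all_received = set().union(*(s.received for s in snapshots))
    return all_sent - all_received


def is_consistent(snapshots):
    return not orphan_messages(snapshots)


# A sent m1 and m2; B was snapshotted after receiving only m1.
good = [NodeSnapshot("A", sent={"m1", "m2"}), NodeSnapshot("B", received={"m1"})]

# A was snapshotted before sending m2, but B was snapshotted after receiving it.
bad = [NodeSnapshot("A", sent={"m1"}), NodeSnapshot("B", received={"m1", "m2"})]

print(is_consistent(good), in_flight_messages(good))
print(is_consistent(bad), orphan_messages(bad))
```

Resuming from the second set of snapshots would leave B holding a message that A, after rollback, has not sent yet. That is why snapshotting each virtual machine independently is not enough.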