Posts tagged Research
I have been pretty lazy about blogging, partly because I wanted to write thoughtful posts like Richard Lipton does, but it is hard to write long, interesting essays. So, I have decided to take the easy way out in this 140-character Internet world.
I am going to list three very recent papers that I think every systems researcher in CS should be reading. I like these primarily because they are among the few well-written papers that illustrate and explain their ideas clearly. I would even say that everyone in CS should read them, but I am sure theoreticians have better things to do. For the record, none of these directly relates to my current research.
With that, here are the three papers.
- Reverse traceroute. Ethan Katz-Bassett, University of Washington; Harsha V. Madhyastha, University of California, San Diego; Vijay Kumar Adhikari, University of Minnesota; Colin Scott, Justine Sherry, Peter van Wesep, Thomas Anderson, and Arvind Krishnamurthy, University of Washington. [PDF]. NSDI 2010.
- Capsicum: practical capabilities for UNIX. Robert N. M. Watson, University of Cambridge; Jonathan Anderson, University of Cambridge; Ben Laurie, Google UK Ltd; Kris Kennaway, Google UK Ltd. [PDF]. USENIX Security 2010.
- The Case for Determinism in Database Systems. Alexander Thomson, Yale University; Daniel J. Abadi, Yale University. [PDF]. VLDB 2010.
I haven’t grokked all the details of these three, especially the third one, which some will probably consider a theory paper and which is denser than the others. I hope they are not a waste of your time, even if they don’t fall in your direct field of research. If you have to choose just one, I would go with the first.
Virtualization is a hot topic for research, and with the rise of cloud computing, it has gained even more attention. A great number of research papers are being published in major conferences. The following is an attempt to list some of this research and how it influenced real products or features. Some of the products were major open source projects before becoming purely commercial products. These are roughly in reverse-chronological order.
Note: I will be updating this over time. Let me know if you know of virtualization research papers that inspired or spawned real products.
| Research Topic | Research Paper | Commercial Product/Feature(s) |
| --- | --- | --- |
| Storage resource management | PARDA: Proportional Allocation of Resources for Distributed Storage Access. Ajay Gulati, Irfan Ahmad, Carl A. Waldspurger. USENIX Conference on File and Storage Technologies (FAST ’09) | vSphere 4.1; video of SIOC |
| End-host network virtualization | Crossbow Virtual Wire: Network in a Box. Sunay Tripathi, Nicolas Droux, Kais Belgaied, Shrikrishna Khare. Large Installation System Administration Conference, 2009 | Crossbow in Solaris |
| Network virtualization | OpenFlow: Enabling Innovation in Campus Networks. Nick McKeown, Tom Anderson, Hari Balakrishnan, Guru Parulkar, Larry Peterson, Jennifer Rexford, Scott Shenker, Jonathan Turner. CCR 2008 | OpenFlow |
| GPU virtualization | GPU Virtualization on VMware’s Hosted I/O Architecture. Micah Dowty, Jeremy Sugerman. USENIX Workshop on I/O Virtualization, 2008 | VMware desktop virtualization |
| NetChannel2 | NetChannel 2: Optimizing Network Performance. J. Renato Santos, G. (John) Janakiraman, Yoshio Turner, Ian Pratt. Xen Summit, Apr 2007 | NetChannel in Xen |
| IaaS platforms | Eucalyptus research project started by Rich Wolski | Eucalyptus |
| Performance profiling | Enforcing Performance Isolation Across Virtual Machines in Xen. Diwaker Gupta, Ludmila Cherkasova, Rob Gardner, Amin Vahdat. 7th ACM/IFIP/USENIX Middleware Conference, Melbourne, Australia, Nov 2006 | XenMon |
| Live migration | Live Migration of Virtual Machines. Christopher Clark, Keir Fraser, Steven Hand, Jacob Gorm Hansen, Eric Jul, Christian Limpach, Ian Pratt, Andrew Warfield. NSDI 2005 | Xen live migration |
| Para-virtualization | Xen and the Art of Virtualization. Paul Barham, Boris Dragovic, Keir Fraser, Steven Hand, Tim Harris, Alex Ho, Rolf Neugebauer, Ian Pratt, Andrew Warfield. SOSP 2003 | Xen 1.x |
| Memory ballooning | Memory Resource Management in VMware ESX Server. Carl Waldspurger. OSDI ’02 | ESX and Xen memory ballooning |
| Virtual machine monitors | Disco: Running Commodity Operating Systems on Scalable Multiprocessors. Edouard Bugnion, Scott Devine, Kinshuk Govil, Mendel Rosenblum. SOSP 1997 | VMware Workstation |
What is a client hypervisor? Client hypervisors are hypervisors that run on a user’s (client’s) desktop. A nice video showing the Xen client hypervisor should clarify its usage. Server hypervisors like VMware’s ESX Server and XenServer virtualize a physical server, allowing you to run multiple virtual machines. This has many benefits, including reduced costs from consolidation, easier management, and load balancing.
The question is: why would you want to virtualize your desktop? The use case is less clear, considering that desktops usually run only one operating system, such as Windows. Some benefits are:
- Easy management. If all desktops in an enterprise run in VMs, administrators can upgrade all of them at once, assuming the VMs are derived from a single gold image. This is not as simple as it sounds if you allow desktop users to install software, since the operating systems running in the VMs will diverge from the gold image.
- Saving energy. When a desktop running in a VM becomes idle, one can easily migrate it to a server and put the physical machine to sleep. Though this is conceptually simple, there are many issues, including reducing user disruption and consolidating desktops on the server. See my LiteGreen work for a thorough evaluation of such a system and how to solve some of these problems.
- Security. Technically, desktops running in VMs are more secure, since they don’t have direct access to hardware and can be monitored by the hypervisor to prevent malicious activity. Anti-virus and anti-malware software can be installed in the hypervisor, the host operating system, or management stub VMs, which can then monitor the desktop VMs. This is complicated by the fact that VMMs or hypervisors do not have the complete state of the operating systems running in the VMs. Virtual machine introspection (VMI) is an ongoing research field pursued by many security/virtualization researchers (e.g., Jiang’s work) that tries to solve these problems.
However, this is a double-edged sword, since hackers can build rootkits that run in a hypervisor too.
Another interesting use of client hypervisors is running different desktop applications in lightweight VMs (the Qubes project) for better application isolation.
- Mobile devices. This may be surprising, since mobile devices usually have less powerful hardware (though that is changing). The benefit is that mobile devices may be able to run proprietary applications in a lightweight VM, providing more security and support for legacy applications. The VMs can also be moved to a cloud if needed.
The biggest disadvantage of client hypervisors is performance overhead and reduced user experience due to the lack of direct access to display hardware. For example, running a game like Quake in a VM is not what you are looking for when you buy a powerful graphics card. GPU virtualization is picking up, but it will take some time to mature. Researchers are working on allowing VMs to take advantage of graphics acceleration, and GPU vendors will have to provide hardware support for virtualizing graphics processors, similar to Intel’s VT-x extensions for CPUs.
In the virtualization community, live migration of virtual machines is pretty much considered a “default” mature feature of any hypervisor product. All major vendors, including VMware, XenSource/Citrix, and Microsoft, have products that support it. I consider it a major success for the virtualization research community, since the first academic paper on it was published not so long ago, at NSDI 2005. VMware’s VMotion technology (I think) predates this paper, but its technical details were largely unknown.
This post is partly inspired by some of the questions I received during a recent LiteGreen talk. Non-virtualization folks seem to misunderstand some aspects of live migration, so in this post I will explain how it works and some “gotchas” in using it.
Alright, what is live migration? Migrating a virtual machine simply means moving a VM running on one physical machine (let’s call it the source node) to another physical machine (the target node). The trick is to do this while the VM is running on the source node, and without disrupting any active network connections even after the VM has moved to the target node. It is considered “live” because the original VM keeps running while the migration is in progress. The huge benefit of live migration is the very small downtime, on the order of milliseconds.
How to achieve live migration
To move a VM from the source node to the target node, we need to move its CPU state, memory contents, storage contents, and network connections.
- Migrating CPU state has been extensively researched in the context of process migration. See Berkeley Lab Checkpoint/Restart (BLCR) for implementation details.
- Migrating memory contents is a bit tricky, considering that the VM on the source node is still running and modifying its memory. The idea is to copy the memory contents iteratively, sending only the “delta” of changed pages to the target node. At some point, only a small delta remains to be copied. At that stage, the VM on the source node is paused, the final delta is copied, and the VM is resumed on the target node. This brief pause is what causes the downtime.
- Migrating storage contents is similar to memory, but requires far more time; the migration may take on the order of minutes, and it is not easy to guarantee millisecond downtimes with storage migration. All current commercial products side-step this problem by using centralized storage (e.g., NFS, iSCSI, or Fibre Channel SANs) to host the VM images: storage contents don’t have to be migrated if both the source and target nodes are connected to the centralized storage. There are a few research solutions (1, 2, 3) that try to address this problem.
- Migrating network connections is simple if you assume that all nodes are in the same IP subnet. When the VM is migrated to the target node, it simply sends an ARP broadcast announcing that its IP address has moved to a new physical (MAC) address. Since this happens at the boundary between Layer 2 and Layer 3 of the network stack, the transport layer is oblivious to the change and TCP connections survive the migration. As a result, applications see no disruption in their network connections. Clearly, this approach doesn’t work if VMs have to cross subnets. There are many ways (1, 2) to solve this problem, but all of them have significant performance implications.
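The iterative pre-copy idea for memory migration can be sketched as a toy simulation. All the numbers here (page counts, dirty rate, thresholds) are made up for illustration and are not taken from any real hypervisor, which would track dirty pages via hardware dirty bits or shadow page tables:

```python
def precopy_migrate(total_pages=100_000, dirty_rate=0.05,
                    stop_threshold=256, max_rounds=30):
    """Simulate iterative pre-copy live migration.

    dirty_rate is the fraction of pages sent in a round that get
    re-dirtied while that round's copy is in flight.
    Returns (rounds, total_pages_sent, downtime_pages).
    """
    pending = total_pages   # round 0: the whole memory image is "dirty"
    sent = 0
    rounds = 0
    while pending > stop_threshold and rounds < max_rounds:
        sent += pending                      # ship the current dirty set
        pending = int(pending * dirty_rate)  # pages dirtied meanwhile
        rounds += 1
    # Stop-and-copy: pause the VM, send the final delta, resume on the
    # target. The size of `pending` here is what determines the downtime.
    return rounds, sent + pending, pending
```

With a modest dirty rate the delta shrinks geometrically, so only a few rounds are needed before the remainder is small enough to copy during a brief pause. A VM that dirties memory as fast as it is copied (dirty_rate near 1) never converges, which is why real hypervisors cap the number of rounds and then force a stop-and-copy.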
The success and popularity of live migration lie in its very small downtime, and some of the limitations mentioned above are inherent to achieving that goal.
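As a concrete illustration of the ARP trick in the list above, here is how a gratuitous ARP frame ("my IP now lives at this MAC") could be built by hand. The helper name and the addresses are hypothetical; hypervisors emit this frame internally on the target node:

```python
import socket
import struct

def gratuitous_arp(mac: str, ip: str) -> bytes:
    """Build a broadcast Ethernet frame carrying a gratuitous ARP
    request announcing that `ip` is now reachable at `mac`."""
    hw = bytes.fromhex(mac.replace(":", ""))
    addr = socket.inet_aton(ip)
    bcast = b"\xff" * 6
    # Ethernet header: dst = broadcast, src = our MAC, EtherType 0x0806 (ARP)
    eth = bcast + hw + struct.pack("!H", 0x0806)
    # ARP header: hw type 1 (Ethernet), proto 0x0800 (IPv4),
    # hw addr len 6, proto addr len 4, opcode 1 (request)
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    # Gratuitous: sender and target protocol addresses are both our own IP
    arp += hw + addr + bcast + addr
    return eth + arp
```

Every switch and host on the subnet that sees this frame updates its forwarding tables and ARP caches, which is why packets for existing TCP connections start flowing to the target node without the transport layer noticing anything.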
Benefits of live migration
One of the primary use cases for live migration is resource management in cloud computing. For example, cloud providers like Amazon EC2 have thousands of VMs running in their data centers. To save energy and cost, and to balance load, they can move VMs around using live migration without disrupting the customer applications running inside them. How to do this efficiently (or optimally) is a big research question, and some solutions (1) are available.
Frequently asked questions about live migration
Before I conclude, some answers to frequently asked questions regarding live migration.
- What is the usual VM downtime? The original Xen paper reported downtimes as low as 60 ms for specific workloads. The downtime depends on the application running in the VM, and network-heavy applications may see disruptions even when it is small.
- What is the total migration time? This is where most people get confused: the total time it takes to perform a live migration is different from the downtime. It is usually on the order of tens of seconds (roughly 10–120 seconds), depending on how heavily the VM on the source node is modifying its memory.
- How is live migration different from suspend/resume? Suspend/resume is similar to migration: you can suspend a VM to disk (on centralized storage) and then resume it on a different physical machine. This works, but it is not “live”, and active network connections clearly cannot survive it.
- How is live migration different from migrating to the cloud? This is an unfortunate source of confusion, since migration means different things to different people. When people talk about migrating to the cloud, they usually mean moving computation and data to a cloud like Amazon EC2 or Windows Azure. That is not live, and it may not be VM migration at all.
We received HP’s open innovation research grant (I co-authored the proposal) for next year. Yay!
This post is motivated by questions from some of the new grad students looking for research topics. Here are a few things that can help you in this quest.
Follow your passion
Often, when you take an undergrad course, you wonder about a few things. Why is this system designed like this? Why can’t I use that sorting algorithm? etc. If you are more curious, you probably use Google to find more information, and you develop an interest. As you learn more, you realize that among the sea of questions, there are a few that you really want to go after. Go for it!
Learn the state of the art
“State of the art” is probably the most misused phrase in marketing, but in research it has a very concrete meaning. When someone says they understand the state of the art, it means they know everything that has been done so far and the latest technology, which is probably the best available.
A good way to learn is to start reading the latest publications from reputable conferences. In CS, journals are mostly treated as archival, so it is not that important to read them. Finding reputable conferences in a field is pretty easy: good conferences can be found from Citeseer impact ratings and the rankings maintained here and here.
The next step is to start reading publications. Reading papers is an art in itself and deserves another long post. First, try to find papers that interest you. One problem newcomers face is not understanding a paper fully, because it points to prior work they are not aware of. You have to recursively read the papers on prior and related work and understand how people solved the problem. It’s a difficult and sometimes boring task, but you will get better at it.
In the end, you should be able to summarize the state of the art, from the first work that attacked the problem to the latest. Often the solutions have trade-offs, and there may not be a single best one. Your job is to figure out why something works and why something else does not.
You may even find that there is no solution, and there you have your research topic.
Broad and narrow enough
The topic you choose should be broad enough to pursue for two or three years, yet narrow enough for you to solve completely. For example, building the next-generation search engine is a topic for multiple PhDs, but developing better clustering algorithms might be a good topic.
Follow your advisor
Advisors vary in how they assign topics to students. Some have a specific research topic in mind and want you to pursue that particular path. Others (like my advisor, Kang Shin) give you complete freedom to choose a topic. There are benefits and downsides to both approaches.
If your advisor gives you a specific research topic, and if you like it, you are all set. You can start reading papers related to that topic, and start formulating a research problem. Your advisor is probably aware of the state of the art and might also have thought of possible solutions. The downside is that you are stuck with it, and you may feel like you are forced in a certain direction.
Free-form topic choosing gives you great flexibility. Its benefits are obvious: you can choose whatever you like within a certain area (say, software systems). Some people (like me) have strong opinions on how to solve certain problems and want to pursue them. However, you are bound to make some mistakes in the process. It is not always easy to judge the state of the art, nor to find a concrete research problem in a big area. I originally wanted to work on grid scheduling, but I switched after reading a survey paper with 640 references. This will surely waste some of your time, but the process is highly rewarding.
Work with a fellow graduate student
If you are part of a research group (usually your adviser’s), you will see senior graduate students who have already chosen a topic and/or are close to graduation. They have done the hard work and understand the pitfalls in research. Often, they can give you a problem that can be a PhD topic in itself. It is also a good way to learn about an area you are not familiar with; there might even be a project continuing what your fellow student is doing. The downside is that the topic might be too narrow and not big enough for a PhD. Nevertheless, the experience is worth it.
Attend reading groups
Most universities have research students who gather in groups to discuss various research topics. The software systems reading group at UMich is a great place to mingle with other systems students. Usually students, and sometimes guest speakers or professors, present a paper, and the group has a free-flowing discussion about the work. You can even present a topic yourself and gather opinions from your peers. You will also learn how to support your arguments and how to criticize rationally. Also, the free food is great.
If your university does not have one, you can start one. It doesn’t take much effort to gather a few students with similar interests and start reading papers. Everyone has to go through the latest publications anyway, so why not do it together?
Don’t get too attached
After you decide on a topic and spend a few months learning the state of the art, you may find that someone has already solved it, or that it is too narrow. Don’t get too attached to the topic. It is OK to choose a new one; many PhD students change their core focus.
If all else fails, you can try the systems topic generator, the crypto topic generator (http://www-cse.ucsd.edu/users/mihir/crypto-topic-generator.html), or the classic Douglas Comer topic generator.
New: Our tech report received 10,000 hits in just the month of May. See a screenshot of stats below. Yay!
Our recent tech report on the performance evaluation of Xen vs. OpenVZ is generating quite a buzz. It’s the second most viewed tech report from outside HP. Virtual Strategy Magazine picked up on it and wrote an article. A few people, including Simon Crosby, CTO of XenSource, responded. Slashdot picked it up yesterday. I am pleasantly surprised that there is not much FUD in the Slashdot comments; some of them are quite informative. Virtualization.info and the OpenVZ blog have written about it too.
Let me point out a few facts straight away. Both technologies have their place. Xen trades performance for security and isolation, and if you want to run multiple operating systems, Xen or VMware is the only way. OS-level container technologies like OpenVZ easily beat hypervisor-based technologies on performance, but they may not be appropriate for every situation.
My friend Pranav wrote a blog post on a topic that we discussed long ago: Is Theory better than Systems? As he mentioned, we were both passionate and somewhat arrogant in arguing the merits of one over the other.
I have matured a lot in the last few years, and I can finally claim to have done some real “systems research”. Interesting post, Pranav, but it misses some important points, maybe because you haven’t experienced them yet. Here it goes.
Definition: Discipline A is superior to discipline B if it improves the lives of a greater number of people.
Good definition. Though vague, it’s enough for our argument. Let’s take your first point: CS theory is much closer to math, so it’s superior. There’s no doubt that it’s closer to math, but how does that make it superior? It may well be easier to code than to prove a theorem, but systems research is not about coding or implementation.
Next, talking about first principles, you are absolutely right: if we were to build a system from first principles, it would certainly be more general and widely applicable. But where are those first principles? In my research, I (and my mentor Xiaoyun Zhu, a control theorist) argue that there are no first-principles models in computer systems. If we had such models, we could simply feed them into an intelligent computer and it would spit out the system specifications for us. Coming back to superiority: would you rather have years and years of results that never make it into practice, or a system that works and may actually help in building those first principles? If we are talking about improving people’s lives, a good working system brightens the lives of far more people, including theory researchers. Let’s take a concrete example: sure, NP =? P is a problem of extraordinary importance. Solving it would change the face of computer science, and it is surely a far more difficult problem than designing UNIX; in fact, there is no comparison. BUT how did theory researchers over the past 20 years help people’s lives? Listing a few results like the RSA algorithm doesn’t cut it; there are equally significant results in systems (UNIX, C, Ethernet, TCP/IP, etc.).
In fact, systems research affects far more people than theory research. If we settle NP =? P, it surely will affect a lot of people, but in the process of proving it, theory researchers contribute very little to the lives of actual people. To give another example, it took systems people to tell all those theorists, after years of work on millions of scheduling algorithms, that round-robin works better than many of them because of its simplicity. It would have been much simpler if the first theorist who thought of a scheduling algorithm had actually built a real system. We could have made better use of those “sharper” brains.
Definition: Discipline A is superior to discipline B if it is harder to publish a result in leading conferences/journals in discipline A than it is to publish an equivalent result in discipline B.
I had a big laugh over this. There are many more systems conferences/journals than theory ones because of the nature of the field. It’s certainly easy to publish any random idea in any random conference, probably even in theory research. It’s as difficult to publish in the best systems conferences as it is in the best theory conferences; in fact, since more people work in systems, it’s arguably harder. It takes far more work to design, build, and evaluate a system than to write up a random theory idea. Theorists don’t have to prove that their result works in practice, and very few people will actually know the nitty-gritty details of someone’s proof. On the other hand, if you write a systems paper, it will be torn to pieces by researchers who have actually built similar systems. The devil is in the details.
Let me digress a little. Long ago, Pranav talked about how easy it is for anyone working in computer science to read a systems paper, whereas a theory paper would not even be understandable to many. There lies the superiority of systems, I would say. It is easy for anyone to look at a new system, use it, and even experiment with it. This actually improves the system a lot, considering that every computer scientist can understand what it’s doing. BUT it really takes brilliant systems researchers to design and build it. Many computer science people (including a few systems researchers) don’t actually understand the real details of building a system that works. These are NOT implementation details, far from it.
To conclude, there seems to be some confusion over the superiority of results in systems versus theory. There is absolutely no doubt that the results theory produces once in a while are extraordinary and very significant. BUT they are few and come rarely. As Pranav mentioned, I have the utmost respect for anyone who really believes they can solve these fundamental problems. While theorists ponder these grand problems, there are equally tough problems in systems, with no corresponding theory for them because there are no first principles. As a result, significant systems results also come slowly, but systems research contributes far more to the good of computer science and actually helps the lives of common people.
Philosophically speaking, both theorists and systems researchers are working to make computer science better; along the way, systems researchers simply contribute much more to the smaller problems.
HP Labs has been very generous and sent us 16 C-class blades worth about $150K. My professor and I were pleasantly surprised when we got the gift. The blades are awesome, best of breed; the ones I am currently using at HP Labs are actually older! They are quite powerful and, once set up, can be managed entirely remotely with iLO.
Obviously, we are being very careful in setting them up here. For the past few days, my colleague Howard and I have been looking at the power and cooling requirements, the rack setup, and so on. It’s going to take some time to set it all up. We plan to build a mini data center. Yay!
I am giddy, yes, giddy with excitement at seeing the mail from the EuroSys chair that my paper was accepted to EuroSys 2007. It’s one of the good systems conferences and is considered to be approaching the quality of OSDI/SOSP (the top systems conferences). Also, another paper of mine, co-authored with HP folks, was accepted to IM 2007; it’s a sub-project that originated from the main EuroSys paper.
It was a lot of hard work, as usual, but it’s worth it. Next stop: SOSP! To put it in the words of my HP colleague Mustafa:
… doing a SOSP paper with CPU+IO+network would be more important now that we have the genie out of the box
Go go go …