Virtually the best blog on the web!
Research
Computer Science Conferences – Statistics and Acceptance Rates
Feb 9th
There are a bunch of sites with statistics on computer science conferences, but it always bothered me that there was no central place to search for conference statistics like acceptance rates. I also wanted to have a visual display of statistics similar to Google Finance. In this age of AJAX and Web 2.0, the web page should also be very fluid.
My first attempt to build such a web page is here. Right now, it scrapes the data from Kevin Almeroth’s excellent stats page and presents it in a visual form. The scraping is still a bit rough, so there might be some errors. I would like to add more features like comparing conferences, searching and adding more conferences.
Leave your comments and suggestions below.
Virtualization Bibliography
Feb 2nd
Virtualization is a hot topic and the field is growing at a rapid pace. There are hundreds of papers on each sub-topic, and this list is intended to be a good starting point for someone starting in virtualization research.
Surveys/Books
These are good starting points if you are just learning about virtualization.
- Survey of Virtual Machine Research. Robert P. Goldberg. IEEE Computer, June 1974, pp 34-45. [PDF]. One of the oldest surveys of virtualization research.
- A Survey on Virtualization Techniques. Susanta Nanda and Tzi-cker Chiueh [PDF]. A more modern survey of virtualization techniques. The paper lists many of the techniques, but doesn’t really explain how they differ. Still, it is a good read for budding virtualization researchers.
- Virtual Machines: Versatile Platforms for Systems and Processes. Jim Smith and Ravi Nair. Amazon book link
Virtualization Overview
These are the papers that will help in understanding fundamentals and concepts of virtualization.
- Xen and the art of virtualization. Barham et al. This is the classic SOSP paper on para-virtualization. [PDF].
- When Virtual is Better Than Real. Peter Chen and Brian Noble. This is a great short article explaining the benefits of virtualization. ACM link.
- Disco: Running Commodity Operating Systems on Scalable Multiprocessors. E. Bugnion, S. Devine, and M. Rosenblum. This paper is considered to be the first paper that revived the virtualization concepts pioneered by IBM. ACM link.
- Running multiple operating systems concurrently on an IA32 PC using virtualization techniques. Kevin Lawton. This is a great article on the difficulties involved in virtualizing the x86 platform. link.
- Formal requirements for virtualizable third generation architectures. Popek and Goldberg. The classic paper explaining the requirements for virtualizing a specific ISA. ACM link. If you want light reading, check the Wikipedia explanation of the requirements.
- Scale and Performance in the Denali Isolation Kernel. Andrew Whitaker, Marianne Shaw, and Steven D. Gribble. Denali is another of the early papers that rejuvenated virtualization research. Denali shows the containers (or vservers) concept, which is used in OpenVZ and VServers.
- The Exokernel Operating System Architecture. Dawson Engler’s thesis. [PS]. This is more of OS research, but a great read for understanding some of the techniques (thin hypervisor, pass through etc.) that are used in modern virtualized systems.
CPU Virtualization
Virtualization of CPU is provided by a CPU scheduler that provides the illusion of multiple CPUs (or VCPUs). Scheduling has a long and rich history. Below are a few links in relation to virtualization.
- Xen’s credit scheduler. Xen’s credit scheduler is a proportional fair scheduler, which is a pretty big improvement (especially in SMP environments) over the old SEDF scheduler.
- Xen’s scheduler is partly inspired by Carl Waldspurger’s lottery scheduling. The paper, Lottery Scheduling: Flexible Proportional-Share Resource Management by Carl Waldspurger, is a must read. [PDF].
- Comparison of the Three CPU Schedulers in Xen – Ludmila Cherkasova, Diwaker Gupta, Amin Vahdat – Performance Evaluation Review. Vol 35, Number 2. Sept 3, 2007. This provides a great comparison of CPU schedulers and provides insight into why handling I/O is complicated. [PDF].
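To make the proportional-share idea concrete, here is a minimal sketch of the basic lottery mechanism: each client holds some tickets, and every scheduling quantum goes to the holder of a randomly drawn ticket. This is only an illustration of the general idea, not Xen's credit scheduler; the VM names, ticket counts, and function names are made up for the example.

```python
import random
from collections import Counter


def hold_lottery(tickets, rng):
    """Pick one client; the chance of winning is proportional to its tickets."""
    total = sum(tickets.values())
    winning = rng.randrange(total)  # winning ticket number in [0, total)
    running = 0
    for client, count in tickets.items():
        running += count
        if winning < running:
            return client
    raise RuntimeError("no winner; ticket counts must be positive")


def simulate(tickets, quanta, seed=0):
    """Hold one lottery per quantum and count how often each client runs."""
    rng = random.Random(seed)
    return Counter(hold_lottery(tickets, rng) for _ in range(quanta))


# Hypothetical VMs: vm_a holds 60% of the tickets, the others 20% each.
tickets = {"vm_a": 300, "vm_b": 100, "vm_c": 100}
wins = simulate(tickets, quanta=10_000)
shares = {vm: wins[vm] / 10_000 for vm in tickets}
```

Over many quanta the observed shares approach the ticket ratios, which is the property that makes this kind of scheduler easy to reason about.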
Network Virtualization
Network virtualization has been gaining momentum in recent years. I would just point you to the following survey as a starting point.
- Survey of network virtualization. N.M. Mosharaf Kabir Chowdhury and Raouf Boutaba. [PDF]. This is a pretty comprehensive and well-written survey on network virtualization. Mosharaf also has a nice network virtualization page, listing lots of useful papers.
Device Virtualization
- Xen’s device driver virtualization.
- Virtualizing I/O Devices on VMware Workstation’s Hosted Virtual Machine Monitor. Jeremy Sugerman, Ganesh Venkitachalam, and Beng-Hong Lim. As the name suggests, this paper explains VMware Workstation’s mechanisms for virtualizing devices. [PDF].
- GPU Virtualization on VMware’s Hosted I/O Architecture. Micah Dowty, Jeremy Sugerman. This paper by VMware folks explains how VMware implements GPU virtualization. Great starting point for GPU virtualization. [PDF]. Andres’ paper on VMM-independent graphics acceleration is good follow-up reading.
Memory Virtualization
- Memory Resource Management in VMware ESX Server. Carl A. Waldspurger. [PDF]. Classic paper on memory virtualization. This paper introduces memory ballooning, which is a great technique used in commercial platforms as well.
- Difference engine: harnessing memory redundancy in virtual machines. Diwaker Gupta et al. [PDF]. A great follow-up paper that talks about sharing memory across VMs. Incidentally, both these papers have received best paper awards.
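As a rough illustration of what "sharing memory across VMs" means, the sketch below keeps a single copy of pages whose contents are identical and breaks the sharing when a guest writes to one of them. It is a toy model under my own assumptions (the class and method names are invented, and it matches pages by hash alone), not the mechanism of either paper; real systems also compare full page contents after a hash match and go further, for example with sub-page sharing and compression.

```python
import hashlib

PAGE_SIZE = 4096


class SharedPageStore:
    """Toy content-based page sharing: identical pages are stored once."""

    def __init__(self):
        self.frames = {}    # digest -> page contents (the single stored copy)
        self.refcount = {}  # digest -> number of guest pages mapped to it

    def map_page(self, data):
        """Map a guest page, reusing an existing frame if contents match."""
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self.frames:
            self.frames[digest] = data
            self.refcount[digest] = 0
        self.refcount[digest] += 1
        return digest

    def write_page(self, digest, new_data):
        """Copy-on-write: drop the old mapping, then map the new contents."""
        self.refcount[digest] -= 1
        if self.refcount[digest] == 0:
            del self.frames[digest]
            del self.refcount[digest]
        return self.map_page(new_data)

    def frames_in_use(self):
        return len(self.frames)


# Two hypothetical VMs with the same zero page and the same "kernel" page.
zero, kernel = bytes(PAGE_SIZE), b"K" * PAGE_SIZE
store = SharedPageStore()
vm1 = [store.map_page(p) for p in (zero, kernel, b"A" * PAGE_SIZE)]
vm2 = [store.map_page(p) for p in (zero, kernel, b"B" * PAGE_SIZE)]
frames_before_write = store.frames_in_use()  # 6 guest pages, fewer frames

# vm2 writes to its zero page, so that page stops being shared.
vm2[0] = store.write_page(vm2[0], b"C" * PAGE_SIZE)
frames_after_write = store.frames_in_use()
```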
Migration/Cloning
- Live Migration of Virtual Machines. Clark et al. One of the first academic papers discussing how to do live migration, which is implemented in Xen. VMware had supposedly implemented VMotion before this was published. [PDF]
- The design and implementation of Zap: a system for migrating computing environments. Osman et al. Zap predates the Xen paper and talks about sandboxing processes that can be migrated. This is more like container virtualization than full system virtualization. ACM link
- SnowFlock: rapid virtual machine cloning for cloud computing. Andres Lagar-Cavilla et al. Great paper to read to understand cloning of virtual machines. SnowFlock link.
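For readers new to live migration, the core idea in the Clark et al. paper is iterative pre-copy: send all memory while the VM keeps running, re-send whatever got dirtied in the meantime, and pause the VM only for a short final stop-and-copy. The sketch below simulates just the page accounting. The function name, the threshold, and especially the dirtying model are my own assumptions for illustration, not numbers or code from the paper.

```python
def precopy_migrate(num_pages, dirtied_in_round, stop_threshold, max_rounds):
    """Simulate pre-copy rounds.

    Returns (pages sent in each live round, pages sent while the VM is paused).
    dirtied_in_round(r) yields the pages the guest dirtied during round r.
    """
    to_send = set(range(num_pages))  # the first round sends every page
    sent_per_round = []
    for round_no in range(max_rounds):
        sent_per_round.append(len(to_send))
        # While this round was on the wire, the running guest dirtied pages.
        to_send = set(dirtied_in_round(round_no))
        if len(to_send) <= stop_threshold:
            break  # small enough to finish during a brief pause
    # Stop-and-copy: pause the VM and send whatever is still dirty.
    return sent_per_round, len(to_send)


# Made-up dirtying model: each round is shorter, so fewer pages get dirtied.
rounds, final_pages = precopy_migrate(
    num_pages=1024,
    dirtied_in_round=lambda r: range(512 >> r),
    stop_threshold=50,
    max_rounds=10,
)
```

The trade-off this exposes is total pages transferred (the sum of all rounds) against downtime (the final count); a guest that dirties memory faster than the network can send it never converges, which is why a round limit is needed.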
Resource Management/Automation
- Carl Waldspurger’s PhD thesis is a great place to start for understanding resource management. MIT link.
- Automated control of multiple virtualized resources. Padala et al. Disclaimer: this is my own paper. It’s a good place to start if you want to learn about automating resource management for virtual machines. The control theory aspects are a bit complex, but you can ignore them and focus on the issues in automating resource management. [PDF].
- Black-box and Gray-box Strategies for Virtual Machine Migration. Wood et al. This is a follow-up to the live migration paper. The paper talks about strategies for automating the migration of VMs to meet specific goals. USENIX link.
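If the control theory sounds intimidating, the basic feedback loop behind automated resource management is simple: measure, compare against a target, and nudge the allocation. The sketch below is a deliberately tiny single-resource controller that keeps a VM's CPU utilization near a target. It is not the multi-input, multi-output controller from the paper above; the function names, gain, limits, and workload are all assumptions picked for illustration.

```python
def next_allocation(allocation, measured_usage, target_util=0.8,
                    gain=0.5, lo=5.0, hi=100.0):
    """Move the CPU cap part of the way toward the cap that would put
    utilization (usage / allocation) at the target."""
    desired = measured_usage / target_util
    new_allocation = allocation + gain * (desired - allocation)
    return max(lo, min(hi, new_allocation))


def run(demand, steps, allocation=10.0):
    """Closed loop against a hypothetical workload with constant CPU demand."""
    history = []
    for _ in range(steps):
        usage = min(demand, allocation)  # a VM cannot use more than its cap
        allocation = next_allocation(allocation, usage)
        history.append(allocation)
    return history


# A workload that wants 40% of a CPU, starting from a 10% cap.
history = run(demand=40.0, steps=60)
```

With these numbers the cap grows while the VM is saturated and then settles near 50%, the point where a 40% demand corresponds to 80% utilization. The hard parts the papers deal with are multiple resources, multiple VMs, and workloads that change over time.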
Paper Reading Recommendations for Systems Folks
Sep 3rd
I have been pretty lazy about blogging, partly because I wanted to write really thoughtful posts like Richard Lipton does, but it is really difficult to write long interesting essays. So, I have decided to take the easy way out in the 140 character Internet world.
I am going to list three very recent papers that I think every systems researcher in CS should be reading. I like these primarily because they are among the few well-written papers that illustrate and explain their ideas in a clear manner. I would even say that everyone in CS should read these, but I am sure theoreticians have better things to do. For the record, none of these directly relate to the research I am doing currently.
With that, here are the three papers.
- Reverse traceroute. Ethan Katz-Bassett, University of Washington; Harsha V. Madhyastha, University of California, San Diego; Vijay Kumar Adhikari, University of Minnesota; Colin Scott, Justine Sherry, Peter van Wesep, Thomas Anderson, and Arvind Krishnamurthy, University of Washington. [PDF]. NSDI 2010.
- Capsicum: practical capabilities for UNIX. Robert N. M. Watson, University of Cambridge; Jonathan Anderson, University of Cambridge; Ben Laurie, Google UK Ltd; Kris Kennaway, Google UK Ltd. [PDF]. USENIX Security 2010.
- The Case for Determinism in Database Systems. Alexander Thomson, Yale University; Daniel J. Abadi, Yale University. [PDF]. VLDB 2010.
I haven’t grokked all the details of these three, especially the third one, which is denser and which some will probably consider a theory paper. I hope these are not a waste of time if they don’t fall into your direct field of research. If you have to choose just one out of these, I would go with the first one.
Computer Systems Conference Rankings
Jul 12th
Update: You can see conference statistics in graphical form here
Conference rankings are a contentious topic, and it is often difficult to directly compare two conferences, because each conference has a unique flavor, community and history. In general, it’s easy to identify the top 1 or 2 conferences in a field, but it gets murky as you go down. There are a bunch of webpages (see below) where you can look at the rankings. As for defining the field of computer systems, that is another blog post. For now, read Section 3 of the Eurosys.org white paper, and Liviu’s presentation on what constitutes good systems research.
This post is my view on the systems conference rankings. Disclaimer: these are just my personal opinions. Quality of conferences varies over time, and these are not set in stone. In my opinion, all the conferences I mentioned here are reputed and great conferences to publish in.
For general systems (operating systems, distributed systems)
| Conference name | Acceptance rate | Notes |
|---|---|---|
| OSDI / SOSP | 11.8% – 21% | These two conferences, which alternate every year, are widely considered the top conferences in systems. SOSP has a rich history and is considered slightly more prestigious. Both are highly selective and very influential in “real” systems. |
| Eurosys | 16.9% – 21% | Eurosys is still a new conference, but is quickly becoming a premier systems conference. Though it is hosted in Europe every year, published papers come from all around the world, with the majority coming from systems research groups in the US. |
| USENIX Annual Technical Conference | 16.3% – 30% | USENIX ATC is another great conference, with lots of good systems work. Its quality has varied over time and is slightly inconsistent. |
For networked systems
| Conference name | Acceptance rate | Notes |
|---|---|---|
| SIGCOMM | 10% | Widely considered the most difficult conference (in networking) to publish in, SIGCOMM has a rich history, and great papers every year. Earlier SIGCOMM conferences accepted more papers, but the acceptance rates have dropped to 10-12% in the past few years. |
| NSDI | 19 – 21% | Compared to SIGCOMM, NSDI has an interesting mix of systems and networking flavor. |
| CoNEXT | 19 – 21% | CoNEXT is similar to Eurosys. It has a bit more European flavor to it, since it is hosted in Europe in alternate years. It is also a new conference compared to SIGCOMM and NSDI. |
Other systems-flavored conferences
- For file systems, and storage, FAST is considered a top conference.
- For programming languages, PLDI, POPL are considered top conferences.
- For the combination of OS + architecture or OS + programming languages, ASPLOS is considered the top conference. It usually has a multi-disciplinary flavor to it.
Now, the fun part.
- Is INFOCOM a good networking conference for systems work? This is a tricky question, since INFOCOM traditionally has accepted more theory/algorithms/analysis/simulation oriented papers, with a bent toward wireless systems. INFOCOM has low acceptance rates in the range of 16-20%, but due to the large number of tracks, the quality is sometimes patchy. INFOCOM does accept systems papers.
- Is NSDI better than Eurosys for networking + systems work? Another interesting question to ponder during lunch breaks. I think, if a project has more networking flavor to it, one should consider submitting it to NSDI. On the other hand, if the project is purely systems work, then Eurosys is a better venue.
- What is the criterion for deciding the rankings of conferences? Should we use acceptance rate, citation count, industry influence, or a combination of them?
Some useful links for conference rankings.
- CiteSeer impact factor – Estimated impact of publication venues in Computer Science.
- CiteSeerX impact factor – Estimated Venue Impact Factors
- Conference Stats – acceptance rates, tracks, attendees etc.
- Computer Science Conference Rankings – Osmar R. Zaïane’s list
Virtualization Research Projects that Spawned Real Products
Jul 7th
Virtualization is a hot topic for research, and with the rise of cloud computing, it has gained even more attention. A great number of research papers are being published in many major conferences. The following is an attempt to list some of the research work, and how it influenced real products or features. Some of the products were major open source projects before becoming purely commercial products. These are roughly in reverse-chronological order.
Note: I will be updating this over time. Let me know if you know of virtualization research papers that inspired or spawned real products.
| Research Topic | Research Paper | Commercial Product/Feature (s) |
|---|---|---|
| Storage Resource Management | PARDA: Proportional Allocation of Resources for Distributed Storage Access. Ajay Gulati, Irfan Ahmad, Carl A. Waldspurger. Published in the USENIX Conference on File and Storage Technologies (FAST ’09) | vSphere 4.1 Video of SIOC |
| End-host Network Virtualization | Crossbow Virtual Wire: Network in a Box. Sunay Tripathi, Nicolas Droux, Kais Belgaied, and Shrikrishna Khare. Large Installation System Administration Conference, 2009. | Crossbow in Solaris |
| Network Virtualization | OpenFlow: Enabling Innovation in Campus Networks. Nick McKeown, Tom Anderson, Hari Balakrishnan, Guru Parulkar, Larry Peterson, Jennifer Rexford, Scott Shenker, and Jonathan Turner. CCR 2008 | OpenFlow |
| GPU virtualization | GPU Virtualization on VMware’s Hosted I/O Architecture. Micah Dowty, Jeremy Sugerman. Published in the USENIX Workshop on I/O Virtualization 2008 | VMware desktop virtualization |
| NetChannel2 | Netchannel 2: Optimizing Network Performance. J. Renato Santos, G. (John) Janakiraman, Yoshio Turner, Ian Pratt. Presented at Xen Summit, Apr 2007 | NetChannel in Xen |
| IaaS platforms | Eucalyptus research project started by Rich Wolski | Eucalyptus |
| Performance profiling | Enforcing Performance Isolation Across Virtual Machines in Xen – Diwaker Gupta, Ludmila Cherkasova, Rob Gardner, and Amin Vahdat. Proceedings of the 7th ACM/IFIP/USENIX Middleware Conference. Melbourne, Australia, Nov 2006 | XenMon |
| Live migration | Live Migration of Virtual Machines – Christopher Clark, Keir Fraser, Steven Hand, Jacob Gorm Hansen, Eric Jul, Christian Limpach, Ian Pratt, Andrew Warfield. Published at NSDI 2005 | Xen Live Migration |
| Para-virtualization | Xen and the Art of Virtualization – Paul Barham, Boris Dragovic, Keir Fraser, Steven Hand, Tim Harris, Alex Ho, Rolf Neugebauer, Ian Pratt, Andrew Warfield. Published at SOSP 2003 | Xen 1.x |
| Memory ballooning | Memory Resource Management in VMware ESX Server – Carl Waldspurger. Published at OSDI ’02 | ESX and Xen memory ballooning |
| Virtual Machine Monitors | Disco: running commodity operating systems on scalable multiprocessors. Edouard Bugnion, Scott Devine, Kinshuk Govil, Mendel Rosenblum. SOSP 1997. | VMware Workstation |
HP research grant awarded!
Aug 14th
We received HP’s open innovation research grant (co-authored the proposal) for next year. Yay!
How to Choose a CS Research Topic?
Jun 13th
This post is motivated by questions from some of the new grad students looking for research topics. Here are a few things that can help you in this quest.
Follow your passion
Often, when you take an undergrad course, you wonder about a few things. Why is this system designed like this? Why can’t I use that sorting algorithm? etc. If you are more curious, you probably use Google to find more information, and you develop an interest. As you learn more, you realize that among the sea of questions, there are a few that you really want to go after. Go for it!
Learn state-of-the-art
State of the art is probably the most misused phrase in marketing, but in research it has a very concrete meaning. When someone says that they understand the state of the art, that means they know everything that’s been done so far, and the latest technology that probably is the best.
A good way to learn is to start reading the latest publications from reputed conferences. For CS, journals are just treated as archival data, so it is not that important to read journals. Finding reputed conferences in a field is pretty easy. Good conferences can be found from Citeseer impact ratings, rankings maintained here and here.
The next step is to start reading publications. Reading papers is an art in itself and will require another long post. First, try to find papers that are of interest to you. One of the problems faced by newbies is that they don’t understand the paper fully, because some papers point to prior work that they are not aware of. You have to recursively read more papers pointing to prior/related work, and understand how people solved the problem. It’s a difficult and sometimes boring task, but you will get better at it.
In the end, you should be able to summarize the state of the art by listing everything from the first work that tried to solve the problem to the latest. Often, there are solutions that have trade-offs, and there may not be a single best solution. Your job is to figure out why something works and why something does not work.
You may even find that there is no solution, and there you have your research topic.
Broad and narrow enough
The topic you choose should be broad enough to pursue for 2 or 3 years. It should still be narrow enough for you to solve it completely. For example, building the next generation search is a topic for multiple PhDs, but developing better clustering algorithms might be a good topic.
Follow your advisor
Advisors vary in how they assign topics to students. Some advisors have a certain research topic in mind, and they want you to pursue that specific path. Some (like my advisor Kang Shin) give you complete freedom to choose a topic. There are benefits and downsides to both approaches.
If your advisor gives you a specific research topic, and if you like it, you are all set. You can start reading papers related to that topic, and start formulating a research problem. Your advisor is probably aware of the state of the art and might also have thought of possible solutions. The downside is that you are stuck with it, and you may feel like you are forced in a certain direction.
Free-form topic choosing gives you great flexibility. Its benefits are obvious, as you can choose whatever you like within a certain area (say software systems). Some people (like me) often have strong opinions on solving certain problems, and want to pursue them. However, you are bound to make some mistakes in the process. It is not always easy to judge state of the art, and it is not easy to find a concrete research problem in a big area. I originally wanted to work in grid scheduling, but I switched after reading a survey paper with 640 references. This surely will waste some of your time, but the process is highly rewarding.
Work with a fellow graduate student
If you are part of a research group (usually your advisor’s), then you will see senior graduate students who have already chosen a topic and/or are close to graduation. They have done the hard work, and understand the pitfalls in research. Often, they can give you a certain problem, which can be a PhD topic in itself. It is also a good way to learn more about an area that you are not familiar with. There might even be a continuation project to what your fellow student is doing. The downside is that the topic might be too narrow and may not be big enough for a PhD. Nevertheless, the experience is worth it.
Attend reading groups
Most universities have research students gathering in a group to discuss various research topics. The software systems reading group at UMich is a great place to mingle with other systems students. Usually, students, and sometimes guest speakers or professors, present a paper and the group has a free-flowing discussion about the work. You can even present a topic and gather opinions from your peers. You will also learn how to support your arguments, and how to criticize rationally. Also, free food is great.
If your university does not have one, you can start one. It doesn’t take much effort to gather a few students with similar interests and start reading papers. Every one has to go through the latest publications, why not do it together?
Don’t get too attached
After you decide on a topic and start spending a few months learning the state of the art, you may find out that someone already solved it or it is too narrow. Don’t get too attached to the topic. It is OK to choose a new topic, and many PhD students change their core focus.
If all else fails, you can try the systems topic generator, the crypto topic generator (http://www-cse.ucsd.edu/users/mihir/crypto-topic-generator.html), or Douglas Comer’s classic topic generator.
Our paper in CDC (Top tier control theory conference)
Jul 30th
Today’s news is that one of our papers got accepted at the IEEE Conference on Decision and Control, one of the oldest and most prestigious control theory conferences out there. The paper that got accepted is about our MIMO (multiple input and multiple output) control, and the math in this paper forms the basis for my work this summer. We have extended it quite a bit, and the next paper is going to be a computer systems paper with the theory used in a real system, and hopefully will be really exciting!
Stay tuned!
P.S. I am landing in Ann Arbor on Aug 4th morning. See you AAites soon.
More press to our work
Jun 4th
New: Our tech report received 10,000 hits in just the month of May. See a screenshot of stats below. Yay!
Our recent tech report on performance evaluation of Xen vs. OpenVZ is generating quite a buzz. It’s the second most viewed tech report from outside HP. Virtual Strategy Magazine picked up on it and wrote an article. A few people, including Simon Crosby, CTO of XenSource, responded. Slashdot picked it up yesterday. I am pleasantly surprised that there is not much FUD generated by the Slashdot readers in the comments. Some of the comments are quite informative. Virtualization.info and OpenVZ blogs have written about it too.
Let me point out a few facts straight away. Both technologies have their place. Xen trades off performance for security and isolation and if you want to run multiple operating systems Xen or VMware is the only way. OS level container technologies like OpenVZ beat these hypervisor-based technologies easily on performance, but they may not be appropriate for some situations.
Is Theory “better” than Systems ?
Apr 14th
My friend Pranav wrote a blog post on a topic that we talked about long ago: Is Theory better than Systems? As he mentioned, we both were passionate and somewhat arrogant in arguing the merits of one over the other.
I have matured a lot in the last few years, and I can claim to have done some real “Systems research” finally. Interesting post, Pranav, but it misses some important points, maybe because you haven’t experienced them yet. Here it goes.
Pranav says,
Definition: Discipline A is superior to discipline B if it improves the lives of a greater number of people.
Good definition. Though vague, it’s enough for our arguments. Let’s take your first point: CS Theory is much closer to Math, so it’s superior. There’s no doubt that it’s closer to Math, but how does that make it superior? It sure is easier to code than to prove a theorem (maybe), but systems research is not about coding or implementation.
Next, talking about first principles, you are absolutely right. If we were to build a system from first principles, it would certainly be more general and widely applicable. But where are those first principles? In my research, I (and my mentor Xiaoyun Zhu, a control theorist) argue that there are no first-principle based models in computer systems. If we had these models, we would simply throw them into an intelligent computer and it would spit out the system specifications for us. Coming back to superiority, would you rather have years and years of useless results that never get into practice, or a system that works and may help in actually building those first principles? If we are talking about improving the lives of people, a good working system actually brightens the lives of a lot more people, including theory researchers. Let’s take a concrete example. Sure, NP =? P is a problem of extraordinary importance. Solving it would change the face of computer science. It surely is a far more difficult problem than designing UNIX; in fact there’s no comparison. BUT, how did the theory researchers in the past 20 years help the lives of people? Listing a few results like the RSA algorithm doesn’t cut it. There are equally significant results in Systems (UNIX, C, Ethernet, TCP/IP etc.).
In fact, systems research affects a lot more people than theory research. If we solve the problem of NP =? P, it surely will affect a lot more people, but in the process of proving that, theory researchers contribute very little to the lives of actual people. To give another example, it required systems people to tell all those theory people working for years on millions of scheduling algorithms that Round-robin works better than many others because of its simplicity. It would have been much simpler, if the first theorist who thought of a scheduling algorithm actually built a real system. We could have made better use of those “sharper” brains.
Pranav says,
Definition: Discipline A is superior to discipline B if it is harder to publish a result in leading conferences/journals in discipline A than it is to publish an equivalent result in discipline B.
I had to have a big laugh over this. There are a lot more systems conferences/journals than theory conferences/journals because of the nature of the field. It’s certainly easy to publish any random idea in some random conference, probably even in theory research. It is as difficult to publish in the “best” systems conferences as it is in the best theory conferences. In fact, since more people are working in systems, it’s a lot more difficult to publish in systems. It takes a lot more work to design, build and evaluate a system compared to writing up a random theory idea. Theorists don’t have to prove that it really works, and there will be very few people who actually know the nitty-gritty details of the proof someone worked on. On the other hand, if you write a systems paper, it will be torn to pieces by many researchers who have actually built similar systems. The devil is in the details.
Let me digress a little bit. Long ago, Pranav talked about how easy it is for anyone working in computer science to read a systems paper, whereas a theory paper would not even be understandable to many. There lies the superiority of systems, I would say. It sure is easy for anyone to look at the new system, use the system, even experiment with the system. This actually improves the system a lot, considering that every computer scientist can understand what it’s doing. BUT, it really takes brilliant systems researchers to design and build it. Many computer science people (including a few systems researchers) don’t actually understand the real details of building a system that actually works. These are NOT implementation details, far from it.
To conclude, there seems to be some confusion over the superiority of results obtained in Systems and Theory. There’s absolutely no doubt that the results that Theory brings once in a while are extraordinary and very significant. BUT, they are few and come out rarely. As Pranav mentioned, I have utmost respect for anyone who really believes that they can solve these fundamental problems. While theorists ponder over these grand problems, there are equally tough problems in systems. There is no corresponding theory for them, because there are no first principles. As a result, significant systems results also come at a very slow rate, but systems research contributes a lot more to the good of computer science and actually helps the lives of common people.
Philosophically speaking, both theorists and systems researchers are working to make computer science better, but along the way systems researchers contribute a lot more to smaller problems than theory researchers do.