Virtually the best blog on the web!
Hacking
y!Vmail – second prize at Y! HackU event
Mar 24th
Arnab came up with a cool idea called y!Vmail, voice mail for your Yahoo! Mail, and we presented it at the Y! HackU event and won the second prize! Arnab has a detailed blog post with more pictures, demo video, and an entertaining description of how we hacked.

Y!Vmail team with judges Paul Tarjan and Rasmus Lerdorf
Bjarne Stroustrup’s interview about C++
Dec 5th
MIT Tech Review has this article about Stroustrup’s views on the current state of C++, programming, etc. Stroustrup, the inventor of C++, seemed to answer quite candidly.
One particular comment is interesting.
That said, C++ has indeed become too “expert friendly” at a time where the degree of effective formal education of the average software developer has declined. However, the solution is not to dumb down the programming languages but to use a variety of programming languages and educate more experts. There has to be languages for those experts to use–and C++ is one of those languages.
It’s true that C++ is expert friendly and is sometimes intimidating to a newbie programmer. It’s been a while since I did “real” object-oriented programming in C++, but whenever I tried to use it, I felt like it had too much power. This quote from Stroustrup himself puts it succinctly.
It’s easy to shoot yourself in the foot with C. In C++ it’s harder to shoot yourself in the foot, but when you do, you blow off your whole leg.
There are endless articles about the power of C++ and how that power makes it difficult to use. I think for large projects that are not performance-critical, a language like Java might be more beneficial.
On a side note, my favourite language these days is Python. Obviously, it’s slower than C++, perhaps ten times or so, but it can get the work done with 100 lines written in 10 minutes.
Learning a New Language: Python
Jun 15th
I have used Perl for my scripting needs for a long time. The first time I used it, I was totally blown away by its power and flexibility: the regular expressions, dynamic typing, endless CPAN modules, and of course the geeky (read phreaky) syntax. Later, I got enamoured with it, learnt more advanced features, and wrote a few articles for the Linux Gazette as well.
Then I wrote a web statistics package in Perl. It started with building an innocent counter for my web site and developed into a full-blown web stats package. I also wrote a somewhat complicated set of scripts to build my web site. Though I liked the “More than one way to do it” paradigm, the syntax sometimes irked me. Some of the features (like OOP and those funky references) always looked like an afterthought rather than good design.
I still use Perl for all my scripting needs, but it’s always a pain to write a big piece of software in Perl and maintain it peacefully.
Here at HP, in my group, most of the scripts (for setting up testbeds, running benchmarks) are written in Python. I thought this was a good time to learn a new language. It took me a day to read through the excellent Learning Python book, and I am off to some coding. It’s a refreshing change, especially for someone coming from a Perl background. The best way to sum it up is that it’s as powerful as Perl without all the weird baggage.
I have only scratched the surface, but I am very impressed with its syntax and semantics. Most of the features are very natural, and the consistent semantics for data structures are cool. It has all the usual scripting baggage: dynamic typing, quick coding, extensive libraries, regular expressions, etc.
Let’s see a cool, somewhat advanced feature called mapping. This feature makes Python a little closer to functional programming languages. Often, when you have large data structures, one thing you want to do is to iterate over all the entries and do something with each of them. The usual way is to use a for loop to go over all the entries. For example, to increment each number in a list by 2:
mylist = list(range(1, 10))
# Loop over the indexes, not the values; mylist[9] would be out of range
for i in range(len(mylist)):
    mylist[i] = mylist[i] + 2
Now, this is all good, but what if you want a different operation, or a complex operation? Naturally, you would think of a function that can be applied to all the entries. There you go!
def add(i):
    return i + 2
mylist = list(range(1, 10))
# list() and print() keep this working on both Python 2 and Python 3
print(list(map(add, mylist)))
How easy is that! For those who don’t know Python syntax, def creates a function, and the built-in map function applies a function to every entry in a list or a tuple. For more fun, read about list comprehensions and a great use of them to create SQL one-liners.
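Since list comprehensions came up, here is a minimal sketch of how the same increment looks with one, next to the map version. It assumes Python 3 (hence the list() around map); the variable names are only for illustration.

```python
def add(i):
    return i + 2

mylist = list(range(1, 10))

# map applies add() to every entry; list() materialises the result on Python 3.
mapped = list(map(add, mylist))

# The equivalent list comprehension, with the operation written inline.
comprehended = [i + 2 for i in mylist]

# A comprehension can also filter: keep only the even results.
evens = [i + 2 for i in mylist if (i + 2) % 2 == 0]

print(mapped)
print(comprehended)
print(evens)
```

The comprehension saves you from defining a separate function when the operation is a one-liner, while map still reads well when the function already exists.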
Network Programming in the Kernel – my Linux Journal Article
Aug 29th
OK, it’s official. Grab the latest copy of LJ to read our article about network programming in the kernel. Ravi and I explained in pretty good detail how to create a network connection and read/write data from sockets in kernel mode. The sample code shows a basic FTP client that connects to a given IP address and downloads a file. The code can be downloaded from the LJ FTP site.
There is, of course, always a debate on whether this should be done in the kernel, but I think there are situations where you may want to do this. I have explained a few reasons in the kernelnewbies thread.
I think the article will be freely available online in the LJ archives after a few months. Until then, you can devour the code at the LJ download site if you are not subscribed to LJ.
Managing Patches with Quilt
Aug 20th
Quilt is a handy script to manage a series of patches. This is especially useful if you are maintaining a large body of patches that need to be applied in a particular order. Quilt can also be used to create a series of patches that can easily be included in a software project. Linux kernel people recommend Quilt for creating patches, and they want your changes to be in a series of easily manageable patches. LFS in its current state stands at more than 4000 lines of code, and people will be annoyed if I submit LFS as a single blob patch.
However, there are no clear instructions on how to use Quilt for doing this. Well, the Quilt documentation is well written, but it’s written from the perspective of a patch applier/manager rather than a patch generator. Here’s a simple tutorial for generating patches using Quilt.
The idea is to get a series of manageable patches out of your changes to a big code base. Let’s say you want to add something to the Linux kernel.
- Get the original code and put it in a directory, say linux/
- First decide what patches should contain what changes. You should also decide the ordering, if that matters. Say you divide your changes into two patches
    fs-makefile.diff # fs/Makefile, fs/Kconfig changes
    cfiles.diff # fs/lfs/*.c (new files)
- Cd to the codebase directory, and create the first patch file using Quilt
    quilt new fs-makefile.diff
- Add the files you are going to change
    quilt add fs/Makefile fs/Kconfig
  Note that this doesn’t really add the file to the patch. It only tells the Quilt system to manage it. Any changes you make to those files from now on will appear in the patches. Now, add/copy your modifications to these files.
- Do a quilt refresh to actually create the patch. The patch is created in the patches/ directory.
- If the file you are adding is a new file, you would do the same thing. Since the file doesn’t exist in the tree yet, you have to explicitly name all the files. This is a little painful if you have a lot of files to add. You would do something like
    quilt new hfiles.diff
    quilt add fs/lfs/ifile.h fs/lfs/lfs.c ...
- After adding all the new files to the Quilt system, copy all your changes (new files) to the appropriate directories. Do a quilt refresh, and all your patches are created in the patches/ directory.
- From now on, any changes to those files are tracked by Quilt. All you have to do is a simple
    quilt refresh
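To keep the order of the steps straight, here is a minimal Python sketch of the new/add/refresh sequence. It only builds and prints the command strings and does not run Quilt; quilt_commands is a helper name I made up for illustration, and the patch names and files are the ones from the example.

```python
def quilt_commands(patch_name, files):
    """Build the quilt command lines for one patch (hypothetical helper).

    Nothing is executed; the strings are only printed for reference.
    """
    return [
        "quilt new " + patch_name,        # start a new patch at the top of the series
        "quilt add " + " ".join(files),   # register the files before editing or creating them
        "quilt refresh",                  # after making your changes, write the patch
    ]

# Patch names and files taken from the example above.
plan = [
    ("fs-makefile.diff", ["fs/Makefile", "fs/Kconfig"]),
    ("hfiles.diff", ["fs/lfs/ifile.h", "fs/lfs/lfs.c"]),
]

for patch_name, files in plan:
    for command in quilt_commands(patch_name, files):
        print(command)
```

The point to remember is that quilt add comes before you touch the files, even when they are brand new, and that your edits go in between the add and the refresh.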
There are a ton of useful options to manipulate the order and contents of a series of patches. Read the PDF documentation for more details.
Quilt can also be used to post the patches to a mailing list. The README.mail file in the Quilt documentation directory (doc/) has the details.
Using Gotos – Advantages and Disadvantages
Aug 9th
Recently, there was a post on The Daily WTF about gotos, and some people seem to think that gotos should be avoided like the plague. Someone asked about gotos in kernel code on the kernel newbies mailing list, and I guess it’s time to clear up the usage.
Why goto is bad
Well, this has been discussed at length by the great Edsger W. Dijkstra in his classic article Go To Statement Considered Harmful. Simply put, when you can use a different construct (like if, switch-case, etc.), avoid using goto. It’s too easy to write spaghetti code with gotos (especially for beginners).
Some computer programming books go to the extent of banning the goto, which I think is wrong. For that matter, constructs like if, while etc. actually have a goto (jump) when they get translated to the machine code.
When it is OK to goto
Now, the question is: when is it OK to use goto, or even when is it right to do so? There are many views about this; my opinion is to use it when
- Code readability improves: for example, breaking from a deeply nested loop.
- Performance is more important: for example, gotos can be used as a poor man’s exception handlers.
The Linux kernel uses gotos heavily, and a simple example suffices to show the usage.
do A
if (error)
    goto out_a;
do B
if (error)
    goto out_b;
do C
if (error)
    goto out_c;
goto out;
out_c:
    undo C
out_b:
    undo B
out_a:
    undo A
out:
    return ret;
This example, along with nice explanations about goto usage from the kernel gurus, can be found on kerneltrap. Finally, I like the following quote mentioned in the WTF post:
The apprentice uses it without thinking. The journeyman avoids it without thinking. The master uses it thoughtfully.
Securing My Linux Box
Jul 28th
I recently experienced a lot of port scans on my Linux box, and decided to tighten the firewall. So, I started reading the available documentation, but somehow I was not able to achieve exactly what I wanted. I guess I am IPTables challenged.
I asked my friend Ravi, an IPTables expert, to help me. I basically wanted this.
- Allow all connections within the umich.edu and eecs.umich.edu domains, and my private subnet 10.10.0.0
- Allow ssh connections from a given IP address and block everything else.
Simple, isn’t it? He set it up for me in half an hour, and here are the rules, with brief explanations.
- Accept everything from the umich.edu and eecs.umich.edu domains and the 10.10.0.0 subnet. I actually wanted to do this with real names rather than IP addresses, but he told me that requires a patch.
    $IPT -A INPUT -s 141.211.0.0/16 -j ACCEPT
    $IPT -A INPUT -s 141.213.0.0/16 -j ACCEPT
    $IPT -A INPUT -d 10.10.0.0/16 -j ACCEPT
- The comments say it all
    # Accept TCP established and related connections
    $IPT -A INPUT -p tcp -m state --state ESTABLISHED,RELATED -j ACCEPT
    # For outgoing stuff
    $IPT -A OUTPUT -p tcp -m state --state ESTABLISHED,RELATED -j ACCEPT
- Do the same for UDP
    $IPT -A INPUT -p udp -m state --state ESTABLISHED,RELATED -j ACCEPT
- Allow SSH from a given IP
    $IPT -A INPUT -s <Home Machine> -p tcp --dport ssh -j ACCEPT
- Drop everything else
    $IPT -A INPUT -j DROP
- After this, the machine no longer seemed to accept connections from outside, but one weird thing happened. I was running a few daemons on the local machine, and they were not able to contact each other using sockets. Ha, the problem is that you have to explicitly add a rule for localhost
    $IPT -A INPUT -s localhost -j ACCEPT
  Note that this has to be added before the drop rule, because the rules in a chain are checked in order and the first ACCEPT or DROP that matches decides the packet’s fate.
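To see why the position of the localhost rule matters, here is a toy Python model of first-match rule evaluation. It is only a sketch of the ordering logic, not how iptables is implemented; the first_match function and the rule tuples are made up for illustration, and localhost is assumed to resolve to 127.0.0.1.

```python
import ipaddress

def first_match(rules, src):
    """Return the action of the first rule whose source network contains src."""
    addr = ipaddress.ip_address(src)
    for network, action in rules:
        if addr in ipaddress.ip_network(network):
            return action
    return "POLICY"  # nothing matched; the chain's default policy would apply

# Simplified stand-ins for three of the rules above (source address only).
UMICH = ("141.211.0.0/16", "ACCEPT")
LOCALHOST = ("127.0.0.1/32", "ACCEPT")   # assumes localhost is 127.0.0.1
DROP_ALL = ("0.0.0.0/0", "DROP")         # matches every source address

wrong_order = [UMICH, DROP_ALL, LOCALHOST]   # localhost rule added after the drop
right_order = [UMICH, LOCALHOST, DROP_ALL]   # localhost rule added before the drop

print(first_match(wrong_order, "127.0.0.1"))  # DROP
print(first_match(right_order, "127.0.0.1"))  # ACCEPT
```

With the drop rule first, the catch-all match swallows the local traffic before the localhost rule is ever looked at, which is exactly what happened to my daemons.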
Hope this small primer helps somebody like me to secure their machine.