Virtually the best blog on the web!
Google Summer of Code 2005
Google SoC List of Projects
Oct 26th
SoC End, Beginning of Another Cave Life :-)
Sep 1st
It was a great experience taking part in the Google Summer of Code program. I have learnt a lot, and I get to add the title "kernel hacker" to my name.
For those who don’t follow my blog regularly, I worked on a log-structured file system for Linux that supports snapshots. All the details, documentation, code etc. are available on the sourceforge website.
Now, my prelim exams are coming up in the next few weeks, and I have a ton of research stuff to do. I am supposed to present a research idea and support it with some preliminary experimental results. I am working on a fault-tolerant scheduler for large-scale systems, and will post a few details on my blog very soon. It’s time to move to an inner cave.
Cheers !
Snapshots: The Final Piece in the Puzzle
Aug 23rd
I have checked in a basic framework for maintaining snapshots in LFS, and this pretty much completes the SoC project as proposed (Woohoo !!!). I have sent a mail explaining the snapshot design to the logfs-devel list.
I was initially skeptical about the amount of work I could accomplish in two months, but everything pretty much went according to plan. I have also done some preliminary performance tests, which can be seen here. A mail explaining the results can be read here.
Cheers !
Kernel Stack Fault
Aug 17th
Every day, you learn a new way to crash your computer.
I ran into the error shown in the image on the left while meddling with kmalloc. Nice to see VMWare catch the stack fault and show a helpful message. BTW, I have gone through more than 200 Oopses since I started hacking on logfs. Kernel hacking would be totally frustrating without the amazing VMWare.

LFS Stable Version and Mailing List
Aug 11th
I have updated the stable version in CVS with the latest and greatest of LFS. This stable version is pretty much a complete log-structured file system minus the cleaner. I have a cleaner in the main trunk of the CVS, and if you are feeling lucky and adventurous, try it out. The stable version requires no patches to the kernel and should compile cleanly on 2.6.11. Follow the instructions on the LFS website for compiling and running LFS.
I have also created a mailing list for LFS development. If you are interested in following the development, please subscribe here.
Out of Input Data — System Halted
Jul 26th
I run my development kernel in VMWare, so whenever I make changes to the kernel, I have to copy the new bzImage to the host machine. Unfortunately, I have to reboot the host machine for grub to pick up the new kernel. I forgot to do this once: I just scp'd the bzImage to the host machine and rebooted the virtual machine. Ha, I was greeted with the above error message.
It’s pretty clear that the compressed kernel image is bad. Just for fun, I followed the source, and this message comes from gunzip, which is called by decompress_kernel. The error occurs on a data underrun, that is, when the decompressor runs out of input before the image is complete. gunzip eventually calls error, which just halts the machine with a while(1);. Another function that is called when all hell breaks loose is panic. These functions should be called only for fatal, unrecoverable errors.
A good explanation of kernel booting on x86 is available here.
LFS Stable Tag and Official Web Site
Jul 24th
LFS is relatively stable now and one can do various normal file system operations like mkdir, link, touch etc. I also made a web page for the official logfs website on sourceforge. The code is available through CVS and here are the quick instructions on how to get it.
cvs -d:pserver:anonymous@cvs.sourceforge.net:/cvsroot/logfs login
cvs -z3 -d:pserver:anonymous@cvs.sourceforge.net:/cvsroot/logfs co -r stable -P lfs
The code cleanly compiles on 2.6.7 and may not compile on other 2.6.X kernels. I am going to move the development to the latest kernel very soon. I wish they had kgdb patches for the latest kernel.
If you are interested in testing it, drop me a note.
Disclaimer: The file system is still experimental and may eat up your disk/memory and/or lock up your machine. I am not responsible for any damage you might incur. That said, it probably would only cause damage to the LFS partition.
Kernel Lockups
Jul 20th
I was experiencing kernel lockups when I created a lot of files in LFS. It was quite frustrating, because even the magic SysRq was not working. I decided to use kgdb and started compiling 2.6.7 (the latest version with which kgdb works). I also enabled various kernel debugging facilities, including slab cache debugging.
Booted with the new kernel, mounted my LFS partition and started creating a few files. Immediately, I got a slew of “Slab corruption” errors.
Slab corruption: start=c3670c54, len=432
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<00000000>](0x0)
120: 6b 6b 6b 6b 00 00 00 00 00 00 00 00 00 00 00 00
130: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
150: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Prev obj: start=c3670a98, len=432
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<00000000>](0x0)
000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Next obj: start=c3670e10, len=432
Redzone: 0x170fc2a5/0x170fc2a5.
Last user: [](alloc_inode+0x14f/0x190)
000: 00 00 00 00 00 89 ee cf 64 95 fc cf f4 7a c1 cf
010: f0 59 22 ce f0 59 22 ce 02 00 00 00 01 00 00 00
After a fair amount of searching and reading the source, I understood that these errors happen when someone writes to slab memory that has not been allocated. It was unclear where exactly the memory corruption was happening. After a few hours of looking through each allocation, I finally found the problem: I had forgotten to allocate memory for a few data structures that I was accessing.
The code that checks the memory accesses is in mm/slab.c. When a slab is created, its objects are filled with the POISON_FREE byte (0x6b). This is done in cache_init_objs. At object allocation, this byte pattern is checked, and if the object doesn’t contain POISON_FREE everywhere, someone must have written over this not-yet-allocated memory, which causes the slab corruption errors. The checking is done in check_poison_obj. CONFIG_DEBUG_SLAB also enables red zones that allow you to check for buffer overruns. These debugging features certainly slow down the system and shouldn’t be used in a production system.
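To make the poisoning idea concrete, here is a minimal user-space sketch in C. It is not the code from mm/slab.c: the function names poison_obj and check_poison are made up for illustration, and the real kernel checks do more than this. It only shows the basic trick of filling a free object with 0x6b and later scanning for the first byte that changed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define POISON_FREE 0x6b   /* poison byte for free objects, as described above */

/* Hypothetical helper: fill a free object with the poison byte. */
void poison_obj(unsigned char *obj, size_t len)
{
    assert(obj != NULL);
    memset(obj, POISON_FREE, len);
}

/*
 * Hypothetical helper: scan a supposedly free object.
 * Returns the offset of the first byte that is no longer POISON_FREE,
 * or -1 if the object is still fully poisoned.
 */
long check_poison(const unsigned char *obj, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (obj[i] != POISON_FREE)
            return (long)i;
    }
    return -1;
}
```

In the dump above, the bytes from offset 0x124 onward are 00 instead of 6b, which is exactly the kind of stray write such a scan would catch.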
Interestingly, I didn’t get these errors on a vanilla FC3 kernel (while creating a few files), because the vanilla kernel disables CONFIG_DEBUG_SLAB, and the slab cache then silently ignores the memory overwrites.
VFS: Busy inodes after unmount. Self-destruct in 5 seconds. Have a nice day …
Jul 18th
Today, I ran into the amusing error above while working on LFS. The error occurs when the usage counter inode->i_count of some inode is not zero while the file system is being unmounted. This usually means a programmer error: the usage counter was incremented but never decremented after the inode data structure was accessed. In my case, I was creating an inode for the inode map during mount (in my LFS, the inode map is itself a file). The inode is created by calling iget, which increments the usage counter and expects the caller to decrement it. My problem was that I could not simply release the imap inode before every other inode had been cleaned up. I initially thought of modifying the sb->delete_inode that gets called by generic_delete_inode. After following the generic_shutdown_super code, I realized that invalidate_inodes is called twice, and that the file system's put_super is called before the second call. So I finally added an iput in my lfs_put_super, and everything was fine.
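Here is a small user-space sketch of the counting problem, in C. Everything in it is a toy: the toy_ names are invented for illustration and only mimic the roles of iget, iput, and put_super; none of this is the real VFS code or my LFS code.

```c
#include <assert.h>
#include <stddef.h>

/* Toy inode with only the usage counter (modeled on inode->i_count). */
struct toy_inode {
    int i_count;
};

/* Toy superblock: an inode table plus the long-lived imap reference. */
struct toy_sb {
    struct toy_inode *inodes;
    size_t ninodes;
    struct toy_inode *imap;
};

/* Take a reference; the caller is expected to drop it later. */
struct toy_inode *toy_iget(struct toy_inode *inode)
{
    inode->i_count++;
    return inode;
}

/* Drop a reference taken with toy_iget. */
void toy_iput(struct toy_inode *inode)
{
    assert(inode->i_count > 0);
    inode->i_count--;
}

/* Number of inodes still referenced; nonzero at unmount means "busy inodes". */
size_t toy_busy_inodes(const struct toy_sb *sb)
{
    size_t busy = 0;
    for (size_t i = 0; i < sb->ninodes; i++) {
        if (sb->inodes[i].i_count != 0)
            busy++;
    }
    return busy;
}

/* Mount keeps a reference to the inode map for the lifetime of the mount. */
void toy_mount(struct toy_sb *sb)
{
    sb->imap = toy_iget(&sb->inodes[0]);
}

/* The fix: drop the imap reference from the put_super hook. */
void toy_put_super(struct toy_sb *sb)
{
    toy_iput(sb->imap);
    sb->imap = NULL;
}
```

Without the toy_iput in toy_put_super, toy_busy_inodes would still report one busy inode at unmount time, which is the situation the error message complains about.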
Usage counters are an important part of kernel code. A usage counter represents the number of references to the data structure in question. Usage counters do not replace locking, and locking does not replace usage counters. The following paragraph from the Linux Coding Style document explains it succinctly:
Note that locking is _not_ a replacement for reference counting. Locking is used to keep data structures coherent, while reference counting is a memory management technique. Usually both are needed, and they are not to be confused with each other.
The same document adds:
Remember: if another thread can find your data structure, and you don’t have a reference count on it, you almost certainly have a bug.
Bootstrapping LFS
Jun 29th
The tax-related stuff for the SoC project has been quite tough for a lot of people. You can tell from the incessant posting of mails on the Summer-Accepted list that people are really confused about this stuff. I am still not 100% sure which tax form I should send, but after closely reading the IRS forms, I have decided to send the W-8BEN form (I am an F1 student studying in the US). Hope to get it done tomorrow morning.
OK, on to more interesting stuff. I finally started coding today. Wrote a basic mklfs. It creates an LFS with a root directory and the necessary IFILE entries. After running mklfs, this is how the disk looks:
___________________________________________
|       |         |       |       |       |
| Super | summary | ifile | ifile | root  |
| block |         | dblock| inode | inode |
|_______|_________|_______|_______|_______|
More details about the structure of the disk can be found in the README file. The kernel module can mount this file system and print a few info messages.
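As a rough illustration of the layout above, here is a C sketch that maps each region to a byte offset. It rests on two assumptions that are not stated in this post: that every region takes exactly one block, and that a block is 4096 bytes. The toy_ names are invented for the example; the real on-disk format is the one described in the README.

```c
#include <assert.h>

#define TOY_BLOCK_SIZE 4096L   /* assumed block size, not taken from the README */

/* Regions in the order shown in the diagram. */
enum toy_region {
    TOY_SUPER,
    TOY_SUMMARY,
    TOY_IFILE_DBLOCK,
    TOY_IFILE_INODE,
    TOY_ROOT_INODE,
    TOY_NREGIONS
};

/* Byte offset of a region, assuming one block per region. */
long toy_region_offset(enum toy_region r)
{
    assert((int)r >= 0 && (int)r < TOY_NREGIONS);
    return (long)r * TOY_BLOCK_SIZE;
}

/* Human-readable name of a region, matching the diagram labels. */
const char *toy_region_name(enum toy_region r)
{
    static const char *const names[TOY_NREGIONS] = {
        "super block", "summary", "ifile dblock", "ifile inode", "root inode"
    };
    assert((int)r >= 0 && (int)r < TOY_NREGIONS);
    return names[r];
}
```

Under these assumptions the super block sits at offset 0 and the root inode block starts four blocks in.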