Virtually the best blog on the web!
Posts tagged Kernel
Network Programming in the Kernel – my Linux Journal Article
Aug 29th
Ok, it’s official. Grab the latest copy of LJ to read our article about network programming in the kernel. Ravi and I explain in pretty good detail how to create a network connection and read and write data on sockets in kernel mode. The sample code shows a basic FTP client that connects to a given IP address and downloads a file. The code can be downloaded from the LJ FTP site.
There is, of course, always a debate on whether this should be done in the kernel, but I think there are situations where you may want to do it. I have explained a few reasons in the kernelnewbies thread.
I think the article will be freely available online in the LJ archives after a few months. Until then, if you are not subscribed to LJ, you can devour the code at the LJ download site.
Kernel Stack Fault
Aug 17th
Every day, you learn a new way to crash your computer
I ran into the error shown in the left image while meddling with kmalloc. Nice to see VMware catch the stack fault and show a helpful message. BTW, I have gone through over 200 Oopses since I started hacking on logfs. Kernel hacking would be totally frustrating without the amazing VMware.

Using Gotos – Advantages and Disadvantages
Aug 9th
Recently, there was a post on The Daily WTF about gotos, and some people seem to think that gotos should be avoided like the plague. Someone asked about gotos in kernel code on the kernelnewbies mailing list, and I guess it’s time to clear up the usage.
Why goto is bad
Well, this has been discussed at length by the great Edsger W. Dijkstra in his classic article Go To Statement Considered Harmful. Simply put, when you can use a different construct (like if, switch-case, etc.), avoid using goto. It’s too easy to write spaghetti code with gotos (especially for beginners).
Some computer programming books go to the extent of banning the goto, which I think is wrong. After all, constructs like if and while turn into jumps (that is, gotos) when they get translated to machine code.
When it is OK to goto
Now, the question is: when is it OK to use goto, or even when is it right to do so? There are many views about this. My opinion is to use it when:
- Code readability improves. For example, breaking out of a deeply nested loop.
- Performance is more important. For example, gotos can be used as poor man’s exception handlers.
The Linux kernel uses gotos heavily, and a simple example suffices to show how.
do A
if (error)
    goto out_a;
do B
if (error)
    goto out_b;
do C
if (error)
    goto out_c;
goto out;
out_c:
    undo C
out_b:
    undo B
out_a:
    undo A
out:
    return ret;
This example, along with nice explanations about goto usage from the kernel gurus, can be found on kerneltrap. Finally, I like the following quote mentioned in the WTF post:
The apprentice uses it without thinking. The journeyman avoids it without thinking. The master uses it thoughtfully.
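The cleanup pattern from the pseudocode above can be tried out in user space. The following is only a minimal sketch, not kernel code: struct ctx, alloc_or_fail, ctx_init, ctx_destroy and the fail_at parameter are all hypothetical names invented for illustration. It also assumes one common labelling convention, in which each label undoes only the steps that actually completed, so a failure in step A jumps straight to out.

```c
#include <stdlib.h>

/* Hypothetical context holding three resources (steps A, B and C). */
struct ctx {
    char *a;
    char *b;
    char *c;
};

/* Stand-in for a step that can fail: returns NULL when told to fail. */
static char *alloc_or_fail(int fail)
{
    return fail ? NULL : malloc(16);
}

/*
 * fail_at picks the step that fails (1 = A, 2 = B, 3 = C, 0 = none).
 * Returns 0 on success with all three resources held, or -1 on failure
 * with everything acquired so far released again.
 */
int ctx_init(struct ctx *ctx, int fail_at)
{
    int ret = -1;

    ctx->a = ctx->b = ctx->c = NULL;

    ctx->a = alloc_or_fail(fail_at == 1);   /* do A */
    if (!ctx->a)
        goto out;

    ctx->b = alloc_or_fail(fail_at == 2);   /* do B */
    if (!ctx->b)
        goto out_a;

    ctx->c = alloc_or_fail(fail_at == 3);   /* do C */
    if (!ctx->c)
        goto out_b;

    ret = 0;
    goto out;                               /* success: keep A, B and C */

out_b:
    free(ctx->b);                           /* undo B */
    ctx->b = NULL;
out_a:
    free(ctx->a);                           /* undo A */
    ctx->a = NULL;
out:
    return ret;
}

/* Release everything in reverse order of acquisition. */
void ctx_destroy(struct ctx *ctx)
{
    free(ctx->c);
    free(ctx->b);
    free(ctx->a);
    ctx->a = ctx->b = ctx->c = NULL;
}
```

Calling ctx_init(&ctx, 0) leaves all three pointers set, while ctx_init(&ctx, 3) returns -1 with a and b already released. The unwinding code exists exactly once, which is the readability argument for goto here.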
Kernel Lockups
Jul 20th
I was experiencing kernel lockups when I created a lot of files in LFS. It was quite frustrating, because even the magic SysRq was not working. I decided to use kgdb and started compiling 2.6.7 (the latest version with which kgdb works). I also enabled various kernel debugging facilities, including slab cache debugging.
I booted the new kernel, mounted my LFS partition and started creating a few files. Immediately, I got a slew of “Slab corruption” errors:
Slab corruption: start=c3670c54, len=432
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<00000000>](0x0)
120: 6b 6b 6b 6b 00 00 00 00 00 00 00 00 00 00 00 00
130: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
150: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Prev obj: start=c3670a98, len=432
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<00000000>](0x0)
000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Next obj: start=c3670e10, len=432
Redzone: 0x170fc2a5/0x170fc2a5.
Last user: [](alloc_inode+0x14f/0x190)
000: 00 00 00 00 00 89 ee cf 64 95 fc cf f4 7a c1 cf
010: f0 59 22 ce f0 59 22 ce 02 00 00 00 01 00 00 00
After a fair amount of searching and reading the source, I understood that these errors happen when something writes to slab memory that is still free, that is, memory nobody has allocated. It was unclear where exactly the memory corruption was happening. After a few hours of looking through each allocation, I finally found the problem: I had forgotten to allocate memory for a few data structures that I was accessing.
The code that checks the memory accesses is in mm/slab.c. When a slab is created, its objects are filled with the POISON_FREE byte (0x6b). This is done in cache_init_objs. At object allocation, this byte pattern is checked, and if the object doesn’t contain POISON_FREE, someone must have written over this unallocated memory, which causes the slab corruption errors. The checking is done in check_poison_obj. CONFIG_DEBUG_SLAB also enables red zones that allow you to check for buffer overruns. These debugging features certainly slow down the system and shouldn’t be used in a production system.
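The poisoning idea is easy to reproduce outside the kernel. What follows is a simplified user-space sketch of the technique, not the code from mm/slab.c: sketch_poison, sketch_check_poison and OBJ_SIZE are made-up names, and only the 0x6b value is taken from the description above. The real checker does more than this.

```c
#include <stddef.h>
#include <string.h>

#define POISON_FREE 0x6b    /* byte value mentioned in the post */
#define OBJ_SIZE    32      /* arbitrary object size for this sketch */

/* Fill a free object with the poison byte, as a debug allocator would. */
void sketch_poison(unsigned char *obj)
{
    memset(obj, POISON_FREE, OBJ_SIZE);
}

/*
 * Return the offset of the first byte that no longer holds the poison
 * value, or -1 if the object is intact. A non-negative result means
 * something wrote to the object while it was supposed to be free.
 */
int sketch_check_poison(const unsigned char *obj)
{
    size_t i;

    for (i = 0; i < OBJ_SIZE; i++) {
        if (obj[i] != POISON_FREE)
            return (int)i;
    }
    return -1;
}
```

A stray write such as obj[4] = 0 on a poisoned object makes sketch_check_poison report offset 4. That is the same kind of evidence as the hex dump above, where a run of 00 bytes interrupts the 6b pattern.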
Interestingly, I didn’t get these errors on a vanilla FC3 kernel (while creating a few files), because that kernel is built with CONFIG_DEBUG_SLAB disabled, so the slab cache silently ignores the stray writes.
VFS: Busy inodes after unmount. Self-destruct in 5 seconds. Have a nice day …
Jul 18th
Today, I ran into the above amusing error while working on LFS. The error occurs when the usage counter inode->i_count is not zero while the file system is being unmounted. This usually means a programmer error where the usage counter was incremented, but was not decremented after accessing the inode data structure.
In my case, I was creating an inode for the inode map during mount (in my LFS, the inode map itself is a file). The inode is created by calling iget, which increments the usage counter and expects the caller to decrement it. However, my problem is that I cannot simply release the imap inode before every other inode is cleaned up. I initially thought of modifying the delete_inode super operation (sb->s_op->delete_inode) that gets called by generic_delete_inode. After following the generic_shutdown_super code, I realized that invalidate_inodes is called twice, and that the put_super of the corresponding file system is called before the second call. So, I finally added an iput in my lfs_put_super and everything was fine.
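The get/put imbalance behind this error can be modelled in a few lines of user-space C. This is only a toy model under stated assumptions: struct sketch_inode, sketch_iget, sketch_iput and sketch_busy_inodes are hypothetical stand-ins, not the kernel's iget, iput or invalidate_inodes, and they skip everything the real functions do besides counting.

```c
#include <stddef.h>

/* Toy inode that carries nothing but a usage counter. */
struct sketch_inode {
    int i_count;
};

/* Take a reference; the caller now owes a matching sketch_iput(). */
struct sketch_inode *sketch_iget(struct sketch_inode *inode)
{
    inode->i_count++;
    return inode;
}

/* Drop a reference taken earlier with sketch_iget(). */
void sketch_iput(struct sketch_inode *inode)
{
    inode->i_count--;
}

/*
 * Count the inodes that are still referenced. A non-zero result at
 * "unmount" time corresponds to the busy-inodes message.
 */
int sketch_busy_inodes(const struct sketch_inode *inodes, size_t n)
{
    size_t i;
    int busy = 0;

    for (i = 0; i < n; i++) {
        if (inodes[i].i_count != 0)
            busy++;
    }
    return busy;
}
```

If the "mount" step calls sketch_iget on the inode map and nothing ever calls sketch_iput, sketch_busy_inodes returns 1 at unmount. Dropping the reference in the equivalent of put_super brings the count back to 0.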
Usage counters are an important part of kernel code. A usage counter represents the number of references to the data structure in question. Usage counters do not replace locking. The following paragraph from the Linux Coding Style document
Note that locking is _not_ a replacement for reference counting. Locking is used to keep data structures coherent, while reference counting is a memory management technique. Usually both are needed, and they are not to be confused with each other.
succinctly explains it, and
Remember: if another thread can find your data structure, and you don’t have a reference count on it, you almost certainly have a bug.