Posts tagged Linux
I am slowly picking up my old habit of wading through tons of junk on http://www.slashdot.org/ to glean small nuggets of wisdom. I stumbled upon the Net Channels article, and I think it’s one of the most interesting new ideas I have heard out of academia in a while. Googling points to more relevant web pages with more details: on LWN, and DaveM’s blog.
Van’s slides are the only information we have right now, and according to DaveM’s post on the netdev list, Van seems to have vanished without providing more details.
The basic idea is fairly easy to understand, but all the implications of this paradigm may not be evident at first glance. I will put down a few thoughts about it below:
Cache Behaviour: CPUs are now so fast that main memory cannot keep up with them, and memory accesses slow performance down. To improve performance, we want to keep as much data as possible in the cache while it is being accessed. However, this is not always possible, and some of the things that cause problems are:
- Context switches: Switching between processes, or between kernel and user mode, often causes poor data locality in the cache.
- Switching between processors: This is even more expensive, because the process or kernel tasklet is suddenly on another CPU and doesn’t have any relevant data in its cache.
- Huge data structures manipulated at different places by different tasks.
- Locking: I don’t completely understand how the cache is affected by locking.
So, Van’s idea is to process network packets through circular buffers that are accessed in a producer/consumer fashion. There’s some code written by DaveM starting the implementation of Net Channels, but otherwise very little work has been done to implement it in Linux.
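To make the producer/consumer idea concrete, here is a minimal sketch of a single-producer, single-consumer circular buffer in C11. It is not Van’s design or DaveM’s code; the names `struct channel`, `chan_put`, `chan_get` and the size `CHAN_SLOTS` are made up for illustration. The point it tries to show is that each index is written by only one side, so no lock is needed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define CHAN_SLOTS 8u	/* hypothetical size; must be a power of two */

_Static_assert((CHAN_SLOTS & (CHAN_SLOTS - 1)) == 0,
	       "CHAN_SLOTS must be a power of two");

/* Hypothetical channel: one producer, one consumer, no locks. */
struct channel {
	_Atomic unsigned head;	/* next slot to fill; written only by the producer */
	_Atomic unsigned tail;	/* next slot to drain; written only by the consumer */
	void *slot[CHAN_SLOTS];
};

/* Producer side: queue one packet pointer. Returns false if the ring is full. */
static bool chan_put(struct channel *c, void *pkt)
{
	unsigned head = atomic_load_explicit(&c->head, memory_order_relaxed);
	unsigned tail = atomic_load_explicit(&c->tail, memory_order_acquire);

	if (head - tail == CHAN_SLOTS)
		return false;

	c->slot[head & (CHAN_SLOTS - 1)] = pkt;
	/* Publish the slot contents before the new head becomes visible. */
	atomic_store_explicit(&c->head, head + 1, memory_order_release);
	return true;
}

/* Consumer side: take the oldest packet pointer, or NULL if the ring is empty. */
static void *chan_get(struct channel *c)
{
	unsigned tail = atomic_load_explicit(&c->tail, memory_order_relaxed);
	unsigned head = atomic_load_explicit(&c->head, memory_order_acquire);
	void *pkt;

	if (head == tail)
		return NULL;

	pkt = c->slot[tail & (CHAN_SLOTS - 1)];
	/* Hand the slot back to the producer only after it has been read. */
	atomic_store_explicit(&c->tail, tail + 1, memory_order_release);
	return pkt;
}
```

A zero-initialised `struct channel` is an empty ring. In this sketch the producer would be whatever receives the packets and the consumer whatever processes them, each calling only its own function.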
A friend of mine recently asked me about finding the absolute command path of a process given the pid. Of course, it’s very easy to do it for a particular OS. Doing it platform independently is actually a little tough. There’s no straightforward POSIXy way of doing this (as far as I know). One can certainly do some /proc magic, but that won’t be portable. My suggestion was to just use ps. This works for both Solaris and Linux. So, you get the output from
ps -p <pid> -f
     UID   PID  PPID  C    STIME TTY      TIME CMD
 draganm 17198 17193  0 20:40:40 ?        0:00 csh -c /usr/lib/ssh/sftp-server
and simply do some string manipulation to get the required string (it is in the CMD column above). Any other ideas, folks?
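For the non-portable /proc route mentioned above, here is a minimal Linux-only sketch in C. It assumes a mounted /proc and enough permission to inspect the target process; the helper name `exe_path` is made up for illustration. Note that it returns the path of the executable image, which is not necessarily the string shown in the CMD column.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, Linux only: resolve the /proc/<pid>/exe symlink
 * into buf. Returns 0 on success and -1 on error (no such pid, no
 * permission, no /proc). A path longer than len - 1 bytes is truncated.
 */
static int exe_path(pid_t pid, char *buf, size_t len)
{
	char link[64];
	ssize_t n;

	if (len == 0)
		return -1;

	snprintf(link, sizeof link, "/proc/%ld/exe", (long)pid);
	n = readlink(link, buf, len - 1);
	if (n < 0)
		return -1;

	buf[n] = '\0';	/* readlink() does not terminate the string */
	return 0;
}
```

Calling `exe_path(getpid(), buf, sizeof buf)` fills `buf` with the absolute path of the running program. On other systems this code will simply fail, which is why ps remains the more portable suggestion.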
Don’t ever kill rpm/yum while it’s running, especially when it’s in the middle of an rpmdb transaction. rpm won’t release the lock, and it leaves behind some temporary files that are probably related to the transaction. How dumb? How difficult is it to write a signal handler that cleans up the crap? I run into this problem quite often: I start rpm or yum to do an update or something, and then it freezes forever. After losing my patience, I kill it, and then I can’t use the rpm tools anymore. The fix is simple, use brute force:
rm -rf /var/lib/rpm/__db*
rpm --rebuilddb
OK, it’s official. Grab the latest copy of LJ to read our article about network programming in the kernel. Ravi and I explain in pretty good detail how to create a network connection and read/write data from sockets in kernel mode. The sample code shows a basic FTP client that connects to a given IP address and downloads a file. The code can be downloaded from the LJ FTP site.
There is, of course, always a debate on whether this should be done in the kernel, but I think there are situations where you may want to do this. I have explained a few reasons in the kernelnewbies thread.
I think the article will be freely available online in the LJ archives after a few months. Until then, you can devour the code at the LJ download site if you are not subscribed to LJ.
TLDP, if you don’t know already, is The Linux Documentation Project. It’s one of my favourite open source projects, as I grew up with it. In the dark ages, when we had to muddle with XF86Config to get X working, TLDP had some cool HOWTOs to help newbies. I fondly remember the days when I scrolled through the HOWTOs in Lynx. I have learnt a lot by reading the HOWTOs and Guides, and it compelled me to contribute a HOWTO for NCURSES.
OK, back to the topic. Recently, there was a discussion on the TLDP discuss mailing list about the lack of author response and outdated HOWTOs. As always, somebody proposed to set up a Wiki to correct all the problems, and had the audacity to say TLDP might be dying. Naturally, this started a flame war, with others pointing out that this has been discussed at length many times and that it’s tough to find solutions. I have contributed my share of the flames and posted a summary of the Wiki war. Stein Gjoen also did a review of Wiki, and concluded that Wiki is not yet ready for TLDP.
In my opinion, TLDP is a great resource, and even though some HOWTOs are painfully outdated, there is still a lot of documentation that is of high quality and well maintained. Having a Wiki might help to bring a few HOWTOs back from the dead, as mentioned on the discussion list, but it won’t be a replacement for the current state of affairs.
Quilt is a handy script to manage a series of patches. This is especially useful if you are maintaining a large body of patches that need to be applied in a particular order. Quilt can also be used to create a series of patches that can easily be included in a software project. Linux kernel people recommend Quilt for creating patches, and they want your changes to be in a series of easily manageable patches. LFS in its current state stands at more than 4000 lines of code, and people will be annoyed if I submit LFS as a single blob patch.
However, there are no clear instructions on how to use Quilt for doing this. The Quilt documentation is well written, but it’s written from the perspective of a patch applier/manager rather than a patch generator. Here’s a simple tutorial for generating patches using Quilt.
The idea is to get a series of manageable patches out of your changes to a big code base. Let’s say you want to add something to the Linux kernel:
- Get the original code and put it in a directory say
- First decide what patches should contain what changes. You should also decide the ordering, if that matters. Say you divide your changes into two patches
fs-makefile.diff # fs/Makefile, fs/Kconfig changes
cfiles.diff # fs/lfs/*.c (new files)
- cd to the codebase directory, and create the first patch file using Quilt
quilt new fs-makefile.diff
- Add the files you are going to change
quilt add fs/Makefile fs/Kconfig
Note that this doesn’t really add the file to the patch. It only tells the Quilt system to manage it. Any changes you make to those files from now on will appear in the patches. Now, add/copy your modifications to these files.
- Do a quilt refresh to actually create the patch. The patch is created in the
- If the file you are adding is a new file, you would do the same thing. Since the file doesn’t exist in the tree yet, you have to explicitly name all the files. This is a little painful, if you have a lot of files to add. You would do something like
quilt new hfiles.diff
quilt add fs/lfs/ifile.h fs/lfs/lfs.c ...
- After adding all the new files to the Quilt system, copy all your changes (new files) to the appropriate directories. Do a
quilt refresh, and all your patches are created in the
- From now on, any changes to those files are tracked by Quilt. All you have to do is a simple
There are a ton of useful options to manipulate the order and contents of a series of patches. Read the PDF documentation for more details.
Quilt can also be used to post the patches to a mailing list. The README.mail file in the Quilt documentation directory (doc/) has the details.
Every day, you learn a new way to crash your computer. I ran into the error shown in the left image while meddling with kmalloc. Nice to see VMware catch the stack fault and show a helpful message. BTW, I have gone over 200 Oopses since I started hacking on logfs. Kernel hacking would be totally frustrating without the amazing VMware.
At times, you want to find the rpm that installs a particular binary. I was trying to find the rpm for latex, and was doing yum list "latex*" etc. Well, there’s a simple way to find out:
rpm -qif <file name>
This will tell you the rpm that installed the binary. BTW, latex comes bundled in the tetex-latex rpm. There are a bunch of other cool options to rpm. Check the man page for details.
Recently, there was a post on The Daily WTF about gotos, and some people seem to think that gotos should be avoided like the plague. Someone asked about gotos in kernel code on the kernelnewbies mailing list, and I guess it’s time to clear up the usage.
Why goto is bad
Well, this has been discussed at length by the great Edsger W. Dijkstra in his classic article Go To Statement Considered Harmful. Simply put, when you can use a different construct (like switch-case etc.), avoid using goto. It’s too easy to write spaghetti code with gotos (especially for beginners).
Some computer programming books go to the extent of banning the goto, which I think is wrong. For that matter, constructs like while etc. actually contain a goto (jump) when they get translated to machine code.
When it is OK to goto
Now, the question is: when is it OK to use goto, or even when is it right to do so? There are many views about this; my opinion is: use it when
- Code readability improves: for example, breaking out of a deeply nested loop.
- Performance is more important: for example, gotos can be used as a poor man’s exception handlers.
The Linux kernel uses gotos heavily, and a simple example suffices to show the usage.
do A
if (error)
	goto out_a;
do B
if (error)
	goto out_b;
do C
if (error)
	goto out_c;
goto out;
out_c:
	undo C
out_b:
	undo B
out_a:
	undo A
out:
	return ret;
This example, along with nice explanations about goto usage from the kernel gurus, can be found on kerneltrap. Finally, I like the following quote mentioned in the WTF post:
The apprentice uses it without thinking. The journeyman avoids it without thinking. The master uses it thoughtfully.
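The pseudocode above can be turned into compilable C along the following lines. This is only a sketch: `do_step`, `undo_step` and `setup` are made-up names, and the steps merely record themselves in a log string so that the unwinding order is visible. It also uses the common variant in which a failed step has nothing of its own to undo, so each label only unwinds the steps that had already succeeded.

```c
#include <assert.h>
#include <string.h>

/* Append one character to a NUL-terminated log string. */
static void log_char(char *log, char ch)
{
	size_t n = strlen(log);

	log[n] = ch;
	log[n + 1] = '\0';
}

/* Hypothetical step: records its name and fails when id == fail_at. */
static int do_step(char *log, char name, int id, int fail_at)
{
	log_char(log, name);
	return id == fail_at ? -1 : 0;
}

/* Hypothetical undo: records the lower-case name of the step it reverts. */
static void undo_step(char *log, char name)
{
	log_char(log, name);
}

/*
 * Run steps A, B and C in order. If one fails, undo the steps that
 * already succeeded, in reverse order, and return the error.
 * log must have room for at least 6 bytes.
 */
static int setup(int fail_at, char *log)
{
	int ret;

	log[0] = '\0';

	ret = do_step(log, 'A', 1, fail_at);
	if (ret)
		goto out;
	ret = do_step(log, 'B', 2, fail_at);
	if (ret)
		goto out_a;
	ret = do_step(log, 'C', 3, fail_at);
	if (ret)
		goto out_b;
	goto out;	/* success: keep A, B and C */

out_b:
	undo_step(log, 'b');
out_a:
	undo_step(log, 'a');
out:
	return ret;
}
```

With `fail_at` set to 3, the log ends up as "ABCba": C fails, then B and A are undone in reverse order. The error path stays in one place at the bottom of the function instead of being repeated after every step.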
If you run ldd on a program compiled on 2.6 kernels, you would see something like
[ppadala@linf2 ~]$ ldd kabru/progs/bin/prime
linux-gate.so.1 => (0x0090a000)
libm.so.6 => /lib/libm.so.6 (0x00c75000)
libc.so.6 => /lib/libc.so.6 (0x00a81000)
/lib/ld-linux.so.2 (0x00bad000)
So, what’s this linux-gate.so, and why doesn’t it have a corresponding file path? There’s no real library called linux-gate; it’s a library page given to the application by the kernel that contains the vsyscall entry points. Long ago, Linus implemented vsyscalls to improve the performance of system calls. The gory details are in the i386 architecture directory.
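You can also see this page from inside a program. The sketch below assumes Linux with a C library that provides `getauxval()` (glibc 2.16 or later); the helper names `vdso_base` and `looks_like_elf` are made up. It reads the `AT_SYSINFO_EHDR` entry of the auxiliary vector, through which the kernel passes the address of the mapped page, and checks that an ELF header really sits there.

```c
#include <stdio.h>
#include <string.h>
#include <sys/auxv.h>

/*
 * Hypothetical helper: address at which the kernel mapped its shared
 * page into this process, or 0 if the kernel did not report one.
 */
static unsigned long vdso_base(void)
{
	return getauxval(AT_SYSINFO_EHDR);
}

/* Returns 1 if the memory at base starts with the ELF magic bytes. */
static int looks_like_elf(unsigned long base)
{
	return base != 0 && memcmp((const void *)base, "\177ELF", 4) == 0;
}
```

Printing `vdso_base()` with `%#lx` should give an address in the same spirit as the one ldd shows, although it usually differs from run to run when address randomisation is on. On architectures other than i386, ldd lists the same kind of mapping as linux-vdso.so.1.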