Tuesday, August 30, 2011

Why do du and df report different disk usage?

# df -h /apps
/dev/mapper/datavg-lv01 70G 60G 10G /apps
# du -sh /apps
50G /apps

If a file is deleted (with the rm command) while a Linux program or process still has it open, its name disappears from the directory, but its disk blocks are not released until the last open file descriptor on it is closed. du, which adds up the files it can reach through the directory tree, no longer counts the file, while df, which reads the file system's own block counters, still counts the space as used.
To clear this false “disk space full” condition and reclaim the used disk space, you need to terminate (or restart) the process that still holds the deleted file open. This is not the rm command, which has already exited, and it is not a defunct (zombie) process either, because a zombie holds no open files.
Once that process exits or closes the file, the blocks are freed, and the du and df commands agree again on used and free disk space.
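One way to check whether the two commands agree is to subtract the du total from the used space that df reports. The following is a minimal sketch, not a definitive tool: the function name usage_gap_kb is made up for illustration, and it assumes the argument is the mount point of the file system. A small gap is normal (file system metadata, files du cannot read); a gap of many gigabytes suggests deleted files that are still held open.

```shell
#!/bin/sh
# Print how many KiB df counts as used beyond what du finds under a mount point.
# usage_gap_kb is a hypothetical helper name, not a standard command.
usage_gap_kb() {
    mount_point=$1
    # With -P, df prints one line per file system; column 3 is used space in KiB.
    df_used=$(df -Pk "$mount_point" | awk 'NR==2 {print $3}')
    # -s prints a single total, -x keeps du on this one file system.
    du_used=$(du -skx "$mount_point" 2>/dev/null | cut -f1)
    echo $((df_used - du_used))
}
```

For the example above, running usage_gap_kb /apps as root would be expected to print a value in the region of 10 GB expressed in KiB.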
How do you find and terminate the processes that hold deleted files open, so that du and df agree again?
For this scenario, the lsof (list open files) command shows what is going on:
lsof | grep "deleted" or
lsof | grep "/apps" (rather long and messy)
and look for the Linux process ID in the second column of the lsof output. The seventh column (SIZE/OFF) is the size of the deleted file that is still held open by that process.
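If lsof is not installed, roughly the same information can be read from /proc on Linux. The sketch below is an assumption-laden alternative, not part of lsof: the function name list_deleted_open is invented here, it relies on the kernel appending " (deleted)" to the link target of an unlinked file, and it uses GNU-style stat options. It only sees processes the current user may inspect, so run it as root for a complete picture.

```shell
#!/bin/sh
# List deleted files that are still held open, by walking /proc (Linux only).
# Output columns: PID, file descriptor number, size in bytes, original path.
# list_deleted_open is a hypothetical helper name.
list_deleted_open() {
    for fd in /proc/[0-9]*/fd/*; do
        # Skip descriptors that vanish or that we are not allowed to read.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            *' (deleted)')
                pid=${fd#/proc/}
                pid=${pid%%/*}
                # -L follows the link to the inode that is still open.
                size=$(stat -L -c %s "$fd" 2>/dev/null) || size='?'
                printf '%s\t%s\t%s\t%s\n' "$pid" "${fd##*/}" "$size" "$target"
                ;;
        esac
    done
}
```

Calling list_deleted_open | grep /apps narrows the list to one file system. The first column gives the process to stop or restart.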

How can you recreate the situation that makes the df and du commands report different used disk space?
  1. Create a file of roughly 500 MB in the /home file system:
dd if=/dev/zero of=/home/lokams bs=1024 count=500000
  2. Compute the MD5 checksum of the file with the md5sum command:
md5sum /home/lokams
  3. Open another session and remove the /home/lokams file while md5sum is still computing its checksum:
rm /home/lokams
  4. While md5sum is still running, the df and du commands report different used and free disk space, because the deleted file is still held open:
df -h; du -h --max-depth=1 /home
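The same effect can be reproduced in a single session without racing against md5sum. This is only a sketch under a few assumptions: it uses a scratch directory from mktemp and a 10 MB file instead of /home/lokams, and it lets the shell itself hold the file open on descriptor 3 (an arbitrary choice) in place of md5sum.

```shell
#!/bin/sh
# Reproduce the du/df gap with a small file in a scratch directory.
dir=$(mktemp -d)
file=$dir/bigfile
dd if=/dev/zero of="$file" bs=1024 count=10240 2>/dev/null   # about 10 MB

exec 3<"$file"                        # this shell now holds the file open
before_kb=$(du -sk "$dir" | cut -f1)  # du still sees the file
rm "$file"                            # name removed, blocks still allocated
after_kb=$(du -sk "$dir" | cut -f1)   # du no longer sees the file
link=$(readlink /proc/self/fd/3)      # kernel marks the target "(deleted)"

echo "du before rm: ${before_kb} KiB, after rm: ${after_kb} KiB"
echo "descriptor 3 points to: $link"
df -k "$dir"                          # df still counts the blocks here

exec 3<&-                             # closing the descriptor frees the space
rmdir "$dir"
```

Running df -k on the same file system again after the descriptor is closed should show the used figure drop by about 10 MB, provided nothing else is writing to that file system at the same time.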