Wednesday, September 14, 2011

Ext4 File System - Features and Setup


The fourth extended file system (ext4) was developed as the successor to the widely used ext3 journaling file system, and it has significant advantages over both ext3 and ext2.
Support for ext4 has been available since Linux kernel version 2.6.19, and it was officially declared stable in version 2.6.28. Recent releases of distributions such as Ubuntu (9.04) and Fedora (11) include the ext4 file system.
The ext4 file system adopts the “extent” approach to block management used in JFS and the delayed allocation feature of XFS.

The ext4 file system has major improvements in terms of performance, scalability, and reliability.
The following are a few notable features of ext4:

1. File System Size

Ext4 permits file systems of up to 1 exbibyte (2^60 bytes) and files of up to 16 tebibytes (16 * 2^40 bytes) when 4 KiB blocks are used. By comparison, ext3 supports a maximum file system size of 16 TiB and a maximum file size of 2 TiB. Another advantage of ext4 is its support for a very large number of subdirectories within a single directory: ext3 allows roughly 32,000, whereas ext4 allows roughly 64,000 and, with the dir_nlink feature enabled, is effectively unlimited.
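The limits quoted above follow from the width of the block numbers and the block size. The following is a quick arithmetic check in the shell, assuming the common 4 KiB block size (an assumption; other block sizes give different limits) and a shell with 64-bit arithmetic:

```shell
# Assumption: 4 KiB blocks, the usual default. Limits differ for other block sizes.
block_size=4096

# ext3 addresses blocks with 32-bit numbers; ext4 widens them to 48 bits.
ext3_fs_bytes=$(( (1 << 32) * block_size ))
ext4_fs_bytes=$(( (1 << 48) * block_size ))

# An ext4 extent addresses file-relative (logical) blocks with 32 bits.
ext4_file_bytes=$(( (1 << 32) * block_size ))

echo "ext3 max file system: $(( ext3_fs_bytes >> 40 )) TiB"
echo "ext4 max file system: $(( ext4_fs_bytes >> 60 )) EiB"
echo "ext4 max file size:   $(( ext4_file_bytes >> 40 )) TiB"
```

The script prints 16 TiB, 1 EiB, and 16 TiB respectively, matching the figures in the text.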

2. Extents

Commonly used Unix/Linux file systems such as ext2 and ext3 use a direct, indirect, double indirect, and triple indirect block mapping scheme to map file offsets to on-disk blocks. This works well for small files. For large files, however, the number of mappings becomes very large, which reduces performance, especially when seeking within or deleting a file. Extents were introduced to replace the block mapping scheme.
An extent is “a contiguous sequence of physical blocks”. Instead of recording every block individually, the file system records the start and length of each contiguous run, so a large file is described by a small number of extents and the indirect mapping of blocks is avoided. The inode stores up to 4 extents of a file directly and indexes the rest in an extent tree (extents are turned on by default in ext4). Extents therefore reduce fragmentation (because blocks are allocated sequentially) and improve performance.
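To see why extents help with large files, compare how many mapping entries a contiguous file needs under each scheme. This is a back-of-the-envelope sketch, assuming 4 KiB blocks and the ext4 limit of 32768 blocks per initialized extent. The 1 GiB file size is an arbitrary example, and the indirect blocks themselves are ignored.

```shell
block_size=4096               # assumed block size in bytes
file_bytes=$(( 1 << 30 ))     # example file: 1 GiB, laid out contiguously
max_extent_blocks=32768       # one extent covers at most 128 MiB at 4 KiB blocks

file_blocks=$(( file_bytes / block_size ))

# Block mapping (ext2/ext3): one pointer per data block.
block_pointers=$file_blocks

# Extents (ext4): one entry per contiguous run, rounded up.
extents=$(( (file_blocks + max_extent_blocks - 1) / max_extent_blocks ))

echo "block pointers needed: $block_pointers"
echo "extents needed:        $extents"
```

Under these assumptions the file needs 262144 block pointers but only 8 extents. Since 8 is more than the 4 extents that fit in the inode, such a file would also use the tree index.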

3. Delayed and Multiblock allocation

In ext4, multiblock allocation (mballoc) allocates multiple blocks for a file in a single operation, instead of allocating them one at a time as ext3 does. This reduces the overhead of calling the “block allocator” repeatedly and lets it choose a better on-disk layout.
With delayed allocation, blocks are not allocated at the moment a program writes data. The data is held in the cache, and blocks are allocated only when the cache is flushed to disk. This technique is called “allocate-on-flush”. Because the block allocator then knows how much data is actually being written, it gets an opportunity to optimize the allocation using “extents”.
Together, these techniques reduce disk fragmentation.

4. Online defragmentation and fsck speed

Fragmentation is lower in ext4 because of the techniques mentioned above. However, that does not mean there is no fragmentation at all. Defragmentation, when required, can be done online, while the file system is mounted, using the tool “e4defrag”.
To do this, you will need to patch the kernel with some experimental features, which are available at:
The syntax is as follows:
e4defrag "path to file" : For defragmenting a file 
e4defrag -r "directory/" : For defragmenting a directory
e4defrag /dev/sda1 : For defragmenting a partition
A file system check (fsck) normally takes a long time to complete, especially pass 1 of e2fsck. To speed this up, ext4 stores a record of the unused inodes at the end of each block group's inode table, so that e2fsck can skip them during the check. This record is built by fsck itself, so you must run fsck at least once before later checks become faster.

5. Journal check summing

Ext4 checksums the journal blocks so that corruption in the journal can be detected. This helps to avoid massive data corruption. Journaling can also be turned off in ext4 if it causes too much overhead.

6. Persistent preallocation

Persistent preallocation allows an application to reserve contiguous blocks of a fixed size before writing any data. This ensures the following:
  • Less fragmentation (because blocks are allocated as contiguously as possible).
  • Applications have enough space to work with.
This feature is well suited to real-time applications, databases, and content streaming.
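Preallocation can be tried from the shell with the fallocate utility from util-linux. The sketch below assumes a Linux system with GNU stat and a file system that supports preallocation (ext4 does). The scratch file and the 10 MiB size are arbitrary choices for illustration.

```shell
# Hypothetical scratch file; any path on a file system that supports fallocate works.
prealloc_file=$(mktemp)

# Reserve 10 MiB up front. No data is written, but the blocks are allocated.
fallocate -l 10M "$prealloc_file"

# %s = apparent size in bytes, %b = number of blocks actually allocated
prealloc_size=$(stat -c %s "$prealloc_file")
prealloc_blocks=$(stat -c %b "$prealloc_file")
echo "size=$prealloc_size bytes, allocated blocks=$prealloc_blocks"

rm -f "$prealloc_file"
```

A non-zero block count for a file that has never been written to shows that the space was reserved in advance.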

7. Inodes / Timestamps

The ext4 file system uses a larger default inode size of 256 bytes, whereas ext3 uses 128 bytes. Time stamps (e.g., the modification time of a file) are stored with nanosecond precision, instead of the one-second precision of ext3.
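The finer time stamps can be observed with GNU stat. This is a small sketch that assumes GNU coreutils. The scratch file is hypothetical, and the fractional digits are only meaningful on a file system that stores sub-second time stamps.

```shell
ts_file=$(mktemp)

# %y prints the last-modification time; GNU stat includes the fractional seconds.
mtime=$(stat -c %y "$ts_file")
echo "mtime: $mtime"

rm -f "$ts_file"

# The inode size of an existing file system can be read (as root) with:
#   tune2fs -l /dev/sda1 | grep 'Inode size'
```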

8. Backward compatibility

Ext3 file systems can be migrated to ext4 easily without formatting or reinstalling the OS, provided the kernel supports the ext4 file system.

Compiling the kernel with ext4 support

You can download the latest kernel from
http://www.kernel.org/pub/linux/kernel/v2.6/ and then enable ext4 support during “make menuconfig”.
Check out the steps in brief:
$ cp /boot/config- .config 
$ make oldconfig
$ make menuconfig [Select ext4 under File systems]
The parameters in the config file are:
[root@bob linux-2.6.30]# cat .config | grep EXT4_FS 
CONFIG_EXT4_FS=m
CONFIG_EXT4_FS_XATTR=y
CONFIG_EXT4_FS_POSIX_ACL=y
CONFIG_EXT4_FS_SECURITY=y
You can then compile the kernel with ‘make’ and install the modules using ‘make modules_install’. Then edit the boot loader entries and reboot the machine to load the new kernel.
Refer to http://bobcares.com/blog/?p=162 for more information about kernel recompilation.
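After rebooting into the new kernel, it is worth confirming that ext4 is actually available. One way is to look for it in /proc/filesystems. The helper name fs_supported below is made up for this sketch, and its optional second argument exists only to make it easy to test.

```shell
# Succeeds if the named file system type appears in the given list.
# The list defaults to /proc/filesystems.
fs_supported() {
    fs_name=$1
    fs_list=${2:-/proc/filesystems}
    # Each line is "[nodev]<TAB>name"; the name is always the last field.
    awk -v fs="$fs_name" '$NF == fs { found = 1 } END { exit !found }' "$fs_list"
}

if fs_supported ext4; then
    echo "ext4 is available in the running kernel"
else
    echo "ext4 not listed; if it was built as a module, try: modprobe ext4"
fi
```

With CONFIG_EXT4_FS=m, as in the configuration shown earlier, ext4 only appears in the list once the module has been loaded.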
The next step is to compile the latest stable version of e2fsprogs. This is available at: 
http://ftp.kernel.org/pub/linux/kernel/people/tytso/e2fsprogs/.
$ ./configure 
$ make && make install

Creating a new ext4 file system

This can be done using the command mkfs.ext4 [similar to mkfs.ext3]:
$ mkfs.ext4 /dev/sda1
OR
$ mke2fs -t ext4 /dev/sda1, where sda1 is the block device.
In order to mount the file system, use ‘-t ext4’ in the mount command:
$ mount -t ext4 /dev/sda1 /home1
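Once the file system is mounted, you can confirm which type the kernel is really using for a given path. The sketch below relies on the --output option of GNU df (coreutils 8.21 or later). The helper name fs_type_of is made up for illustration.

```shell
# Print the file system type that backs the given path.
fs_type_of() {
    df --output=fstype "$1" | tail -n 1
}

# /home1 is the mount point from the example above; "/" is used here
# only so that the snippet runs on any machine.
echo "/ is mounted as: $(fs_type_of /)"
```

Calling fs_type_of /home1 after the mount command above should report ext4.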
To convert an ext3 file system to ext4, do the following. Make sure that you have a backup of the data before executing these commands.
Since ext4 is backward compatible, you can simply mount the ext3 file system as ext4:
$ mount -t ext4 /dev/sda2 /home2, where /dev/sda2 contains an ext3 file system.
Mounting alone does not change the on-disk format. Because ‘/dev/sda2’ was not formatted as ext4, you also have to enable the ext4 features on it. Unmount the file system first, then run:
$ tune2fs -O extents,uninit_bg,dir_index /dev/sda2
where extents enables the “extents” feature, uninit_bg enables group checksums, and dir_index enables “Htree indexes”.
Next, fsck must be run on the unmounted partition to fix up the on-disk structures altered by tune2fs:
$ e2fsck -fpDC0 /dev/sda2, where f forces the check, p repairs automatically, D optimizes directories, and C0 prints a completion bar as it goes.
Ext2 file systems must first be converted to ext3 using ‘tune2fs -j /dev/sda3’, which adds the journal, before being converted to ext4.
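Before touching a real partition, the conversion steps can be rehearsed on a scratch image file, which needs no root access. This is only a sketch: the image file is hypothetical, the tools must be in your PATH, and the C0 progress bar is left out because the output is discarded.

```shell
# Rehearse the ext3 -> ext4 conversion on a throwaway image file.
# No real partition is touched.
features=""
if command -v mke2fs >/dev/null && command -v tune2fs >/dev/null &&
   command -v e2fsck >/dev/null && command -v dumpe2fs >/dev/null; then
    img=$(mktemp)                    # hypothetical scratch image
    truncate -s 64M "$img"           # 64 MiB is plenty for a test
    mke2fs -q -F -t ext3 "$img"      # start from an ext3 file system

    tune2fs -O extents,uninit_bg,dir_index "$img" >/dev/null

    # e2fsck exits with 1 when it corrected something, which may happen here.
    rc=0
    e2fsck -fpD "$img" >/dev/null || rc=$?
    echo "e2fsck exit status: $rc"

    # The feature list should now include the extent feature.
    features=$(dumpe2fs -h "$img" 2>/dev/null | grep -i '^Filesystem features')
    echo "$features"
    rm -f "$img"
else
    echo "e2fsprogs not found in PATH (try running as root or adding /sbin)"
fi
```

If the feature list looks right on the image, the same tune2fs and e2fsck commands can be applied to the real, unmounted partition.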

Limitations

Once the ext4 features have been enabled on an existing file system, it can no longer be mounted as ext3 or ext2. Also, since the disk is not reformatted, existing files keep their old block mapping; only files written after the conversion use extents.
Another important point is the ‘/boot/’ partition. If the machine does not have a separate partition for ‘/boot/’, converting the file system to ext4 will cause issues, since older versions of grub do not support ext4. A fresh Fedora 11 installation will not allow ‘/boot/’ to be formatted with ext4; you will have to use ext3 instead. A newer version of grub [grub2 packages] is available with ext4 support, but it is not officially supported yet.

Conclusion

With ext4 now available in the latest kernels, new Linux distributions are switching their default file system to the faster ext4. Benchmark tests of different file systems conducted using the Phoronix Test Suite are available at: http://ftp.kernel.org/pub/linux/kernel/people/tytso/e2fsprogs/.
Ext4 is likely to become the standard file system for Linux very soon.
