Friday, February 17, 2012

About initrd

The Linux kernel, as in other Unix systems, is loaded into memory during the operating system boot process, and remains resident throughout the operation of the system, as do application programs (software which executes in "userspace", under the control of the kernel).

In order to minimise the amount of code which is loaded into memory, and for maximum modularity, the Linux kernel omits much code that is necessary to load the operating system. Some of this code is located in modules loaded into kernelspace, known as kernel modules; the rest is userspace applications.

In order to make a boot system which can apply to a range of hardware, or which can be loaded from a range of different media, including virtual, transient media such as that provided by a network connection, it is frequently necessary for the kernel to access userspace code and perhaps kernel modules, in order to gain the data and routines necessary to access the main data store. This is not necessarily the case with every particular configuration, but is frequently the case.

The initrd boot system has long been the primary solution to this problem [1], although the more recently developed initramfs, introduced in kernel 2.6, makes significant improvements. [2]

In the initrd system, files to be accessed by the kernel at boot time are stored on a ramdisk. Its contents are held in a filesystem which has been made either on a loop-mounted file or, more historically, on a small mountable device such as a floppy disk, and which is usually between 1.4 MB and 4 MB in size. The loop-mounted filesystem image is compressed with gzip. The location of this ramdisk image is provided to the kernel at load time by the boot loader (usually either LILO or GRUB).

The initrd system employs several "kludges" and has some drawbacks from the point of view of an administrator. Creating or editing a ramdisk filesystem image requires root privileges, in order to loop-mount the filesystem image to make changes, and/or to initialise a filesystem structure (format) on the virtual drive. Furthermore, the filesystem used on the ramdisk image may be one that would not otherwise be used in the kernel, and the code to access that filesystem must be compiled into the kernel, meaning that it cannot later be unloaded in the way that code loaded from kernel modules can.

Ramdisks are a fixed size, so they typically take up more space than is needed, and yet they limit the amount of space which is available as working space once the system is booted. In fact, under the initrd approach some of the memory used by the initrd cannot be released at all without rebooting.

However, despite these issues, since the system has had almost universal application for some years, it is still in wide use.

Initramfs in comparison with initrd

In comparison, initramfs is a more convenient and simpler system for administrators to manage than the previous ramdisk-based system, since the external code housed in the initrd can easily be edited with non-privileged operations, and since less indirection is involved: there is no need to make a virtual drive, format the virtual drive, and provide the kernel with filesystem capabilities beyond the minimal requirements to read a compressed cpio archive.[1]

Since the original ramdisk-based system was popularised, a newer, more flexible RAM-based filesystem known as tmpfs or shmfs has become a standard component of the kernel. This system is in many ways much more flexible and efficient than the original fixed-size ramdisk: it does not require formatting, and uses as little or as much memory as is required to hold the data.

Another similar system, ramfs, has also been used, giving the initramfs system its name, and currently users may choose which particular dynamic RAM filesystem to use.[2]

Given the code for tmpfs, which is used in the kernel of nearly all Linux configurations, the only addition required to make a suitable virtual drive to boot from was the ability to unpack a compressed archive of files (the kernel developers chose to use the cpio archive format). The unpacked files are stored in a tmpfs-like filesystem. This system is known as initramfs.[3]
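To illustrate why no loop mount or formatting step is involved, the following is a minimal sketch of writing a gzip-compressed cpio archive in the "newc" layout (magic 070701) using only unprivileged operations. The function names cpio_entry and build_initramfs are hypothetical helpers introduced for this example, and a usable image would also need directories, device nodes and real binaries, which are left out here.

```python
import gzip


def _pad4(length: int) -> bytes:
    """NUL bytes needed to round `length` up to a multiple of four."""
    return b"\0" * (-length % 4)


def cpio_entry(name: str, data: bytes = b"", mode: int = 0o100644, ino: int = 0) -> bytes:
    """Encode one regular file as a "newc" cpio record (hypothetical helper)."""
    name_bytes = name.encode() + b"\0"       # namesize counts the trailing NUL
    fields = [
        ino, mode, 0, 0,                     # inode, mode, uid, gid
        1, 0, len(data),                     # nlink, mtime, filesize
        0, 0, 0, 0,                          # devmajor, devminor, rdevmajor, rdevminor
        len(name_bytes), 0,                  # namesize, check (unused for 070701)
    ]
    header = b"070701" + "".join("%08X" % f for f in fields).encode()
    record = header + name_bytes
    record += _pad4(len(record))             # header + name padded to 4 bytes
    return record + data + _pad4(len(data))  # file data padded to 4 bytes


def build_initramfs(files: dict) -> bytes:
    """Pack {name: contents} into a gzip-compressed cpio archive."""
    records = []
    for ino, (name, data) in enumerate(files.items(), start=1):
        # Assumption for this sketch: only "init" needs to be executable.
        mode = 0o100755 if name == "init" else 0o100644
        records.append(cpio_entry(name, data, mode=mode, ino=ino))
    records.append(cpio_entry("TRAILER!!!", mode=0))  # end-of-archive marker
    return gzip.compress(b"".join(records))
```

Calling build_initramfs({"init": b"#!/bin/sh\n"}) returns bytes that an ordinary user can write straight to a file; nothing in the process needs root privileges or a block device.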

Uses

As well as loading the code needed before booting from a fixed disk, a ramdisk (either initrd or initramfs) may be useful as a rescue disk, for use in applying security updates, backing up files, conducting forensics or debugging hardware problems. It may also be used to obviate a hard disk, perhaps in order to provide network-stored OS images, or to boot from a slow medium such as a CD-ROM.

Many Linux distributions ship a single, generic kernel image that is intended to boot as wide a variety of hardware as possible. The drivers included with this generic kernel image must be modular, as it is not possible to statically compile everything into the one kernel without making it too large to boot from computers with limited memory or from lower-capacity media like floppy disks.

This then raises the problem of detecting and loading the modules necessary to mount the root file system at boot time (or, for that matter, deducing where or what the root file system is).

To further complicate matters, the root file system may be on a software RAID volume, LVM, a network file system of some sort (NFS is common on diskless computers) or on an encrypted partition. All of these require special preparations to mount.

End-user implementation

The kernel image and initrd image must both be stored somewhere accessible by the boot firmware of the computer or the Linux bootloader. On PCs, this is usually one of:

    * The root file system itself
    * A small ext2 or FAT-formatted partition on a local disk (a boot partition)
    * A TFTP server (on systems that can boot from Ethernet)

The bootloader will load the kernel and initrd image into memory and then start the kernel, passing in the memory address of the initrd. At the end of its boot sequence, the kernel tries to determine the format of the image from its first few blocks of data:

    * If the image is an (optionally gzip-compressed) file system image, then it will be made available as a special block device (/dev/ram), which is then mounted as the initial root file system. The driver for that file system must be compiled statically into the kernel. Many distributions originally used compressed ext2 file systems as initrd images. Others (including Debian 3.1) used cramfs in order to boot on memory-limited systems, since the cramfs image can be mounted in-place without requiring extra space for decompression.

    Once the initial root file system is up, the kernel executes "/linuxrc" as its first process. When it exits, the kernel assumes that "/linuxrc" has mounted the real root file system and executes "/sbin/init" to begin the normal user-space boot process.

    * If the image is a gzip-compressed cpio archive, it will be unpacked by the kernel in an intermediate step into a tmpfs, which then becomes the initial root file system. This scheme has been dubbed initramfs and is available on Linux 2.6.13 onwards. It has the advantage of not requiring an intermediate file system to be compiled into the kernel.

    On an initramfs, the kernel executes "/init" as its first process. "/init" is not expected to exit.
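The format check described above can be sketched as a lookup of well-known magic numbers. This is an illustration rather than the kernel's actual code: classify_image and first_process are hypothetical names, and the sketch only recognises gzip (bytes 1f 8b), newc cpio ("070701" or "070702"), cramfs (0x28CD3D45 at offset 0) and ext2 (0xEF53 at offset 1080).

```python
import struct
import zlib

GZIP_MAGIC = b"\x1f\x8b"
CPIO_NEWC_MAGICS = (b"070701", b"070702")
CRAMFS_MAGIC = 0x28CD3D45   # little-endian 32-bit value at offset 0
EXT2_MAGIC = 0xEF53         # little-endian 16-bit value at offset 1080


def classify_image(data: bytes) -> str:
    """Guess the image type from its leading bytes (hypothetical helper)."""
    if data[:2] == GZIP_MAGIC:
        # Decompress only a small prefix (wbits=31 selects the gzip wrapper),
        # then classify what is inside.
        inner = zlib.decompressobj(31).decompress(data, 2048)
        return classify_image(inner)
    if data[:6] in CPIO_NEWC_MAGICS:
        return "cpio"
    if len(data) >= 4 and struct.unpack_from("<I", data, 0)[0] == CRAMFS_MAGIC:
        return "cramfs"
    if len(data) >= 1082 and struct.unpack_from("<H", data, 1080)[0] == EXT2_MAGIC:
        return "ext2"
    return "unknown"


def first_process(kind: str) -> str:
    """Map the image type to the first program the kernel runs from it."""
    if kind == "cpio":
        return "/init"       # initramfs: not expected to exit
    if kind in ("ext2", "cramfs"):
        return "/linuxrc"    # initrd: expected to mount the real root and exit
    raise ValueError("not a recognised initial root image: " + kind)
```

Only the first couple of kilobytes are needed, so a caller could pass the start of the file rather than the whole image.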

Some Linux distributions will generate a customized initrd image which contains only whatever is necessary to boot that particular computer, such as ATA, SCSI and filesystem kernel modules. These typically embed the location and type of the root file system.

Other distributions (such as Fedora and Ubuntu) generate a more generic initrd image. These start only with the device name of the root file system (or its UUID) and must discover everything else at boot time. In this case, a complex cascade of tasks must be performed to get the root file system mounted:

    * Any hardware drivers that the boot process depends on must be loaded. A common arrangement is to pack kernel modules for common storage devices onto the initrd and then invoke a hotplug agent to pull in modules matching the computer's detected hardware.
    * If the root file system is on NFS:
          o Bring up the primary network interface.
          o Invoke a DHCP client to obtain a DHCP lease.
          o Extract the address of the NFS server from the lease.
          o Mount the NFS share.
    * If the root file system appears to be on a software RAID device, there is no way of knowing which devices the RAID volume spans; the standard MD utilities must be invoked to scan all available block devices and bring the required one online.
    * If the root file system appears to be on a logical volume, the LVM utilities must be invoked to scan for and activate the volume group containing it.
    * If the root file system is on an encrypted block device:
          o Invoke a helper script to prompt the user to type in a passphrase and/or insert a hardware token (such as a smart card or a USB security dongle).
          o Create a decryption target with the device mapper.
    * Perform any maintenance tasks which cannot otherwise be safely done on a mounted root file system.
    * Mount the root file system read-only.
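The cascade above is essentially a sequence of conditional steps. The sketch below models it as a pure function that returns the steps in order; it performs no real work, and RootSpec and plan_root_mount are hypothetical names, not the interface of any distribution's tooling.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RootSpec:
    """What the initial root file system has learned about the real root."""
    on_nfs: bool = False
    on_raid: bool = False
    on_lvm: bool = False
    encrypted: bool = False


def plan_root_mount(spec: RootSpec) -> List[str]:
    """Return the ordered steps needed before the real root can be used."""
    steps = ["load hardware drivers"]
    if spec.on_nfs:
        steps += [
            "bring up primary network interface",
            "obtain DHCP lease",
            "extract NFS server address from lease",
            "mount NFS share",
        ]
    if spec.on_raid:
        steps.append("scan block devices with MD utilities and start array")
    if spec.on_lvm:
        steps.append("scan for and activate volume group with LVM utilities")
    if spec.encrypted:
        steps += [
            "prompt for passphrase or hardware token",
            "create decryption target with device mapper",
        ]
    steps.append("perform maintenance tasks")
    steps.append("mount root file system read-only")
    return steps
```

For example, plan_root_mount(RootSpec(on_lvm=True, encrypted=True)) lists the LVM and device-mapper steps between driver loading and the final read-only mount, in the same order as the list above.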

The final root file system cannot simply be mounted over "/", since that would make the scripts and tools on the initial root file system inaccessible for any final cleanup tasks. Instead, it is mounted at a temporary mount point and rotated into place with pivot_root(8) (which was introduced specifically for this purpose). This leaves the initial root file system at a mount point (such as "/initrd") where normal boot scripts can later unmount it to free up memory held by the initrd.

Most initial root file systems implement "/linuxrc" or "/init" as a shell script and thus include a minimal shell (usually /bin/ash) along with some essential user-space utilities (usually the BusyBox toolkit). To further save space, the shell, utilities and their supporting libraries are typically compiled with space optimizations enabled (such as with gcc's "-Os" flag) and linked against a stripped-down version of the C library such as dietlibc or klibc.

Some distributions (notably, SUSE Linux and Ubuntu) further use the initrd to paint a bootsplash animation onto the display early on in the boot process.