Saturday, June 11, 2011

The story of storage: hard disk


A hard disk is a kind of storage that uses a stack of rotating disks, or "platters", to record data. It is a block device, which means it reads and writes data in fixed-size blocks. Generally, the block size is 512 bytes. So from a software engineer's point of view, a hard disk is just a contiguous sequence of data blocks, any of which can be accessed directly through some kind of addressing mechanism.


1 MBR

A master boot record (MBR) is the first sector of a hard disk. It serves mainly two functions:
  • Holds a disk's primary partition table.
  • Holds the bootstrapping code. After the BIOS initializes the PC, it loads this sector into memory and passes execution to it.
The structure of the MBR is as follows:

Offset  Description                                            Size (bytes)
0x0000  Code area                                              440
0x01B8  Disk signature                                         4
0x01BC  Usually NULL (0x0000)                                  2
0x01BE  Primary partition table (four entries, each 16 bytes)  64
0x01FE  MBR signature (0x55, 0xAA)                             2
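
The layout in the table can be sketched in Python as a small parser. This is only an illustration, not a full MBR reader: the function name `parse_mbr` and the keys of the returned dictionary are made up for this example, and the sector is assumed to be already read into memory as 512 bytes.

```python
import struct

def parse_mbr(sector: bytes) -> dict:
    """Split a 512-byte MBR sector into the fields listed in the table."""
    if len(sector) != 512:
        raise ValueError("an MBR sector is exactly 512 bytes")
    if sector[0x1FE:0x200] != b"\x55\xAA":
        raise ValueError("missing MBR signature 0x55 0xAA")
    # The disk signature is a 4-byte little-endian integer at offset 0x01B8.
    (disk_signature,) = struct.unpack_from("<I", sector, 0x1B8)
    # Four raw 16-byte partition table entries starting at offset 0x01BE.
    table = sector[0x1BE:0x1FE]
    entries = [table[i:i + 16] for i in range(0, 64, 16)]
    return {
        "code": sector[0:440],
        "disk_signature": disk_signature,
        "entries": entries,
    }
```

On a real system the sector would come from the first 512 bytes of the disk device or of a disk image file.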

The disk signature is used by the OS, and in turn by userland processes, to uniquely identify the boot disk. Since the introduction of EDD, however, the disk signature can be omitted and the code area extended to a length of 446 bytes.
By convention, there are exactly four primary partition table entries in the MBR Partition Table scheme. Both the partition length and partition start address are stored as 32-bit quantities. Because the block size is 512 bytes, this implies that neither the maximum size of a partition nor the maximum start address (both in bytes) can exceed 2^32 * 512 bytes, or 2 TiB.
See Partition Table for more info.
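
The 2 TiB figure above can be checked with a few lines of Python. The function name `max_mbr_bytes` is made up for this sketch, and the 512-byte block size is the assumption stated in the text.

```python
def max_mbr_bytes(sector_size: int = 512) -> int:
    """Largest byte count addressable with a 32-bit sector number."""
    return 2**32 * sector_size

limit = max_mbr_bytes()
print(limit)            # 2199023255552
print(limit // 2**40)   # 2 (TiB)
```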


2 CHS

Cylinder-head-sector, also known as CHS, was an early method for giving addresses to each physical block of data on a hard disk drive. Though CHS values no longer have a direct physical relationship to the data stored on disks, pseudo CHS values (which can be translated by disk electronics or software) are still being used by many utility programs.
  • Head: Data is written to or read from a platter of the hard disk by a device called a head. Usually, two heads are used to access the data on both surfaces of a platter.
  • Track, Cylinder: The surface of a platter is composed of concentric circles called tracks. All information stored on a hard disk is recorded in tracks. Tracks are numbered from 0, starting at the outside of the platter and increasing toward the center. All tracks that have the same number, across every platter surface, form a cylinder.
  • Sector: A track is divided into sectors, which are the basic units managed by the hard disk.
So each sector can be addressed by a three-dimensional coordinate system (CHS). The number of sectors a hard disk holds is:
cylinders * heads * sectors
In earlier hard drive designs, the number of sectors per track was fixed, and because the outer tracks on a platter have a larger circumference than the inner tracks, space on the outer tracks was wasted. The number of sectors that would fit on the innermost track constrained the number of sectors per track for the entire platter.
However, many of today's advanced drives use a formatting technique called Multiple Zone Recording to pack more data onto the surface of the disk. Multiple Zone Recording allows the number of sectors per track to be adjusted so that more sectors are stored on the larger, outer tracks. By dividing the outer tracks into more sectors, data can be packed uniformly throughout the surface of a platter, the disk surface is used more efficiently, and higher capacities can be achieved with fewer platters. Not only is effective storage capacity increased by as much as 25 percent with Multiple Zone Recording, but the disk-to-buffer transfer rate is also boosted: with more bytes per track, data in the outer zones is read at a faster rate.
However, as mentioned before, CHS values no longer have a direct physical relationship to the data stored on disks, so pseudo CHS addressing still uses a uniform scheme. A CHS address is 24 bits long in total. The limits of each field are listed below. See Partition Table.

Name      Bits  Starts from  End limit  Total number
Cylinder  10    0            1023       1024
Head      8     0            254        255
Sector    6     1            63         63

So under the CHS addressing scheme, a hard disk can be no larger than:
(1024 * 255 * 63) * (512) = 8,422,686,720 bytes (about 8.4 GB)
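
The same arithmetic, written as a small Python helper. The name `chs_capacity` is made up for this sketch; the default arguments are the limits from the table above.

```python
def chs_capacity(cylinders: int = 1024, heads: int = 255,
                 sectors: int = 63, sector_size: int = 512) -> int:
    """Bytes addressable by a CHS geometry: cylinders * heads * sectors blocks."""
    return cylinders * heads * sectors * sector_size

print(chs_capacity())          # 8422686720
print(chs_capacity() / 10**9)  # about 8.4 (GB, decimal)
```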


3 LBA

Logical block addressing (LBA) is a common scheme used for specifying the location of blocks of data stored on computer storage devices, generally secondary storage systems such as hard disks. The term LBA can mean either the address or the block to which it refers. Logical blocks in modern computer systems are typically 512 or 1024 bytes each. ISO 9660 CDs (and images of them) use 2048-byte blocks. LBA is a particularly simple addressing scheme; blocks are located by an index, with the first block being LBA=0, the second LBA=1, and so on.
CHS tuples can be converted to LBA addresses using the following formula:
LBA(C,H,S) = ((C * heads_num) + H) * sectors_per_track + S - 1
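
The formula translates directly into Python. The sketch below also adds the inverse conversion, which the formula implies but the text does not spell out; both function names are made up for illustration.

```python
def chs_to_lba(c: int, h: int, s: int,
               heads_num: int, sectors_per_track: int) -> int:
    """Apply the formula above; sectors are numbered from 1, hence the -1."""
    return (c * heads_num + h) * sectors_per_track + s - 1

def lba_to_chs(lba: int, heads_num: int, sectors_per_track: int):
    """Inverse of chs_to_lba for the same geometry."""
    c, rest = divmod(lba, heads_num * sectors_per_track)
    h, s0 = divmod(rest, sectors_per_track)
    return c, h, s0 + 1

print(chs_to_lba(0, 0, 1, 255, 63))  # 0
print(chs_to_lba(0, 1, 1, 255, 63))  # 63
print(lba_to_chs(63, 255, 63))       # (0, 1, 1)
```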


4 Partition Table

As described before, the partition table in the MBR can hold at most four records, and no partition can exceed 2 TiB. To alleviate these limitations, a new partitioning scheme called GUID Partition Table (GPT) was introduced. See more at UEFI.
The layout of one 16-byte partition record is as follows:

Offset  Length  Description
0x00    1       status (0x80 = bootable, 0x00 = non-bootable, other = invalid)
0x01    3       CHS address of first sector in partition
0x04    1       partition type
0x05    3       CHS address of last sector in partition
0x08    4       LBA of first sector in the partition
0x0C    4       number of sectors in partition, in little-endian format

Most of the time, LBA is used to locate a partition. But the specification says that if a partition's start block, end block, or both fall under the 8.4 GB limit, the CHS address should also be recorded correctly. Otherwise, the CHS fields hold some kind of default value.
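
A 16-byte record can be decoded roughly as follows. This is a sketch, not a complete implementation: the names `PartitionEntry`, `unpack_chs` and `parse_partition_entry` are made up, and the way the three CHS bytes are packed (head in the first byte, the sector in the low 6 bits of the second byte, and the 10-bit cylinder split between the top 2 bits of the second byte and the third byte) is the conventional MBR encoding, which this post does not describe.

```python
import struct
from typing import NamedTuple, Tuple

class PartitionEntry(NamedTuple):
    bootable: bool
    chs_first: Tuple[int, int, int]  # (cylinder, head, sector)
    ptype: int
    chs_last: Tuple[int, int, int]
    lba_first: int
    num_sectors: int

def unpack_chs(raw: bytes) -> Tuple[int, int, int]:
    """Decode a packed 3-byte CHS address (assumed conventional layout)."""
    head = raw[0]
    sector = raw[1] & 0x3F                       # low 6 bits
    cylinder = ((raw[1] & 0xC0) << 2) | raw[2]   # 2 high bits + 8 low bits
    return cylinder, head, sector

def parse_partition_entry(entry: bytes) -> PartitionEntry:
    """Decode one 16-byte partition record; multi-byte fields are little-endian."""
    status, chs_first, ptype, chs_last, lba_first, num_sectors = struct.unpack(
        "<B3sB3sII", entry)
    if status not in (0x00, 0x80):
        raise ValueError(f"invalid status byte 0x{status:02X}")
    return PartitionEntry(status == 0x80, unpack_chs(chs_first), ptype,
                          unpack_chs(chs_last), lba_first, num_sectors)
```

For example, a record whose first eight bytes are 80 01 01 00 83 FE FF FF decodes to a bootable partition of type 0x83 that starts at CHS (0, 1, 1) and ends at CHS (1023, 254, 63).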
The partition type labels the file system used on the partition. For example, the code for Linux ext2 is 0x83 and the code for Linux swap is 0x82. You can see a list of partition types by running sfdisk -T. A hard disk can have at most four primary partitions because there are only four entries in the primary partition table. The following figure gives an example of a hard disk holding two primary partitions.


If you run ls /dev/sda* or ls /dev/hda*, you may see results like the following:
/dev/sda /dev/sda1 /dev/sda2
or
/dev/hda /dev/hda1 /dev/hda2
Please note:
  1. The addressing mode used in the figure is LBA. In CHS terms, it would be Sector 1 - Sector 63.
  2. The first partition normally starts at sector 63 (LBA), that is, just after the first track. The first 63 sectors (the first track) can be used for other purposes, such as holding bootloader code.
  3. Partitions can start and end anywhere as long as they do not overlap, and they need not cover all the space on a hard disk.
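
The no-overlap rule in note 3 is easy to check in code. In this sketch each partition is given as a (first LBA, number of sectors) pair, and the function name `find_overlaps` is made up for illustration.

```python
from itertools import combinations

def find_overlaps(partitions):
    """Return every pair of (first_lba, num_sectors) ranges that overlap."""
    return [
        (a, b)
        for a, b in combinations(partitions, 2)
        # Two half-open ranges overlap when each starts before the other ends.
        if a[0] < b[0] + b[1] and b[0] < a[0] + a[1]
    ]

print(find_overlaps([(63, 1000), (1063, 500)]))  # []
print(find_overlaps([(63, 1000), (1000, 500)]))  # [((63, 1000), (1000, 500))]
```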
To get more partitions, we can subdivide a primary partition into several logical partitions. The primary partition used to house the logical partitions is called an extended partition, and it has its own partition type (0x05, extended). See more at Extended partition.

The story of storage: extended partition



Extended Partition

As we mentioned in Hard Disk, a hard disk can have at most four primary partitions. If we want more, we can turn one primary partition into an extended one by subdividing it into logical partitions and setting its partition type to 0x05 (extended type).
Just as a Master Boot Record (MBR) describes a hard disk, an Extended Boot Record (EBR) describes a logical partition. However, there is one EBR for each logical partition, and all the logical partitions in an extended partition are linked one after another using the two partition table records in each EBR.
EBRs have essentially the same structure as the MBR, except that only the first two entries of the partition table are supposed to be used.
The structure of an EBR is as follows:

Offset  Description                     Size (bytes)
0x0000  Generally unused                446
0x01BE  Partition table's first entry   16
0x01CE  Partition table's second entry  16
0x01DE  Unused                          32
0x01FE  MBR signature (0x55, 0xAA)      2

The layout of one 16-byte partition record is as follows:

Offset  Length  Description
0x00    1       status (0x80 = bootable, 0x00 = non-bootable, other = invalid)
0x01    3       CHS address of first sector in partition
0x04    1       partition type
0x05    3       CHS address of last sector in partition
0x08    4       Starting Sector
0x0C    4       Number of sectors in partition, in little-endian format

The first entry of an EBR partition table points to the logical partition belonging to that EBR:
  • Starting Sector = relative offset between this EBR sector and the first sector of the logical partition. This will be the same value for each EBR on the same hard disk; usually 63.
  • Number of Sectors = total count of sectors for this logical partition.
The second entry of an EBR partition table will contain zero bytes if it is the last EBR in the extended partition; otherwise, it points to the next EBR in the EBR chain:
  • Starting Sector = relative address of the next EBR within the extended partition. In other words: Starting Sector = LBA address of the next EBR minus LBA address of the extended partition's first EBR.
  • Number of Sectors = total count of sectors for the next logical partition, but the count starts from the next EBR sector.
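
Following the EBR chain described above might look like this in Python. It is a minimal sketch under several assumptions: the whole disk image is held in memory as `image`, sectors are 512 bytes, `ext_start` is the LBA of the extended partition (taken from the MBR), and a partition type of 0 in the second entry is treated as the all-zero "last EBR" marker. The names `read_entry` and `walk_ebr_chain` are made up for illustration.

```python
import struct

SECTOR_SIZE = 512

def read_entry(sector: bytes, index: int):
    """Return (type, starting_sector, num_sectors) of table entry `index`."""
    off = 0x1BE + 16 * index
    ptype = sector[off + 4]
    start, count = struct.unpack_from("<II", sector, off + 8)
    return ptype, start, count

def walk_ebr_chain(image: bytes, ext_start: int):
    """Yield (first_lba, num_sectors) for each logical partition."""
    ebr_lba = ext_start
    seen = set()                      # guards against a corrupt, looping chain
    while ebr_lba not in seen:
        seen.add(ebr_lba)
        sector = image[ebr_lba * SECTOR_SIZE:(ebr_lba + 1) * SECTOR_SIZE]
        if len(sector) != SECTOR_SIZE or sector[0x1FE:0x200] != b"\x55\xAA":
            raise ValueError(f"no valid EBR at LBA {ebr_lba}")
        # First entry: offset is relative to this EBR's own sector.
        ptype, start, count = read_entry(sector, 0)
        if ptype != 0:
            yield ebr_lba + start, count
        # Second entry: offset is relative to the extended partition's start.
        next_type, next_start, _ = read_entry(sector, 1)
        if next_type == 0:
            break                     # zeroed second entry: last EBR
        ebr_lba = ext_start + next_start
```

For a real disk, reading one sector at a time from the device would be preferable to loading the whole image; the in-memory form just keeps the sketch short.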
The following figure gives an example of a hard disk holding an extended partition and a primary partition. There are two logical partitions in the extended partition.