Hierarchical File System


Hierarchical File System is a proprietary file system developed by Apple Inc. for use in computer systems running Mac OS. Originally designed for use on floppy and hard disks, it can also be found on read-only media such as CD-ROMs. HFS is also referred to as Mac OS Standard, while its successor, HFS Plus, is also called Mac OS Extended.
With the introduction of Mac OS X 10.6, Apple dropped support for formatting or writing HFS disks and images, which remain supported as read-only volumes. Starting with macOS 10.15, HFS disks can no longer be read.

History

Apple introduced HFS in September 1985, specifically to support Apple's first hard disk drive for the Macintosh, replacing the Macintosh File System, the original file system which had been introduced over a year and a half earlier with the first Macintosh computer. HFS drew heavily upon Apple's first hierarchical operating system for the failed Apple III, which also served as the basis for hierarchical file systems on the Apple IIe and Apple Lisa. HFS was developed by Patrick Dirks and Bill Bruffey. It shared a number of design features with MFS that were not available in other file systems of the time. Files could have multiple forks, which allowed the main data of the file to be stored separately from resources such as icons that might need to be localized. Files were referenced with unique file IDs rather than file names, and file names could be 255 characters long.
However, MFS had been optimized to be used on very small and slow media, namely floppy disks, so HFS was introduced to overcome some of the performance problems that arrived with the introduction of larger media, notably hard drives. The main concern was the time needed to display the contents of a folder. Under MFS all of the file and directory listing information was stored in a single file, which the system had to search to build a list of the files stored in a particular folder. This worked well with a system with a few hundred kilobytes of storage and perhaps a hundred files, but as the systems grew into megabytes and thousands of files, the performance degraded rapidly.
The solution was to replace MFS's directory structure with one more suitable to larger file systems. HFS replaced the flat table structure with the Catalog File which uses a B-tree structure that could be searched very quickly regardless of size. HFS also redesigned various structures to be able to hold larger numbers, 16-bit integers being replaced by 32-bit almost universally. Oddly, one of the few places this "upsizing" did not take place was the file directory itself, which limits HFS to a total of 65,535 files on each logical disk.
While HFS is a proprietary file system format, it is well-documented; there are usually solutions available to access HFS-formatted disks from most modern operating systems.
Apple introduced HFS out of necessity with its first 20 MB hard disk offering for the Macintosh in September 1985, where it was loaded into RAM from a MFS floppy disk on boot using a patch file. However, HFS was not widely introduced until it was included in the 128K ROM that debuted with the Macintosh Plus in January 1986 along with the larger 800 KB floppy disk drive for the Macintosh that also used HFS. The introduction of HFS was the first advancement by Apple to leave a Macintosh computer model behind: the original 128K Macintosh, which lacked sufficient memory to load the HFS code and was promptly discontinued.
In 1998, Apple introduced HFS Plus to address inefficient allocation of disk space in HFS and to add other improvements. HFS is still supported by current versions of Mac OS, but starting with Mac OS X, an HFS volume cannot be used for booting, and beginning with Mac OS X 10.6, HFS volumes are read-only and cannot be created or updated. In macOS Sierra, Apple's release notes state that "The HFS Standard filesystem is no longer supported." However, read-only HFS Standard support is still present in Sierra and works as it did in previous versions.

Design

A storage volume is inherently divided into logical blocks of 512 bytes. The Hierarchical File System groups these logical blocks into allocation blocks, which can contain one or more logical blocks, depending on the total size of the volume. HFS uses a 16-bit value to address allocation blocks, limiting the number of allocation blocks to 65,535.
Five structures make up an HFS volume:
  1. Logical blocks 0 and 1 of the volume are the Boot Blocks, which contain system startup information. For example, the names of the System and Shell files which are loaded at startup.
  2. Logical block 2 contains the Master Directory Block. This defines a wide variety of data about the volume itself, for example date & time stamps for when the volume was created, the location of the other volume structures such as the Volume Bitmap or the size of logical structures such as allocation blocks. There is also a duplicate of the MDB called the Alternate Master Directory Block located at the opposite end of the volume in the second to last logical block. This is intended mainly for use by disk utilities and is only updated when either the Catalog File or Extents Overflow File grow in size.
  3. Logical block 3 is the starting block of the Volume Bitmap, which keeps track of which allocation blocks are in use and which are free. Each allocation block on the volume is represented by a bit in the map: if the bit is set then the block is in use; if it is clear then the block is free to be used. Since the Volume Bitmap must have a bit to represent each allocation block, its size is determined by the size of the volume itself.
  4. The Extent Overflow File is a B-tree that contains extra extents that record which allocation blocks are allocated to which files, once the initial three extents in the Catalog File are used up. Later versions also added the ability for the Extent Overflow File to store extents that record bad blocks, to prevent the file system from trying to allocate a bad block to a file.
  5. The Catalog File is another B-tree that contains records for all the files and directories stored in the volume. It stores four types of records. Each file consists of a File Thread Record and a File Record while each directory consists of a Directory Thread Record and a Directory Record. Files and directories in the Catalog File are located by their unique Catalog Node ID.
  6. * A File Thread Record stores just the name of the file and the CNID of its parent directory.
  7. * A File Record stores a variety of metadata about the file including its CNID, the size of the file, three timestamps, the first file extents of the data and resource forks and pointers to the file's first data and resource extent records in the Extent Overflow File. The File Record also stores two 16 byte fields that are used by the Finder to store attributes about the file including things like its creator code, type code, the window the file should appear in and its location within the window.
  8. * A Directory Thread Record stores just the name of the directory and the CNID of its parent directory.
  9. * A Directory Record which stores data like the number of files stored within the directory, the CNID of the directory, three timestamps. Like the File Record, the Directory Record also stores two 16 byte fields for use by the Finder. These store things like the width & height and x & y co-ordinates for the window used to display the contents of the directory, the display mode of the window and the position of the window's scroll bar.

    Limitations

The Catalog File, which stores all the file and directory records in a single data structure, results in performance problems when the system allows multitasking, as only one program can write to this structure at a time, meaning that many programs may be waiting in queue due to one program "hogging" the system. It is also a serious reliability concern, as damage to this file can destroy the entire file system. This contrasts with other file systems that store file and directory records in separate structures, where having structure distributed across the disk means that damaging a single directory is generally non-fatal and the data may possibly be re-constructed with data held in the non-damaged portions.
Additionally, the limit of 65,535 allocation blocks resulted in files having a "minimum" size equivalent 1/65,535th the size of the disk. Thus, any given volume, no matter its size, could only store a maximum of 65,535 files. Moreover, any file would be allocated more space than it actually needed, up to the allocation block size. When disks were small, this was of little consequence, because the individual allocation block size was trivial, but as disks started to approach the 1 GB mark, the smallest amount of space that any file could occupy became excessively large, wasting significant amounts of disk space. For example, on a 1 GB disk, the allocation block size under HFS is 16 KB, so even a 1 byte file would take up 16 KB of disk space. This situation was less of a problem for users having large files because these larger files wasted less space as a percentage of their file size. Users with many small files, on the other hand, could lose a copious amount of space due to large allocation block size. This made partitioning disks into smaller logical volumes very appealing for Mac users, because small documents stored on a smaller volume would take up much less space than if they resided on a large partition. The same problem existed in the FAT16 file system.
HFS saves the case of a file that is created or renamed but is case-insensitive in operation.
According to bombich.com, HFS is no longer supported on Catalina and future macOS releases.