File system API
A file system API is an application programming interface through which a utility or user program requests services of a file system. An operating system may provide abstractions for accessing different file systems transparently.
Some file system APIs may also include interfaces for maintenance operations, such as creating or initializing a file system, verifying the file system for integrity, and defragmentation.
Each operating system includes the APIs needed for the file systems it supports. Microsoft Windows has file system APIs for NTFS and several FAT file systems. Linux systems can include APIs for ext2, ext3, ReiserFS, and Btrfs, to name a few.
History
Some early operating systems were capable of handling only tape and disk file systems. These provided the most basic of interfaces with:
- Write, read and position
- Open and close
- Metadata management
- File system maintenance
- Directory management
- Data structure management
- Record management
- Non-data operations
- Sharing
- Restricting access
- Encryption
API overviews
Write, read and position
Writing user data to a file system is provided for use directly by the user program or the run-time library. The run-time library for some programming languages may provide type conversion, formatting and blocking. Some file systems provide identification of records by key and may include re-writing an existing record. This operation is sometimes called PUTX.
Reading user data may include a direction or, in the case of a keyed file system, a specific key. As with writing, run-time libraries may intercede for the user program.
Positioning includes adjusting the location of the next record. This may include skipping forward or backward as well as positioning to the beginning or end of the file.
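As an illustration of the three operations above, the following is a minimal sketch using Python's standard file objects. The fixed 8-byte record length and the function name `demo_read_write_position` are assumptions made for this example only; real file systems and run-time libraries define their own record handling.

```python
import os
import tempfile


def demo_read_write_position():
    """Write three fixed-size records, then reposition and read two back."""
    record_size = 8  # assumed fixed record length, for illustration only
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "w+b") as f:
            for name in (b"alpha", b"beta", b"gamma"):
                f.write(name.ljust(record_size))    # write one padded record
            f.seek(1 * record_size)                 # position to the second record
            second = f.read(record_size).rstrip()   # read it back
            f.seek(0, os.SEEK_END)                  # position to the end of the file
            size = f.tell()
            f.seek(0)                               # position to the beginning
            first = f.read(record_size).rstrip()
        return first, second, size
    finally:
        os.remove(path)
```

Here the run-time library (Python's buffered I/O layer) intercedes between the program and the operating system's own read, write and seek calls.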
Open and close
The open API may be explicitly requested or implicitly invoked upon the issuance of the first operation by a process on an object. It may cause the mounting of removable media, establishing a connection to another host and validating the location and accessibility of the object. It updates system structures to indicate that the object is in use.
Usual requirements for requesting access to a file system object include:
- The object which is to be accessed
- The intended type of operations to be performed after the open
- A password
- A declaration that other processes may access the same object while the opening process is using the object. This may depend on the intent of the other process. In contrast, the declaration may state that no other process may access the object, regardless of the other process's intent.
It must be expected that something may go wrong during the processing of the open.
- The object or intent may be improperly specified.
- The process may be prohibited from accessing the object.
- The file system may be unable to create or update structures required to coordinate activities among users.
- In the case of a new object, there may not be sufficient capacity on the media.
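The failure cases above can be sketched with Python's `os.open`, which passes the object name and the intended type of operation (as flags) to the operating system. The helper names `try_open` and `demo_open` and the returned strings are hypothetical, chosen only for this illustration.

```python
import errno
import os
import tempfile


def try_open(path, flags):
    """Attempt a low-level open and report the outcome as a short string."""
    try:
        fd = os.open(path, flags, 0o600)
    except FileNotFoundError:
        return "not found"        # the object was improperly specified
    except FileExistsError:
        return "already exists"   # exclusive creation was refused
    except PermissionError:
        return "access denied"    # the process is prohibited from access
    except OSError as exc:
        if exc.errno == errno.ENOSPC:
            return "no space"     # insufficient capacity on the media
        raise
    os.close(fd)
    return "ok"


def demo_open():
    """Exercise several intents against one path in a scratch directory."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "example.dat")
        create_new = os.O_WRONLY | os.O_CREAT | os.O_EXCL
        return [
            try_open(path, os.O_RDONLY),  # the object does not exist yet
            try_open(path, create_new),   # create a new object
            try_open(path, create_new),   # a second exclusive create fails
            try_open(path, os.O_RDWR),    # open the existing object for update
        ]
```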
Close may cause dismounting or ejecting removable media and updating library and file system structures to indicate that the object is no longer in use.
The minimal specification to the close references the object. Additionally, some file systems allow specifying a disposition of the object, which may indicate that the object is to be discarded and no longer be part of the file system.
Similar to the open, it must be expected that something may go wrong.
- The specification of the object may be incorrect.
- There may not be sufficient capacity on the media to save any data being buffered or to output a structure indicating that the object was successfully updated.
- A device error may occur on the media where the object is stored while writing buffered data or the completion structure, or while updating metadata related to the object.
- A specification to release the object may be inconsistent with other processes still using the object.
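A close disposition of "discard" can be approximated in Python with `tempfile.NamedTemporaryFile`, whose `delete` argument decides whether the file is removed when it is closed. This is only an analogy under that assumption, not the disposition mechanism of any particular file system; the function name `demo_close_disposition` is hypothetical.

```python
import os
import tempfile


def demo_close_disposition():
    """Close two files: one is kept, the other is discarded on close."""
    with tempfile.TemporaryDirectory() as directory:
        keep = tempfile.NamedTemporaryFile(dir=directory, delete=False)
        discard = tempfile.NamedTemporaryFile(dir=directory, delete=True)
        keep.write(b"data")
        discard.write(b"data")
        keep.close()      # buffered data is flushed; the file remains
        discard.close()   # the file is removed as part of the close
        return os.path.exists(keep.name), os.path.exists(discard.name)
```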
Metadata management
Information about the data in a file is called metadata.
Some of the metadata is maintained by the file system, for example the last-modification date, the location of the beginning of the file, the size of the file and whether the file system backup utility has saved the current version of the file. These items cannot usually be altered by a user program.
Additional metadata supported by some file systems may include the owner of the file, the group to which the file belongs, permissions and/or access control, and whether the file is normally visible when the directory is listed. These items are usually modifiable by file system utilities which may be executed by the owner.
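The two kinds of file system metadata described above can be sketched with Python's `os.stat` and `os.chmod`: the size is maintained by the file system, while the permission bits can be changed by the owner. The function name `demo_metadata` is an assumption for this example, and the exact set of attributes available varies between file systems.

```python
import os
import stat
import tempfile


def demo_metadata():
    """Read file-system-maintained metadata and change an owner-set attribute."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"hello")
        os.close(fd)
        size = os.stat(path).st_size   # maintained by the file system
        os.chmod(path, stat.S_IRUSR)   # the owner restricts the file to read-only
        read_only = not (os.stat(path).st_mode & stat.S_IWUSR)
        return size, read_only
    finally:
        # Restore write permission so that removal works on all platforms.
        os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
        os.remove(path)
```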
Some applications store more metadata. For images, the metadata may include the camera model and settings used to take the photo. For audio files, the metadata may include the album, the recording artist and comments about the recording which may be specific to a particular copy of the file. Documents may include items like checked-by, approved-by, etc.
Directory management
Renaming a file, moving a file from one directory to another and deleting a file are examples of the operations provided by the file system for the management of directories.
Metadata operations such as permitting or restricting access to a directory by various users or groups of users are usually included.
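The directory operations named above (rename, move and delete) map onto Python's `os` module as in the following sketch. The directory and file names, and the function name `demo_directory_ops`, are hypothetical and exist only for this illustration.

```python
import os
import tempfile


def demo_directory_ops():
    """Rename a file, move it to another directory, then delete it."""
    with tempfile.TemporaryDirectory() as root:
        source_dir = os.path.join(root, "inbox")
        target_dir = os.path.join(root, "archive")
        os.mkdir(source_dir)
        os.mkdir(target_dir)

        original = os.path.join(source_dir, "draft.txt")
        with open(original, "w") as f:
            f.write("text")

        renamed = os.path.join(source_dir, "final.txt")
        os.rename(original, renamed)    # rename within the same directory

        moved = os.path.join(target_dir, "final.txt")
        os.replace(renamed, moved)      # move to another directory
        after_move = (sorted(os.listdir(source_dir)),
                      sorted(os.listdir(target_dir)))

        os.remove(moved)                # delete the file
        after_delete = os.listdir(target_dir)
        return after_move, after_delete
```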
Filesystem maintenance
As a filesystem is used, directories, files and records may be added, deleted or modified. This usually causes inefficiencies in the underlying data structures, such as logically sequential blocks distributed across the media in a way that causes excessive repositioning, or partially used or even empty blocks included in linked structures. Incomplete structures or other inconsistencies may be caused by device or media errors, inadequate time between detection of impending loss of power and actual power loss, improper system shutdown or media removal, and on very rare occasions file system coding errors.
Specialized routines in the file system are included to optimize or repair these structures. They are not usually invoked by the user directly but triggered within the file system itself. Internal counters, such as the number of levels of structures or the number of inserted objects, may be compared against thresholds. These routines may cause user access to a specific structure to be suspended, may be started as low-priority asynchronous tasks, or may be deferred to a time of low user activity. Sometimes these routines are invoked or scheduled by the system manager, as in the case of defragmentation.
Kernel-level API
The API is "kernel-level" when the kernel not only provides the interfaces for filesystem developers but is also the space in which the filesystem code resides.
It differs from the old scheme in that the kernel itself uses its own facilities to talk with the filesystem driver and vice versa, as opposed to the kernel handling the filesystem layout and the filesystem directly accessing the hardware.
It is not the cleanest scheme, but it resolves the difficulty of the major rewrites that the old scheme entails.
With modular kernels it allows filesystems to be added like any other kernel module, even third-party ones. With non-modular kernels, however, it requires the kernel to be recompiled with the new filesystem code.
Unixes and Unix-like systems such as Linux have used this modular scheme.
There is a variation of this scheme used in MS-DOS and compatibles to support CD-ROM and network file systems. Instead of adding code to the kernel, as in the old scheme, or using kernel facilities, as in the kernel-based scheme, it traps all calls to a file and determines whether each should be redirected to the kernel's equivalent function or handled by the specific filesystem driver. The filesystem driver then "directly" accesses the disk contents using low-level BIOS functions.
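The trap-and-redirect idea can be illustrated with a toy dispatcher. This is a conceptual sketch only: the `Redirector` class, its methods and the use of a drive letter as the routing key are assumptions made for the example and do not reproduce the actual MS-DOS interface.

```python
class Redirector:
    """Toy dispatcher that routes a file call to a driver or to the kernel."""

    def __init__(self, kernel_handler):
        self._kernel_handler = kernel_handler  # default handler for local disks
        self._drivers = {}                     # drive letter -> driver handler

    def register(self, drive, handler):
        """Let a filesystem driver claim all calls for one drive letter."""
        self._drivers[drive.upper()] = handler

    def open(self, path):
        """Trap the call and decide who handles it."""
        drive = path[:1].upper()
        handler = self._drivers.get(drive, self._kernel_handler)
        return handler(path)


def demo_redirector():
    redirector = Redirector(lambda path: ("kernel", path))
    redirector.register("d", lambda path: ("cdrom driver", path))
    return redirector.open("C:\\A.TXT"), redirector.open("D:\\B.TXT")
```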
Driver-based API
The API is "driver-based" when the kernel provides facilities but the file system code resides entirely outside the kernel.
It is a cleaner scheme because the filesystem code is totally independent. It allows filesystems to be created for closed-source kernels, and filesystems to be added to or removed from the system online.
Examples of this scheme are the respective installable file systems (IFSs) of Windows NT and OS/2.
Mixed kernel-driver-based API
In this API all filesystems are in the kernel, as in kernel-based APIs, but calls to them are automatically trapped by the OS through another API that is driver-based.
This scheme was used in Windows 3.1 to provide a cached FAT filesystem driver in 32-bit protected mode that bypassed the DOS FAT driver in the kernel completely. It was later used in the Windows 9x series for VFAT, the ISO 9660 filesystem driver, network shares and third-party filesystem drivers, as well as for adding the LFN API to the original DOS APIs.
However, that API was not completely documented, and third parties found themselves in a "make-it-by-yourself" scenario even worse than with kernel-based APIs.
User space API
The API is in user space when the filesystem does not directly use kernel facilities but accesses disks using high-level operating system functions and provides functions in a library that a series of utilities use to access the filesystem.
This is useful for handling disk images.
The advantage is that a filesystem can be made portable between operating systems, as the high-level operating system functions it uses can be as common as ANSI C. The disadvantage is that the API is unique to each application that implements one.
Examples of this scheme are the and the .
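A user-space filesystem library of this kind can be sketched as code that parses a disk image using nothing but ordinary stream reads and seeks. The image layout below (a `TOYF` magic number, an entry count, then fixed-size directory entries followed by file data) is entirely hypothetical and was invented for this example, as are the names `build_image` and `ToyImage`.

```python
import io
import struct

MAGIC = b"TOYF"                   # hypothetical image signature
HEADER = struct.Struct("<4sH")    # magic, number of directory entries
ENTRY = struct.Struct("<12sII")   # NUL-padded name, data offset, data length


def build_image(files):
    """Pack a {name: bytes} mapping into the toy image format."""
    data_start = HEADER.size + ENTRY.size * len(files)
    directory, payload = [], b""
    for name, content in files.items():
        offset = data_start + len(payload)
        directory.append(ENTRY.pack(name.encode("ascii"), offset, len(content)))
        payload += content
    return HEADER.pack(MAGIC, len(files)) + b"".join(directory) + payload


class ToyImage:
    """Library-style access to a toy image through any seekable byte stream."""

    def __init__(self, stream):
        self._stream = stream
        magic, count = HEADER.unpack(stream.read(HEADER.size))
        if magic != MAGIC:
            raise ValueError("not a toy image")
        self._entries = {}
        for _ in range(count):
            raw_name, offset, length = ENTRY.unpack(stream.read(ENTRY.size))
            name = raw_name.rstrip(b"\0").decode("ascii")
            self._entries[name] = (offset, length)

    def listdir(self):
        """Return the names stored in the image's single directory."""
        return sorted(self._entries)

    def read_file(self, name):
        """Return the contents of one stored file."""
        offset, length = self._entries[name]
        self._stream.seek(offset)
        return self._stream.read(length)
```

Because `ToyImage` only needs a seekable stream, the same code works on an in-memory `io.BytesIO` buffer or on an image file opened with `open(path, "rb")`, which is what makes this style portable between operating systems.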
Interoperability between file system APIs
As all filesystems need equivalent functions provided by the kernel, it is possible to easily port filesystem code from one API to another, even if they are of different types.
For example, the ext2 driver for OS/2 is simply a wrapper from Linux's VFS to OS/2's IFS around Linux's kernel-based ext2 code, and the HFS driver for OS/2 is a port of hfsutils to OS/2's IFS. There also exists a project that uses a Windows NT IFS driver to make NTFS work under Linux.