Behind the Scenes of a File System
You've learned what files and directories look like from the user's perspective. But how does the file system actually store all of this on disk? What data structures does it use to keep track of where files are, which blocks are free, and how directories are organized?
The implementation details vary between file systems, but they all solve the same fundamental problems: mapping file names to disk blocks, managing free space, and ensuring data integrity.
Inodes: The File's Identity
On Unix and Linux file systems, every file has an inode (index node). The inode is a data structure that contains all the metadata about a file β except its name. It stores the file size, owner, permissions, timestamps, and pointers to the data blocks on disk.
When you open a file, the system first looks up the file name in the directory to find its inode number. Then it reads the inode to find where the actual data blocks are. The inode is the file's true identity β the name is just a label.
This is why you can rename a file without moving its data, and why hard links work β multiple names can point to the same inode.
Allocation Methods
The file system needs to decide how to assign disk blocks to files. There are three main approaches:
- Contiguous Allocation β Each file occupies aθΏη» set of blocks. Simple and fast for reading (sequential access), but suffers from external fragmentation and requires knowing the file size in advance.
- Linked Allocation β Each file is a linked list of blocks scattered across the disk. Each block contains a pointer to the next block. No external fragmentation, but slow for random access (you have to follow the chain).
- Indexed Allocation β Each file has an index block that contains pointers to all its data blocks. This allows direct access to any block without following a chain. Used by Unix file systems (inodes contain these indices).
Free Space Management
The file system must track which disk blocks are free and which are in use. Common methods include:
- Bit Vector β A bitmap where each bit represents a block. 0 = free, 1 = occupied. Simple but can consume significant space for large disks.
- Linked List β Free blocks are linked together. The head of the list points to the first free block, which points to the next, and so on. Space-efficient but slow to find a contiguous run.
- Grouping β Store the addresses of N free blocks in the first free block, and a pointer to the next group in the last block. Reduces the number of disk reads needed to find free blocks.
Journaling: Protecting Against Corruption
What happens if the system crashes while you're saving a file? Without protection, the file could be partially written β some blocks updated, others not β leaving it in a corrupted state.
Journaling file systems solve this by keeping a log (journal) of intended changes before actually making them. The process is:
- Write the intended changes to the journal.
- Mark the journal entry as committed.
- Apply the changes to the actual file system.
- Remove the journal entry.
If a crash occurs during step 3, the system can replay the journal to complete the changes. If a crash occurs during step 1, the incomplete journal entry is discarded. Either way, the file system remains consistent. NTFS, ext4, and APFS all use journaling.