Virtual Filesystem (VFS)
VFS is the reason you can run open("/proc/cpuinfo"), open("/home/user/file.txt"), and open("/mnt/nfs/data") with the exact same system call — even though one is a kernel pseudo-file, one is on ext4, and one is on a network drive. VFS is the abstraction layer that makes all of this possible.
The Four Core VFS Objects
What data structures does the kernel use to represent files?
VFS defines four key objects that every filesystem driver must implement. Think of them as a contract: any filesystem that fills in these objects can be mounted on Linux.
| Object | What it represents | Key fields |
|---|---|---|
| superblock | A mounted filesystem | block size, inode count, ops table |
| inode | A file or directory (metadata) | size, permissions, timestamps, data block pointers |
| dentry | A directory entry (name → inode mapping) | filename, parent dentry, inode pointer, LRU cache |
| file | An open file description | current offset, flags, f_op pointer, inode pointer |
How open("/home/user/file.txt") Works
What actually happens inside the kernel when I call open()?
The kernel walks the path component by component using the dentry cache (dcache). Each component is looked up in the parent directory's dentry. If found in cache, no disk read needed. If not, the filesystem driver reads the directory from disk to find the inode number.
// Path walk for "/home/user/file.txt"
// Step 1: start at root dentry (inode 2)
// Step 2: look up "home" in root dentry → inode 12
// Step 3: look up "user" in inode 12 → inode 347
// Step 4: look up "file.txt" in inode 347 → inode 8921
// Step 5: check permissions on inode 8921
// Step 6: allocate a struct file, point it at inode 8921
// Step 7: return file descriptor (index into process fd table)
// The dentry cache accelerates this:
// dcache stores (parent_dentry, name) → child_dentry mappings
// Hot paths are pure cache lookups — no disk I/O
cat /proc/sys/fs/dentry-state
// nr_dentry nr_unused age_limit want_pages nr_negative dummy
// 847392    612847    45        0          0            0
file_operations — The VFS Contract
How does VFS call the right code for ext4 vs NFS vs /proc?
Every open file carries a pointer to a file_operations struct — a table of function pointers supplied by the filesystem (the inode hands it over at open time). When your code calls read(), VFS calls file->f_op->read(). For ext4, that function reads from disk. For /proc, it calls a kernel function that generates data on the fly. Same interface, completely different implementation.
// Simplified file_operations struct (from linux/fs.h)
struct file_operations {
ssize_t (*read)(struct file *, char __user *, size_t, loff_t *);
ssize_t (*write)(struct file *, const char __user *, size_t, loff_t *);
int (*open)(struct inode *, struct file *);
int (*release)(struct inode *, struct file *);
loff_t (*llseek)(struct file *, loff_t, int);
// ... more operations
};
// ext4 fills this in with ext4_file_operations
// proc provides its own table (modern kernels use struct proc_ops here)
// NFS fills this in with nfs_file_operations
// Your code always calls the same read() syscall — VFS dispatches
Mount Points and the Filesystem Tree
How does Linux merge multiple filesystems into one unified tree?
When you mount a filesystem, the kernel creates a vfsmount struct that links the new filesystem's root dentry to a dentry in the existing tree. Path resolution transparently crosses mount points — when the walker hits a dentry that is a mount point, it follows the mount into the new filesystem.
# View all active mount points:
cat /proc/mounts
# sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
# proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
# /dev/sda1 / ext4 rw,relatime 0 0
# tmpfs /tmp tmpfs rw,nosuid,nodev 0 0
# overlay / overlay rw,relatime 0 0   (example: the root mount inside a container)
# Mount a filesystem:
mount /dev/sdb1 /mnt/data # attach ext4 at /mnt/data
mount -t tmpfs tmpfs /tmp # in-memory filesystem
mount --bind /src /dst # bind mount: same files, different path
# Namespace-aware mount info:
cat /proc/self/mountinfo # more detailed than /proc/mounts
Inside an Inode
# Inspect inode details:
stat /etc/passwd
# File: /etc/passwd
# Size: 2847 Blocks: 8 IO Block: 4096 regular file
# Device: 8,1 Inode: 917514 Links: 1
# Access: (0644/-rw-r--r--) Uid: (0/root) Gid: (0/root)
# Access: 2024-01-15 09:23:11
# Modify: 2024-01-10 14:30:02
# Change: 2024-01-10 14:30:02
# Key fields stored in inode (NOT the filename — that's in the dentry):
# - file type (regular, directory, symlink, device, socket, pipe)
# - permissions (rwxrwxrwx + SUID/SGID/sticky)
# - owner UID and GID
# - timestamps: atime (last access), mtime (last content change), ctime (last metadata change)
# - link count (how many directory entries point to this inode)
# - size in bytes
# - block pointers (or inline data for small files in ext4)
# Find all files sharing an inode (hard links):
find / -xdev -inum 917514 2>/dev/null  # -xdev: inode numbers are only unique per filesystem
Dentry Cache (dcache) Performance
Why is repeated file access so much faster than the first access?
The dcache keeps recently-used dentries in memory. A path lookup that hits the dcache avoids all disk I/O — it's pure in-memory pointer chasing. The cache is LRU-evicted under memory pressure. On a busy system with lots of file activity, dcache hit rates are typically 95%+.
# Monitor dcache stats:
cat /proc/sys/fs/dentry-state
# 847392 612847 45 0 0 0
# nr_dentry nr_unused age_limit want_pages nr_negative dummy
# Inode cache stats:
cat /proc/sys/fs/inode-state
# 234891 45823 0 0 0 0 0
# nr_inodes nr_unused (remaining fields are unused)
# Force drop caches (for benchmarking — do not do in production):
echo 2 > /proc/sys/vm/drop_caches # drop dentries + inodes
echo 3 > /proc/sys/vm/drop_caches # drop page cache too