Out of Memory — What Linux Does
Your server's RAM is full. Swap is full. A process is trying to allocate more memory. What happens? Linux doesn't just crash — it tries several things before reaching for the OOM killer.
The Memory Exhaustion Sequence
1. Memory allocation requested (malloc, page fault, etc.)
↓
2. Free pages available? → Allocate immediately. Done.
↓ (no free pages)
3. Try to reclaim memory:
a. Shrink page cache (drop clean file-backed pages)
b. Shrink slab caches (dentries, inodes)
c. Try to write dirty pages to disk
d. Swap out anonymous pages (if swap available)
↓ (reclaim not enough)
4. OOM killer activated
a. Calculate "badness score" for each process
b. Kill the worst offender
c. Retry allocation
↓ (still can't allocate after multiple attempts)
5. Kernel panic (if panic_on_oom=1) or hang
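Whether step 5 is a panic or another pass of the killer depends on kernel tunables. A quick way to see how a machine is configured (these sysctls exist on any modern kernel; the values noted in the comments are the usual distribution defaults):
# Inspect the OOM-related kernel tunables
sysctl vm.panic_on_oom              # 0 = run the OOM killer (default), 1 = panic instead
sysctl vm.oom_kill_allocating_task  # 1 = kill the allocating task, skip the badness scan
sysctl vm.overcommit_memory         # 0 = heuristic (default), 1 = always allow, 2 = strict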
Memory Reclaim — First Defense
What is the page cache and why does the kernel shrink it first?
The page cache holds cached copies of files from disk. If a process needs more RAM, the kernel can drop clean (unmodified) file pages — if they're needed again, they'll just be re-read from disk. This is "free" reclaim with no data loss.
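You can see the page cache's size with free, where it shows up in the buff/cache column. As a demonstration only (dropping caches throws away cache warmth and is not a tuning step), asking the kernel to drop clean caches makes the reclaim visible:
# How much RAM is currently holding cached file data
free -h
# "buff/cache" is mostly reclaimable; "available" estimates how much
# memory could be handed out without swapping.

# Demonstration: drop clean caches and watch buff/cache shrink
sync                                        # write out dirty pages first
echo 3 | sudo tee /proc/sys/vm/drop_caches
free -h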
Two reclaim paths:
- Background reclaim (kswapd): A kernel thread that proactively frees pages to keep a buffer of free memory. You usually don't notice this happening.
- Direct reclaim: The process that needs memory waits while the kernel synchronously reclaims. This causes latency spikes. If you see applications stalling briefly, direct reclaim is likely happening.
# Monitor reclaim activity
vmstat 1
# si = memory swapped in from disk per second, so = memory swapped out
# If so > 0 consistently, swap is being used heavily
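To tell background reclaim apart from direct reclaim, the counters in /proc/vmstat are more precise than vmstat's summary (exact counter names vary a little between kernel versions, so the grep below matches on prefixes):
# Background vs. direct reclaim counters
grep -E 'pgscan_kswapd|pgscan_direct|allocstall' /proc/vmstat
# pgscan_kswapd rising  = kswapd reclaiming in the background (normal)
# pgscan_direct rising  = processes reclaiming synchronously (latency spikes)
# allocstall* rising    = allocations stalled waiting on direct reclaim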
OOM Trigger Conditions
Exactly when does the OOM killer activate?
When the kernel cannot satisfy an allocation request after genuine attempts at reclaim and swap. It's the absolute last resort — not triggered just because memory is "high." The kernel is quite persistent about reclaiming before giving up.
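That persistence is visible before the killer ever fires. On kernels 4.20 and newer built with PSI support, the pressure-stall interface reports how much time tasks spend blocked waiting on reclaim:
# Memory pressure-stall information (kernel >= 4.20 with PSI enabled)
cat /proc/pressure/memory
# "some" = share of time at least one task was stalled on memory
# "full" = share of time all non-idle tasks were stalled; sustained
#          non-zero "full" values mean OOM territory is close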
# Check if OOM killer has fired recently
dmesg | grep -i "out of memory"
dmesg | grep -i "killed process"
# Output looks like:
# Out of memory: Kill process 12345 (java) score 872 or sacrifice child
# Killed process 12345 (java) total-vm:8428756kB, rss:7284256kB
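The badness score in that log line is also visible, and adjustable, per process through procfs. The PID below is the placeholder from the example output, not a real process on your system:
# Inspect a process's current badness score
cat /proc/12345/oom_score

# Bias the killer away from a critical process
# (oom_score_adj ranges from -1000, never kill, to +1000, kill first)
echo -500 | sudo tee /proc/12345/oom_score_adj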
Preventing OOM
- Set proper memory limits per process/container with cgroups
- Add swap space as a safety buffer
- Monitor with sar -r, Prometheus node_exporter, or CloudWatch
- Alert at 80% memory usage, not 100%
- Consider vm.overcommit_memory=2 on servers where you need predictability (sketches of several of these follow below)
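Minimal sketches of three of these, assuming a systemd distro on cgroup v2, a hypothetical unit called app.service, and placeholder sizes rather than recommendations:
# 1. Cap a service's memory with cgroups (systemd, cgroup v2)
sudo systemctl set-property app.service MemoryMax=2G

# 2. Add a 4 GiB swap file as a safety buffer (fallocate works on ext4;
#    on other filesystems dd if=/dev/zero is the safer route)
sudo fallocate -l 4G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile

# 3. Strict overcommit: allocations fail instead of waking the OOM killer
sudo sysctl -w vm.overcommit_memory=2
sudo sysctl -w vm.overcommit_ratio=80   # commit limit = swap + 80% of RAM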