Out of Memory — What Linux Does
Your server's RAM is full. Swap is full. A process is trying to allocate more memory. What happens? Linux doesn't just crash — it tries several things before reaching for the OOM killer.
The Memory Exhaustion Sequence
1. Memory allocation requested (malloc, page fault, etc.)
↓
2. Free pages available? → Allocate immediately. Done.
↓ (no free pages)
3. Try to reclaim memory:
a. Shrink page cache (drop clean file-backed pages)
b. Shrink slab caches (dentries, inodes)
c. Try to write dirty pages to disk
d. Swap out anonymous pages (if swap available)
↓ (reclaim not enough)
4. OOM killer activated
a. Calculate "badness score" for each process
b. Kill the worst offender
c. Retry allocation
↓ (still can't allocate after multiple attempts)
5. Kernel panic (if panic_on_oom=1) or hang
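Whether step 5 is a panic or another pass of the killer depends on kernel tunables. A quick way to see how a machine is configured (these sysctls exist on any modern kernel; the values noted in the comments are the usual distribution defaults):
# Inspect the OOM-related kernel tunables
sysctl vm.panic_on_oom              # 0 = run the OOM killer (default), 1 = panic instead
sysctl vm.oom_kill_allocating_task  # 1 = kill the allocating task, skip the badness scan
sysctl vm.overcommit_memory         # 0 = heuristic (default), 1 = always allow, 2 = strict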
Memory Reclaim — First Defense
What is the page cache and why does the kernel shrink it first?
The page cache holds cached copies of files from disk. If a process needs more RAM, the kernel can drop clean (unmodified) file pages — if they're needed again, they'll just be re-read from disk. This is "free" reclaim with no data loss.
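You can see the page cache's size with free, where it shows up in the buff/cache column. As a demonstration only (dropping caches throws away cache warmth and is not a tuning step), asking the kernel to drop clean caches makes the reclaim visible:
# How much RAM is currently holding cached file data
free -h
# "buff/cache" is mostly reclaimable; "available" estimates how much
# memory could be handed out without swapping.

# Demonstration: drop clean caches and watch buff/cache shrink
sync                                        # write out dirty pages first
echo 3 | sudo tee /proc/sys/vm/drop_caches
free -h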
Two reclaim paths:
- Background reclaim (kswapd): A kernel thread that proactively frees pages to keep a buffer of free memory. You usually don't notice this happening.
- Direct reclaim: The process that needs memory waits while the kernel synchronously reclaims. This causes latency spikes. If you see applications stalling briefly, direct reclaim is likely happening.
# Monitor reclaim activity
vmstat 1
# si = memory swapped in from disk per second, so = memory swapped out
# If so > 0 consistently, swap is being used heavily
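To tell background reclaim apart from direct reclaim, the counters in /proc/vmstat are more precise than vmstat's summary (exact counter names vary a little between kernel versions, so the grep below matches on prefixes):
# Background vs. direct reclaim counters
grep -E 'pgscan_kswapd|pgscan_direct|allocstall' /proc/vmstat
# pgscan_kswapd rising  = kswapd reclaiming in the background (normal)
# pgscan_direct rising  = processes reclaiming synchronously (latency spikes)
# allocstall* rising    = allocations stalled waiting on direct reclaim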
OOM Trigger Conditions
Exactly when does the OOM killer activate?
When the kernel cannot satisfy an allocation request after genuine attempts at reclaim and swap. It's the absolute last resort — not triggered just because memory is "high." The kernel is quite persistent about reclaiming before giving up.
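That persistence is visible before the killer ever fires. On kernels 4.20 and newer built with PSI support, the pressure-stall interface reports how much time tasks spend blocked waiting on reclaim:
# Memory pressure-stall information (kernel >= 4.20 with PSI enabled)
cat /proc/pressure/memory
# "some" = share of time at least one task was stalled on memory
# "full" = share of time all non-idle tasks were stalled; sustained
#          non-zero "full" values mean OOM territory is close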
# Check if OOM killer has fired recently
dmesg | grep -i "out of memory"
dmesg | grep -i "killed process"
# Output looks like:
# Out of memory: Kill process 12345 (java) score 872 or sacrifice child
# Killed process 12345 (java) total-vm:8428756kB, rss:7284256kB
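The badness score in that log line is also visible, and adjustable, per process through procfs. The PID below is the placeholder from the example output, not a real process on your system:
# Inspect a process's current badness score
cat /proc/12345/oom_score

# Bias the killer away from a critical process
# (oom_score_adj ranges from -1000, never kill, to +1000, kill first)
echo -500 | sudo tee /proc/12345/oom_score_adj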
Preventing OOM
- Set proper memory limits per process/container with cgroups
- Add swap space as a safety buffer
- Monitor with sar -r, Prometheus node_exporter, or CloudWatch
- Alert at 80% memory usage, not 100%
- Consider vm.overcommit_memory=2 on servers where you need predictability (sketches of several of these follow below)
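Minimal sketches of three of these, assuming a systemd distro on cgroup v2, a hypothetical unit called app.service, and placeholder sizes rather than recommendations:
# 1. Cap a service's memory with cgroups (systemd, cgroup v2)
sudo systemctl set-property app.service MemoryMax=2G

# 2. Add a 4 GiB swap file as a safety buffer (fallocate works on ext4;
#    on other filesystems dd if=/dev/zero is the safer route)
sudo fallocate -l 4G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile

# 3. Strict overcommit: allocations fail instead of waking the OOM killer
sudo sysctl -w vm.overcommit_memory=2
sudo sysctl -w vm.overcommit_ratio=80   # commit limit = swap + 80% of RAM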