top & htop
Everyone knows how to run top. Far fewer people know what they're actually looking at. Load average, us/sy/id/wa CPU percentages, RES vs VIRT memory, the difference between S and D process states — misreading these leads to wrong conclusions about performance problems.
Load Average — What It Really Means
What is load average and when is it "too high"?
Load average is an exponentially damped average, over roughly the last 1, 5, and 15 minutes, of the number of tasks that are runnable (running or waiting for a CPU) plus tasks in uninterruptible sleep (usually waiting for disk I/O). On a 4-core system, a load of 4.0 made up of runnable tasks means every core is busy and nothing is queued; a load of 8.0 means roughly twice as much demand as capacity. But high load caused by D-state processes doesn't mean the CPU is overloaded: the CPU can sit mostly idle while those tasks wait for disk.
top - 15:42:03 load average: 2.15, 1.87, 1.92
# 1min 5min 15min
# Interpreting load:
# Cores: 4
# Load 2.15 = 2.15 tasks runnable or in uninterruptible sleep, on average
# Load/Cores = 2.15/4 = 0.54 (about 54% of CPU capacity, if the load is all CPU-bound)
# Load consistently > cores = CPU saturated
# Load rising over time = worsening problem
# Load high but CPU idle = I/O bound (tasks in D state, usually waiting on disk, sometimes on a hung network filesystem)
# Quick load check without opening top:
uptime
# 15:42:03 up 30 days, load average: 2.15, 1.87, 1.92
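The load-per-core arithmetic above can be scripted. This is a minimal sketch, assuming a POSIX shell with awk; the function name load_per_core is made up for this example, and on a live Linux system its two arguments would come from /proc/loadavg and nproc.

```shell
# load_per_core LOAD CORES
# Prints load divided by core count, rounded to two decimals.
# (Hypothetical helper name; not part of top or uptime.)
load_per_core() {
  awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# Using the numbers from the sample output above:
load_per_core 2.15 4

# On a live Linux system (assumes /proc/loadavg and nproc are available):
#   load_per_core "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"
```

A result near or above 1.00 points to CPU saturation only if the tasks are runnable; check wa and D-state processes before drawing that conclusion.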
CPU States — What Each Percentage Means
%Cpu(s): 12.5 us, 3.2 sy, 0.0 ni, 80.1 id, 4.1 wa, 0.0 hi, 0.1 si, 0.0 st
# us (user) = CPU time in user-space code (your app)
# sy (system) = CPU time in kernel code (syscalls, kernel threads); interrupts are counted separately under hi and si
# ni (nice) = user space with positive nice value (low priority)
# id (idle) = CPU idle, nothing to do
# wa (iowait) = CPU idle but waiting for I/O to complete
# hi (hardware interrupt) = time handling hardware IRQs
# si (software interrupt) = time in softirqs (network, timers)
# st (steal) = CPU stolen by hypervisor (in VMs)
# Diagnosis patterns:
# High us + low id: CPU bound (your code uses too much CPU)
# High sy + low us: too many syscalls or kernel work
# High wa: CPUs idle while tasks wait on block I/O (usually disk); waiting on a network socket does not count as wa
# High st: VM is over-committed on host
# us + sy > 80%: system under CPU pressure
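The diagnosis patterns above can be turned into a rough first-pass check. This is a sketch under stated assumptions: classify_cpu is a hypothetical helper, the 80% rule comes from the list above, and the cutoffs for wa (20) and st (10) are illustrative values chosen for this example, not established limits.

```shell
# classify_cpu: read one top-style "%Cpu(s):" line on stdin, print a rough diagnosis.
# Hypothetical helper. The wa > 20 and st > 10 cutoffs are illustrative assumptions;
# only the "us + sy > 80" rule comes from the text above.
classify_cpu() {
  awk '
    {
      sub(/^%Cpu\(s\):/, "")            # drop the label
      n = split($0, parts, ",")          # "12.5 us", " 3.2 sy", ...
      for (i = 1; i <= n; i++) {
        split(parts[i], kv, " ")         # kv[1] = value, kv[2] = field name
        v[kv[2]] = kv[1] + 0
      }
      if (v["st"] > 10)                  print "steal: host may be over-committed"
      else if (v["wa"] > 20)             print "iowait: possible I/O bottleneck"
      else if (v["us"] + v["sy"] > 80)   print "cpu pressure"
      else                               print "ok"
    }'
}

# Using the sample line from above:
echo '%Cpu(s): 12.5 us, 3.2 sy, 0.0 ni, 80.1 id, 4.1 wa, 0.0 hi, 0.1 si, 0.0 st' | classify_cpu

# Live (the first batch sample may reflect averages since boot, so take the second):
#   top -b -n 2 -d 1 | grep '^%Cpu' | tail -n 1 | classify_cpu
```

The checks run in order of how misleading each condition is when missed: steal and iowait both leave us and sy looking low even though the machine is struggling.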
Memory Columns — VIRT vs RES vs SHR
PID VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1234 1.2g 345m 89m S 2.3 4.3 12:34 nginx
# VIRT (Virtual memory):
# Total virtual address space claimed by the process.
# Includes code, data, shared libs, mmapped files, and memory that was reserved but never touched.
# Usually much larger than actual RAM used. Mostly meaningless.
# RES (Resident Set Size):
# Physical RAM actually in use RIGHT NOW.
# This is what you care about for "how much RAM does this process use"
# Includes shared libraries (counted even if shared with other processes)
# SHR (Shared memory):
# Portion of RES backed by files or shared memory, so it may be shared with other processes.
# Shared libraries, shared mmap files.
# If 50 processes share libc, each one's SHR includes libc's resident pages.
# Approximate "private" RAM = RES - SHR
# Summing RES across processes overcounts: shared pages appear in every process's RES but occupy RAM only once.
# Better memory view:
cat /proc/1234/smaps_rollup
# Rss: 354432 kB (same as RES)
# Pss: 134567 kB (proportional share — SHR divided among users)
# Private_Clean: 45678 kB
# Private_Dirty: 123456 kB ← this process exclusively owns this
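The exclusively owned memory can be read straight from the Private_* lines instead of estimating it as RES - SHR. This is a minimal sketch, assuming a POSIX shell with awk; private_kb is a hypothetical helper name, and PID 1234 is the example process from above.

```shell
# private_kb: read smaps_rollup-format text on stdin and print the sum of
# Private_Clean and Private_Dirty in kB. (Hypothetical helper name.)
private_kb() {
  awk '/^Private_(Clean|Dirty):/ { sum += $2 } END { print sum + 0 }'
}

# Using the two Private_* values from the sample above:
printf 'Private_Clean: 45678 kB\nPrivate_Dirty: 123456 kB\n' | private_kb

# On a live Linux system (reading another user's process usually needs root):
#   private_kb < /proc/1234/smaps_rollup
```

This figure is roughly the RAM that would be freed if the process exited, which RES alone cannot tell you.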
htop — What Makes It Better
# Install: apt install htop
# htop advantages over top:
# - Per-CPU bar graphs (see which CPUs are busy)
# - Better color coding (user/system/IO wait each different color)
# - Easy sorting: click column header or F6
# - Tree view: F5 (shows parent-child relationships)
# - Kill without leaving: select process, F9
# - Search: F3
# - Scroll horizontally: see full command
# Key htop columns:
# PID = process ID
# USER = owner
# PRI/NI = priority/nice (-20 high priority, 19 low)
# VIRT/RES/SHR = same as top
# S = state (R=running, S=sleep, D=uninterruptible, Z=zombie)
# CPU% = CPU usage (can exceed 100% on multi-core)
# MEM% = % of total RAM
# TIME+ = total CPU time used
# Interactive keys:
# u = filter by user
# k = kill process
# F7 / F8 = renice (raise / lower priority; r is the renice key in top, not htop)
# s = strace the process
# l = lsof the process (list open files)
# I = invert sort order
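The S column described above can also be summarized outside htop, which helps when checking whether a high load comes from D-state processes. This is a sketch, assuming a POSIX shell with awk and sort; count_states is a hypothetical helper that expects one process state string per line, as printed by ps -eo stat= on Linux.

```shell
# count_states: read one process state string per line on stdin and print
# how many processes are in each state, keyed by the first letter
# (R, S, D, Z, ...). Hypothetical helper name.
count_states() {
  awk '{ n[substr($1, 1, 1)]++ } END { for (s in n) print s, n[s] }' | LC_ALL=C sort
}

# Sample input: state strings may carry modifiers such as "Ss" or "R+".
printf 'Ss\nS\nR+\nD\nS\n' | count_states

# On a live system:
#   ps -eo stat= | count_states
```

A D count that stays above zero across several runs suggests the load is I/O-bound rather than CPU-bound.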