Docker Internals
"Docker" is really four separate tools working together. When you run docker run nginx, a chain of processes kicks off — the Docker daemon, containerd, and finally runc — each doing a specific job. Understanding this stack demystifies container crashes, performance issues, and security incidents.
The Docker Stack
docker run nginx
|
Docker CLI sends API request to dockerd
|
dockerd receives request
- pulls image if not present
- creates container config
- calls containerd via gRPC
|
containerd receives container spec
- manages image snapshots (overlayfs)
- prepares container bundle (rootfs + config.json)
- spawns containerd-shim
|
containerd-shim
- stays alive even if containerd restarts
- manages stdin/stdout/stderr
- reports container exit status
|
runc (OCI runtime)
- reads config.json (OCI spec)
- calls unshare()/clone() for namespaces
- sets up cgroup
- applies seccomp profile
- drops capabilities
- calls pivot_root
- executes entrypoint (nginx)
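The chain above can be observed from the host by walking a process's parents through /proc. The following is a minimal sketch, not part of Docker itself: the function name ancestry is hypothetical, and the container name myapp in the usage comment is borrowed from the examples later on this page. It only reads /proc/PID/status, so the demo line runs without Docker.

```shell
#!/bin/sh
# ancestry PID: print "PID NAME" for PID and each of its ancestors,
# stopping when the parent PID is 0 or the status file is unreadable.
ancestry() {
    pid=$1
    while [ -n "$pid" ] && [ "$pid" -gt 0 ] && [ -r "/proc/$pid/status" ]; do
        name=$(sed -n 's/^Name:[[:space:]]*//p' "/proc/$pid/status")
        echo "$pid $name"
        pid=$(sed -n 's/^PPid:[[:space:]]*//p' "/proc/$pid/status")
    done
}

# With a running container (assumed to be named "myapp"):
#   ancestry "$(docker inspect myapp --format '{{.State.Pid}}')"
# Demo on the current shell, which needs no Docker:
ancestry $$
```

On a typical host, the container process's direct parent should be a containerd-shim process rather than dockerd or runc, because runc has already exited and the shim detaches from containerd. Treat that as an expectation to verify on your own system, since process names and parentage depend on the Docker and containerd versions installed.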
runc — The Actual Container Creator
What does runc do that makes something a "container"?
runc reads an OCI (Open Container Initiative) bundle: a directory containing a rootfs/ folder and a config.json file. config.json specifies the namespaces, cgroups, capabilities, seccomp profile, mounts, and the command to run. runc applies these in the required order, then execs the container process. When it is started in detached mode, as containerd does, runc itself exits once the container process is running, and the shim supervises it from there.
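To make the bundle layout concrete, here is a small sketch that checks whether a directory has the two pieces described above. The function name check_bundle is hypothetical, and it assumes the default layout in which the root filesystem lives in rootfs/ next to config.json. It does not validate the contents of config.json.

```shell
#!/bin/sh
# check_bundle DIR: succeed only if DIR contains rootfs/ and config.json.
check_bundle() {
    dir=$1
    [ -d "$dir/rootfs" ] || { echo "missing rootfs/ in $dir" >&2; return 1; }
    [ -f "$dir/config.json" ] || { echo "missing config.json in $dir" >&2; return 1; }
    echo "ok: $dir"
}

# Demo in a throwaway directory:
demo=$(mktemp -d)
mkdir "$demo/rootfs"
check_bundle "$demo" || echo "not a bundle yet"   # config.json is still missing
echo '{}' > "$demo/config.json"                   # placeholder, not a valid OCI config
check_bundle "$demo"
```

A check like this is only a quick sanity test before handing the directory to runc; runc performs its own, much stricter validation of the spec.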
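# Create an OCI bundle manually (advanced):
mkdir -p /tmp/mycontainer/rootfs
cd /tmp/mycontainer
# Populate rootfs from an image (busybox is used here as a small example):
docker export "$(docker create busybox)" | tar -C rootfs -xf -
runc spec # generate template config.json in the current directory
ls /tmp/mycontainer/
# config.json rootfs/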
# config.json selects what namespaces to create:
# "namespaces": [
# {"type": "pid"},
# {"type": "network"},
# {"type": "ipc"},
# {"type": "uts"},
# {"type": "mount"}
# ]
# Run the bundle (as root, from the bundle directory):
runc run mycontainer
# Observe runc creating the container:
strace -f runc run mycontainer 2>&1 | grep -E "clone|unshare|pivot"
# Illustrative output; exact calls and arguments vary by runc version:
# clone(CLONE_NEWPID|CLONE_NEWNET|CLONE_NEWNS|...)
# unshare(CLONE_NEWUTS)
# pivot_root("rootfs", "rootfs/.pivot_root")
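The namespace list in config.json can also be pulled out with standard tools. The sketch below writes a trimmed, hypothetical config.json fragment (only the linux.namespaces section, with the same five types listed above) to a temporary file and extracts the type values with sed. The function name list_namespaces is an assumption for illustration, and the parsing relies on each "type" entry sitting on its own line, which is how pretty-printed specs are usually laid out.

```shell
#!/bin/sh
# Write a trimmed sample config.json (namespaces section only).
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
{
  "linux": {
    "namespaces": [
      {"type": "pid"},
      {"type": "network"},
      {"type": "ipc"},
      {"type": "uts"},
      {"type": "mount"}
    ]
  }
}
EOF

# list_namespaces FILE: print one namespace type per line.
list_namespaces() {
    # First sed keeps the lines from "namespaces": up to the closing bracket;
    # the second extracts the value of each "type" key.
    sed -n '/"namespaces"[[:space:]]*:/,/]/p' "$1" |
        sed -n 's/.*"type"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

list_namespaces "$cfg"
```

If jq is installed, jq -r '.linux.namespaces[].type' config.json is a more robust way to get the same list, since it parses the JSON properly instead of matching lines.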
Step-by-Step: docker run nginx
Step 1: Image pull (if not cached)
dockerd contacts the registry (registry-1.docker.io for Docker Hub images)
Downloads manifest, then layer blobs (tar.gz)
Stores layers in /var/lib/docker/overlay2/
Step 2: Container setup
Creates writable layer on top of image layers (overlayfs)
Assigns container ID (64-char hex)
Creates network namespace + veth pair
Connects veth to docker0 bridge
Assigns IP from 172.17.0.0/16
Step 3: runc launches process
New PID namespace (nginx gets PID 1 inside)
New mount namespace (sees only overlayfs root)
New UTS namespace (hostname defaults to the first 12 characters of the container ID)
New IPC namespace
cgroup created: /sys/fs/cgroup/.../docker-ID.scope
seccomp profile applied
Capabilities dropped (14 kept by default, out of about 40)
pivot_root to overlayfs mountpoint
exec /docker-entrypoint.sh nginx -g 'daemon off;'
Step 4: Container running
containerd-shim owns the process
dockerd monitors via events
Port mapping: iptables DNAT rule created (only when a port is published with -p)
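The namespaces listed in Step 3 can be confirmed from the host by comparing the links under /proc/PID/ns for two processes: identical link targets mean the processes share that namespace. This is a minimal sketch under that assumption; the function name ns_compare is hypothetical, and comparing against PID 1 on a real host generally requires root.

```shell
#!/bin/sh
# ns_compare A B: for each namespace type, report whether PIDs A and B share it.
ns_compare() {
    for ns in pid net mnt uts ipc; do
        a=$(readlink "/proc/$1/ns/$ns" 2>/dev/null) || { echo "$ns unreadable"; continue; }
        b=$(readlink "/proc/$2/ns/$ns" 2>/dev/null) || { echo "$ns unreadable"; continue; }
        if [ "$a" = "$b" ]; then
            echo "$ns shared"
        else
            echo "$ns separate"
        fi
    done
}

# As root, with CPID set to the container's host PID:
#   ns_compare "$CPID" 1
# Demo: a process compared with itself shares every namespace.
ns_compare $$ $$
```

For a container started with default options, all five lines would be expected to read "separate" when compared with host PID 1. Options such as --network host or --pid host change that, so check the result against how the container was actually started.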
Inspecting a Running Container's Internals
# Get container PID on host
docker inspect myapp --format '{{.State.Pid}}'
# 12345
# See its namespaces:
ls -la /proc/12345/ns/
# Enter its namespaces (needs root; similar to docker exec, but without its cgroup, capability, and seccomp settings):
nsenter -t 12345 --pid --net --mount bash
# See overlayfs layers (json makes the paths easier to read):
docker inspect myapp --format '{{json .GraphDriver.Data}}'
# UpperDir, LowerDir, MergedDir, WorkDir paths
# Check its cgroup:
cat /proc/12345/cgroup
# 0::/system.slice/docker-abc123.scope
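# Check the seccomp mode (this shows whether a filter is active, not the profile's contents):
grep '^Seccomp:' /proc/12345/status
# Seccomp: 2 (0 = disabled, 1 = strict, 2 = BPF filter active)
# Check capabilities (hex bitmasks: CapInh, CapPrm, CapEff, CapBnd, CapAmb):
grep Cap /proc/12345/status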
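The Cap lines in /proc/PID/status are hexadecimal bitmasks, where bit N corresponds to capability number N in the kernel's capability list. The sketch below decodes such a mask into names. The function name decode_caps is hypothetical, and the sample mask 00000000a80425fb is an assumption: it is the value commonly seen as CapEff for a root process in a container started with default options, but your container may show a different value.

```shell
#!/bin/sh
# decode_caps HEX: print the name of every capability bit set in HEX.
# Names are listed in kernel order, so position in the list equals bit number.
decode_caps() {
    mask=$(( 0x$1 ))
    bit=0
    for name in CHOWN DAC_OVERRIDE DAC_READ_SEARCH FOWNER FSETID KILL SETGID \
        SETUID SETPCAP LINUX_IMMUTABLE NET_BIND_SERVICE NET_BROADCAST NET_ADMIN \
        NET_RAW IPC_LOCK IPC_OWNER SYS_MODULE SYS_RAWIO SYS_CHROOT SYS_PTRACE \
        SYS_PACCT SYS_ADMIN SYS_BOOT SYS_NICE SYS_RESOURCE SYS_TIME \
        SYS_TTY_CONFIG MKNOD LEASE AUDIT_WRITE AUDIT_CONTROL SETFCAP \
        MAC_OVERRIDE MAC_ADMIN SYSLOG WAKE_ALARM BLOCK_SUSPEND AUDIT_READ \
        PERFMON BPF CHECKPOINT_RESTORE; do
        if [ $(( ($mask >> $bit) & 1 )) -eq 1 ]; then
            echo "CAP_$name"
        fi
        bit=$(( bit + 1 ))
    done
}

# Decode the sample mask (assumed default; substitute your own CapEff value):
decode_caps 00000000a80425fb
```

For the sample mask this prints 14 names, including CAP_CHOWN, CAP_NET_BIND_SERVICE, and CAP_SYS_CHROOT, while powerful capabilities such as CAP_SYS_ADMIN are absent. Where the libcap tools are installed, capsh --decode=00000000a80425fb gives the same information without a hand-maintained name table.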