Docker Internals

"Docker" is really four separate tools working together. When you run docker run nginx, a chain of processes kicks off — the Docker daemon, containerd, and finally runc — each doing a specific job. Understanding this stack demystifies container crashes, performance issues, and security incidents.

The Docker Stack

docker run nginx
      |
      v
Docker CLI
  - sends API request to dockerd
      |
      v
dockerd
  - receives request
  - pulls image if not present
  - creates container config
  - calls containerd via gRPC
      |
      v
containerd
  - receives container spec
  - manages image snapshots (overlayfs)
  - prepares container bundle (rootfs + config.json)
  - spawns containerd-shim
      |
      v
containerd-shim
  - stays alive even if containerd restarts
  - manages stdin/stdout/stderr
  - reports container exit status
      |
      v
runc (OCI runtime)
  - reads config.json (OCI spec)
  - calls unshare()/clone() for namespaces
  - sets up cgroup
  - applies seccomp profile
  - drops capabilities
  - calls pivot_root
  - executes entrypoint (nginx)
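The first hop in this chain, the CLI talking to dockerd, can be exercised without the docker command. The following is a minimal sketch, assuming dockerd listens on its default Unix socket /var/run/docker.sock and that curl is installed. The helper name dockerd_get and the DOCKER_SOCK variable are made up for this example.

```shell
# Hypothetical helper: send a GET request to the Docker Engine API over the Unix socket.
# DOCKER_SOCK is an illustrative variable; /var/run/docker.sock is the usual default path.
DOCKER_SOCK=${DOCKER_SOCK:-/var/run/docker.sock}

dockerd_get() {
    # $1: API path such as /_ping or /version
    if [ ! -S "$DOCKER_SOCK" ]; then
        echo "no dockerd socket at $DOCKER_SOCK" >&2
        return 1
    fi
    curl --silent --unix-socket "$DOCKER_SOCK" "http://localhost$1"
}

# /_ping is the API's health-check endpoint; /version returns JSON details.
dockerd_get /_ping || echo "dockerd not reachable"
```

Reading the socket usually requires root or membership in the docker group, so a permission error here typically means the daemon is present but not accessible to the current user.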

runc — The Actual Container Creator

What does runc do that makes something a "container"? runc reads an OCI (Open Container Initiative) bundle: a directory with a rootfs/ folder and a config.json file. config.json specifies namespaces, cgroups, capabilities, the seccomp profile, mounts, and the command to run. runc applies all of these in the right order, then executes the process. When launched by containerd, runc exits once the container has started, and the shim watches the process from there.
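    # Create an OCI bundle manually (advanced):
    mkdir -p /tmp/mycontainer/rootfs
    cd /tmp/mycontainer

    # Populate rootfs (an empty rootfs has nothing to execute); busybox is one option:
    docker export "$(docker create busybox)" | tar -C rootfs -xf -

    runc spec    # generate template config.json
    ls /tmp/mycontainer/
    # config.json  rootfs/

    # config.json selects which namespaces to create:
    # "namespaces": [
    #   {"type": "pid"},
    #   {"type": "network"},
    #   {"type": "ipc"},
    #   {"type": "uts"},
    #   {"type": "mount"}
    # ]

    # Run the bundle (as root, from the bundle directory):
    runc run mycontainer

    # Observe runc creating the container:
    strace -f -o /tmp/runc.trace runc run mycontainer
    grep -E "clone|unshare|pivot" /tmp/runc.trace
    # Illustrative output (exact calls and arguments vary by runc version):
    # clone(CLONE_NEWPID|CLONE_NEWNET|CLONE_NEWNS|...)
    # unshare(CLONE_NEWUTS)
    # pivot_root("rootfs", "rootfs/.pivot_root")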
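To check which namespaces a bundle will request before running it, the namespaces array can be pulled out of config.json. The sketch below is a rough approach that assumes a pretty-printed file (as runc spec writes it) and uses only sed and grep; a JSON-aware tool such as jq would be more robust. The helper name list_namespaces and the condensed sample file are invented for this example.

```shell
# Hypothetical helper, not part of runc: print the namespace types a bundle requests.
list_namespaces() {
    # $1: path to a pretty-printed config.json
    # Keep only the lines from "namespaces": [ to the next closing bracket,
    # so that "type" keys under "mounts" are not picked up.
    sed -n '/"namespaces"[[:space:]]*:/,/\]/p' "$1" |
        grep -o '"type"[[:space:]]*:[[:space:]]*"[a-z]*"' |
        sed 's/.*"\([a-z]*\)"$/\1/'
}

# Condensed sample input; real `runc spec` output contains many more fields.
sample=$(mktemp)
cat > "$sample" <<'EOF'
{
  "mounts": [
    {"destination": "/proc", "type": "proc", "source": "proc"}
  ],
  "linux": {
    "namespaces": [
      {"type": "pid"},
      {"type": "network"},
      {"type": "ipc"},
      {"type": "uts"},
      {"type": "mount"}
    ]
  }
}
EOF

list_namespaces "$sample"   # one namespace type per line
rm -f "$sample"
```

Removing an entry from that array (for example the network one) makes the container share that namespace with the process that launched it, which is a quick way to see what each namespace isolates.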

Step-by-Step: docker run nginx

Step 1: Image pull (if not cached)
  - dockerd contacts the Docker Hub registry (registry-1.docker.io)
  - Downloads manifest, then layer blobs (tar.gz)
  - Stores layers in /var/lib/docker/overlay2/

Step 2: Container setup
  - Creates writable layer on top of image layers (overlayfs)
  - Assigns container ID (64-char hex)
  - Creates network namespace + veth pair
  - Connects veth to docker0 bridge
  - Assigns IP from 172.17.0.0/16

Step 3: runc launches process
  - New PID namespace (nginx gets PID 1 inside)
  - New mount namespace (sees only overlayfs root)
  - New UTS namespace (hostname = first 12 characters of the container ID)
  - New IPC namespace
  - cgroup created: /sys/fs/cgroup/.../docker-ID.scope
  - seccomp profile applied
  - Capabilities dropped (kept: 14 of roughly 40 by default)
  - pivot_root to overlayfs mountpoint
  - exec /docker-entrypoint.sh nginx -g 'daemon off;'

Step 4: Container running
  - containerd-shim owns the process
  - dockerd monitors via events
  - Port mapping: iptables DNAT rule created (only when a port is published with -p)
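The writable layer from Step 2 is an ordinary overlayfs mount: read-only image layers go in lowerdir (topmost first), and the container's own changes land in upperdir. The sketch below only builds the option string that such a mount would take. The helper name overlay_opts and all paths are hypothetical, and Docker's real directories under /var/lib/docker/overlay2/ are named differently.

```shell
# Hypothetical helper: build the -o option string for an overlayfs mount.
overlay_opts() {
    # $1: upperdir (writable layer), $2: workdir (overlayfs scratch space),
    # remaining arguments: read-only lower layers, topmost layer first
    upper=$1
    work=$2
    shift 2
    lower=""
    for layer in "$@"; do
        # Join layers with ':'; the leftmost entry sits on top.
        lower="${lower:+$lower:}$layer"
    done
    echo "lowerdir=$lower,upperdir=$upper,workdir=$work"
}

# Illustrative paths only.
opts=$(overlay_opts /tmp/demo/upper /tmp/demo/work /tmp/demo/layer2 /tmp/demo/layer1)
echo "mount -t overlay overlay -o $opts /tmp/demo/merged"
```

The script prints the mount command instead of running it, because mounting requires root and existing directories. The merged directory is what the container later sees as its root after pivot_root.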

Inspecting a Running Container's Internals

    # Get container PID on host
    docker inspect myapp --format '{{.State.Pid}}'
    # 12345

    # See its namespaces:
    ls -la /proc/12345/ns/

    # Enter its namespaces (like docker exec); requires root:
    nsenter -t 12345 --pid --net --mount bash

    # See overlayfs layers:
    docker inspect myapp --format '{{json .GraphDriver.Data}}'
    # UpperDir, LowerDir, MergedDir, WorkDir paths

    # Check its cgroup:
    cat /proc/12345/cgroup
    # 0::/system.slice/docker-abc123.scope

    # View applied seccomp mode:
    grep Seccomp /proc/12345/status
    # Seccomp: 2   (2 = BPF filter active)

    # Check capabilities:
    grep Cap /proc/12345/status
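The Cap* lines in /proc/PID/status are hexadecimal bitmasks, and the Seccomp line is a numeric mode, so both need decoding to be readable. The sketch below does that in plain shell, using the bit numbering from the kernel's linux/capability.h. The helper names decode_caps and seccomp_mode are invented here, and the mask a80425fb is only sample input (it is the value commonly seen as CapEff for a default, non-privileged container). Where libcap is installed, capsh --decode=MASK gives the same information.

```shell
# Capability names in bit order (bit 0 = chown ... bit 40 = checkpoint_restore).
CAP_NAMES="chown dac_override dac_read_search fowner fsetid kill setgid setuid
setpcap linux_immutable net_bind_service net_broadcast net_admin net_raw
ipc_lock ipc_owner sys_module sys_rawio sys_chroot sys_ptrace sys_pacct
sys_admin sys_boot sys_nice sys_resource sys_time sys_tty_config mknod lease
audit_write audit_control setfcap mac_override mac_admin syslog wake_alarm
block_suspend audit_read perfmon bpf checkpoint_restore"

# Hypothetical helper: print one capability name per set bit.
decode_caps() {
    # $1: hex mask as shown in /proc/PID/status, e.g. 00000000a80425fb
    mask=$((0x$1))
    bit=0
    for name in $CAP_NAMES; do
        if [ $(( (mask >> bit) & 1 )) -eq 1 ]; then
            echo "cap_$name"
        fi
        bit=$((bit + 1))
    done
}

# Hypothetical helper: translate the Seccomp field into a mode name.
seccomp_mode() {
    case "$1" in
        0) echo "disabled" ;;
        1) echo "strict" ;;
        2) echo "filter" ;;
        *) echo "unknown" ;;
    esac
}

decode_caps 00000000a80425fb   # sample mask, not read from a live container
seccomp_mode 2
```

For a live container, the sample mask could be replaced with the value from grep CapEff /proc/12345/status. A mask that decodes to cap_sys_admin or a Seccomp mode of 0 usually indicates the container was started with --privileged or with relaxed security options.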
