Real-Time Linux

Standard Linux is not real-time. A process that needs a 1 ms response might wait 50 ms because an interrupt handler or the memory allocator ran first. Real-time Linux (PREEMPT_RT) makes the kernel preemptible almost everywhere, so a high-priority task can interrupt nearly anything the kernel is doing and typically starts running within tens of microseconds.

Real-Time vs Standard Linux

What does "real-time" actually mean? Real-time means deterministic — a task completes within a guaranteed deadline. It does NOT mean fast. A real-time system might respond in 500 microseconds; a non-real-time system might usually respond in 100 microseconds but occasionally spike to 50ms. The spike is what kills real-time applications: a robotic arm mis-times, audio clicks, a control loop misses a sample.
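To make the difference between "fast" and "deterministic" concrete, here is a minimal Python sketch. The latency samples, the 1 ms deadline, and the helper name `summarize` are all invented for illustration; nothing here is measured data.

```python
from statistics import mean

def summarize(latencies_us, deadline_us):
    """Return (average, worst case, number of deadline misses) for latency samples."""
    misses = sum(1 for value in latencies_us if value > deadline_us)
    return mean(latencies_us), max(latencies_us), misses

# Invented samples in microseconds (assumption, not measurements).
steady = [500] * 1000                 # slower on average, but never late
spiky = [100] * 999 + [50_000]        # usually fast, with one 50 ms spike

DEADLINE_US = 1_000                   # assumed 1 ms deadline

steady_avg, steady_max, steady_misses = summarize(steady, DEADLINE_US)
spiky_avg, spiky_max, spiky_misses = summarize(spiky, DEADLINE_US)

print(f"steady: avg={steady_avg} max={steady_max} misses={steady_misses}")
print(f"spiky:  avg={spiky_avg} max={spiky_max} misses={spiky_misses}")
```

The spiky system wins on average latency but still misses the deadline once, which is the only number a real-time application cares about.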
Standard Linux latency sources:
- Interrupt handlers: run with preemption disabled
- Spinlocks: held with preemption disabled (the holder cannot be preempted, so a waiting high-priority task is stuck until release)
- Memory allocation: can trigger reclaim, writeback, compaction
- RCU: read-side critical sections disable preemption on some paths
- Timer granularity: the default tick is 4 ms (HZ=250), not microseconds

Result: worst-case latency can reach hundreds of milliseconds on a standard kernel

PREEMPT_RT changes:
- Spinlocks → sleeping locks with priority inheritance (holders and waiters can be preempted)
- Interrupt handlers → kernel threads (can be preempted, have a priority)
- All critical sections preemptible except tiny raw_spinlock regions
- High-resolution timers always active

Result: worst-case latency in the tens of microseconds on a tuned RT system

Kernel Preemption Levels

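| Config               | Preemption                  | Worst-Case Latency | Use Case                   |
|----------------------|-----------------------------|--------------------|----------------------------|
| PREEMPT_NONE         | Only on voluntary yield     | Seconds            | Server throughput          |
| PREEMPT_VOLUNTARY    | At explicit preempt points  | ~100ms             | Desktop default            |
| PREEMPT              | Any non-critical kernel code| ~10ms              | Low-latency desktop        |
| PREEMPT_RT (full RT) | Almost everywhere           | ~50-200µs          | Industrial, audio, robotics|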
# Check current preemption model:
uname -v | grep -i preempt
# or:
grep PREEMPT /boot/config-$(uname -r)
# CONFIG_PREEMPT_RT=y   ← full RT kernel

# Many distros ship RT kernels:
#   Ubuntu: linux-image-lowlatency or linux-realtime
#   Debian: linux-image-rt-amd64
#   Fedora: kernel-rt (from CentOS Stream RT repo)
apt install linux-image-lowlatency   # Ubuntu low-latency (not full RT but better)

# PREEMPT_RT merged into mainline Linux 6.12 (November 2024)
# No longer a separate patch for kernel >= 6.12
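The config check can also be scripted. The following Python sketch picks the most preemptible `CONFIG_PREEMPT*` option that is set to `y` in kernel config text. The function name `preemption_model` and the sample config are hypothetical, and the sketch assumes the config file reflects the running kernel; on kernels built with dynamic preemption the boot-time model may differ.

```python
import re

# Preemption options, ordered from most to least preemptible.
MODELS = ["PREEMPT_RT", "PREEMPT", "PREEMPT_VOLUNTARY", "PREEMPT_NONE"]

def preemption_model(config_text):
    """Return the enabled preemption model found in kernel config text, or None."""
    enabled = set(
        re.findall(r"^CONFIG_(PREEMPT(?:_[A-Z]+)?)=y$", config_text, re.MULTILINE)
    )
    for model in MODELS:
        if model in enabled:
            return model
    return None

# Hypothetical excerpt of /boot/config-$(uname -r) on an RT kernel.
sample = """\
CONFIG_PREEMPT_BUILD=y
# CONFIG_PREEMPT_NONE is not set
CONFIG_PREEMPT=y
CONFIG_PREEMPT_RT=y
CONFIG_PREEMPT_COUNT=y
"""
print(preemption_model(sample))  # prints PREEMPT_RT
```

Helper options such as `CONFIG_PREEMPT_COUNT` are matched by the pattern but ignored because they are not in `MODELS`.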

Measuring Latency with cyclictest

How do you measure if your system is actually real-time? cyclictest creates a thread that sleeps for a precise interval, wakes up, measures how late it woke, and records the distribution. On a standard kernel you'll see occasional spikes of milliseconds. On a tuned RT kernel with PREEMPT_RT, the maximum latency should stay under 200 microseconds.
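The measurement idea itself (sleep to an absolute wake-up time, then record how late the wake-up was) can be sketched in a few lines of Python. This is only an illustration of the principle: interpreter overhead makes it far less precise than cyclictest, and the function name `measure_wakeup_latency` and its default parameters are assumptions.

```python
import time

def measure_wakeup_latency(interval_us=1000, loops=200):
    """Sleep until each scheduled wake-up and record the lateness in microseconds."""
    interval_ns = interval_us * 1000
    next_wake = time.monotonic_ns() + interval_ns
    latencies = []
    for _ in range(loops):
        remaining_ns = next_wake - time.monotonic_ns()
        if remaining_ns > 0:
            time.sleep(remaining_ns / 1e9)
        # Lateness = actual wake-up time minus the time we asked for.
        latencies.append((time.monotonic_ns() - next_wake) / 1000)
        # Advance on a fixed schedule so one late wake-up does not shift the rest.
        next_wake += interval_ns
    return latencies

if __name__ == "__main__":
    samples = measure_wakeup_latency()
    avg = sum(samples) / len(samples)
    print(f"Min: {min(samples):.0f}  Avg: {avg:.0f}  Max: {max(samples):.0f}  (µs)")
```

As with cyclictest, the interesting number is the maximum, not the average.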
# Install rt-tests:
apt install rt-tests

# Basic cyclictest (run as root for priority):
cyclictest --mlockall --smp --priority=99 --interval=200 --distance=0 --duration=60s
# --mlockall: lock all memory (prevent page faults during test)
# --smp: test all CPUs
# --priority=99: SCHED_FIFO priority 99 (highest RT)
# --interval=200: wake every 200 microseconds
# --duration=60s: run for 60 seconds

# Output example:
# T: 0 (12345) P:99 I:200 C: 300000 Min: 8 Act: 12 Avg: 11 Max: 47
# T: 1 (12346) P:99 I:200 C: 300000 Min: 9 Act: 11 Avg: 11 Max: 52
# Min/Avg/Max in microseconds
# Standard kernel Max: often 1000-50000µs
# Good RT kernel Max: under 200µs

# Stress the system while measuring (worst case):
stress-ng --cpu 0 --io 4 --vm 2 --vm-bytes 128M &
cyclictest --mlockall --smp --priority=99 --interval=200 --duration=300s
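For automated checks it can help to parse the per-thread summary lines and compare the maximum against a latency budget. The sketch below assumes the line layout shown in the output example above; the function names and the 200 µs default budget are illustrative choices, not part of rt-tests.

```python
import re

# Matches the per-thread summary line, tolerating variable spacing.
LINE_RE = re.compile(
    r"T:\s*(\d+).*?Min:\s*(\d+)\s+Act:\s*(\d+)\s+Avg:\s*(\d+)\s+Max:\s*(\d+)"
)

def parse_cyclictest(output):
    """Return {thread_id: {"min": .., "avg": .., "max": ..}} in microseconds."""
    results = {}
    for line in output.splitlines():
        match = LINE_RE.search(line)
        if match:
            thread, lo, _act, avg, hi = map(int, match.groups())
            results[thread] = {"min": lo, "avg": avg, "max": hi}
    return results

def within_budget(results, max_us=200):
    """True if every thread's worst-case latency is within the budget."""
    return all(entry["max"] <= max_us for entry in results.values())

example = """\
T: 0 (12345) P:99 I:200 C: 300000 Min: 8 Act: 12 Avg: 11 Max: 47
T: 1 (12346) P:99 I:200 C: 300000 Min: 9 Act: 11 Avg: 11 Max: 52
"""
parsed = parse_cyclictest(example)
print(parsed, within_budget(parsed))
```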

CPU Isolation for Real-Time Tasks

# isolcpus: tell kernel to never schedule normal tasks on these CPUs
# /etc/default/grub:
GRUB_CMDLINE_LINUX="isolcpus=2,3 nohz_full=2,3 rcu_nocbs=2,3"
# isolcpus=2,3: CPUs 2 and 3 excluded from general scheduler
# nohz_full=2,3: stop the periodic timer tick on these CPUs while only one task is runnable (reduces interruptions)
# rcu_nocbs=2,3: move RCU callbacks off these CPUs
update-grub && reboot

# After reboot: pin your RT task to isolated CPUs:
taskset -c 2,3 cyclictest --priority=99 --interval=200

# Or via cgroups (cpuset, cgroup v2; the cpuset controller must be enabled in the parent):
echo "+cpuset" > /sys/fs/cgroup/cgroup.subtree_control
mkdir /sys/fs/cgroup/rt-tasks
echo "2,3" > /sys/fs/cgroup/rt-tasks/cpuset.cpus
echo "0" > /sys/fs/cgroup/rt-tasks/cpuset.mems
echo $$ > /sys/fs/cgroup/rt-tasks/cgroup.procs

# Verify isolation:
cat /sys/devices/system/cpu/isolated
# 2-3

# Move all IRQs away from RT CPUs:
for irq in /proc/irq/*/smp_affinity; do
    echo 3 > "$irq" 2>/dev/null   # hex 3 = CPU 0 and 1 only
done
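The hex value written to `smp_affinity` is a bitmask with one bit per CPU. A small Python sketch of that arithmetic follows; the helper names are hypothetical, and it assumes a machine with at most 32 CPUs (larger systems use comma-separated 32-bit groups, which this sketch does not produce).

```python
def parse_cpu_list(text):
    """Expand a kernel CPU list such as "0-1,4" into a sorted list of CPU numbers."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)

def cpus_to_mask(cpus):
    """Return the hex bitmask (no 0x prefix) with one bit set per CPU number."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "x")

def housekeeping_mask(total_cpus, isolated_list):
    """Mask of every CPU that is NOT isolated, suitable for IRQ affinity."""
    isolated = set(parse_cpu_list(isolated_list))
    return cpus_to_mask(cpu for cpu in range(total_cpus) if cpu not in isolated)

# On a 4-CPU machine with CPUs 2-3 isolated, IRQs should stay on CPUs 0 and 1.
print(housekeeping_mask(4, "2-3"))  # prints 3
```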

Real-Time Scheduling Policies

# Linux RT scheduling policies:
# SCHED_FIFO: run until done or blocked (no time slicing), priority 1-99
# SCHED_RR: like FIFO but with time slices between same-priority tasks
# SCHED_DEADLINE: task declares runtime/deadline/period, kernel guarantees slots

# Set RT priority in code (C):
# struct sched_param sp = { .sched_priority = 99 };
# sched_setscheduler(0, SCHED_FIFO, &sp);

# Set RT priority from shell:
chrt -f 99 myapp        # SCHED_FIFO priority 99
chrt -r 50 myapp        # SCHED_RR priority 50

# Change priority of running process:
chrt -f -p 99 1234      # change PID 1234 to FIFO priority 99

# Check scheduling policy of a process:
chrt -p 1234
# pid 1234's current scheduling policy: SCHED_FIFO
# pid 1234's current scheduling priority: 99

# Prevent RT tasks from starving non-RT:
cat /proc/sys/kernel/sched_rt_runtime_us
# 950000 (RT tasks get 95% of CPU time, 5% reserved for non-RT)
# Set to -1 to give RT tasks 100% (dangerous on non-isolated CPUs)
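The same policy change is available from Python's standard `os` module on Linux, which makes for a compact sketch. The helper names below are hypothetical, and the 1,000,000 µs period is an assumption matching the usual default of `sched_rt_period_us`; check your system before relying on it.

```python
import os

def rt_bandwidth_percent(runtime_us, period_us=1_000_000):
    """Share of each period RT tasks may consume; a negative runtime means no limit."""
    if runtime_us < 0:
        return 100.0
    return 100.0 * runtime_us / period_us

def try_set_fifo(priority, pid=0):
    """Try to switch pid (0 = calling process) to SCHED_FIFO; return True on success."""
    try:
        os.sched_setscheduler(pid, os.SCHED_FIFO, os.sched_param(priority))
        return True
    except PermissionError:
        # Needs root, CAP_SYS_NICE, or a suitable rtprio limit.
        return False

print(rt_bandwidth_percent(950000))  # prints 95.0
```

Calling `try_set_fifo(50)` in an unprivileged process returns `False` rather than raising, which lets an application fall back to normal scheduling.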

System Tuning for Real-Time

# Memory locking (prevent page faults in RT tasks):
# In code: mlockall(MCL_CURRENT | MCL_FUTURE)
# Also pre-fault the stack at startup so it is already mapped
ulimit -l unlimited   # allow locking all memory

# Limit deep CPU idle states (waking from a deep C-state adds latency):
# /etc/default/grub:
# GRUB_CMDLINE_LINUX="processor.max_cstate=1 intel_idle.max_cstate=0"
# Keeps CPUs out of deep C-states; clock frequency is handled by the governor below

# Set CPU governor to performance (variable frequency = variable latency):
for cpu in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
    echo performance > "$cpu"
done

# Disable SMT/hyperthreading (sibling thread causes cache interference):
echo off > /sys/devices/system/cpu/smt/control

# Disable NUMA balancing (automatic page migration adds latency):
echo 0 > /proc/sys/kernel/numa_balancing

# Disable transparent huge pages (THP allocation is non-deterministic):
echo never > /sys/kernel/mm/transparent_hugepage/enabled

# Audio RT example (/etc/security/limits.d/audio.conf):
# @audio - rtprio 99
# @audio - memlock unlimited
# @audio - nice -20
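After applying these settings it is useful to verify them. Here is a rough Python audit sketch that reads the same files and reports whether each holds the wanted value. The `CHECKS` table and function names are illustrative assumptions, only `cpu0` is inspected for the governor, and files that do not exist on a given machine are reported as `None` rather than as failures.

```python
from pathlib import Path

def selected_option(text):
    """Return the bracketed choice from sysfs text like "always madvise [never]"."""
    for word in text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return text.strip()

def read_setting(path):
    """Return the stripped file content, or None if the file cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None

# Wanted values taken from the tuning commands above.
CHECKS = {
    "/sys/kernel/mm/transparent_hugepage/enabled": "never",
    "/sys/devices/system/cpu/smt/control": "off",
    "/proc/sys/kernel/numa_balancing": "0",
    "/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor": "performance",
}

def audit(checks=CHECKS):
    """Map each path to True/False (matches wanted value) or None (not readable)."""
    report = {}
    for path, wanted in checks.items():
        raw = read_setting(path)
        report[path] = None if raw is None else selected_option(raw) == wanted
    return report

if __name__ == "__main__":
    for path, ok in audit().items():
        print(f"{path}: {ok}")
```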
