Autonomous Vehicles

Self-driving cars are the most complex robots ever deployed at scale. They combine nearly every concept from this roadmap — perception with cameras and LiDAR, real-time SLAM, deep learning, path planning, and PID control — into a system that must operate safely at highway speeds in unpredictable environments. Understanding how they work is a master class in applied robotics.

SAE Autonomy Levels — Where Are We?

SAE standard J3016 defines six levels of driving automation, Level 0 through Level 5:

Level 0–2: Human in control

Level 0: no automation (a regular car). Level 1: a single assistance feature (adaptive cruise control or lane keeping, but not both). Level 2: simultaneous assistance for speed and steering (Tesla Autopilot, GM Super Cruise) — but the human must remain alert and ready to take over at any moment. Most cars sold today are Level 1–2.

Level 3: Conditional automation

The car drives itself in specific conditions (highway, good weather) but alerts the driver to take over when conditions exceed its capability. Mercedes DRIVE PILOT is the first Level 3 system approved for public roads in the US (at up to 40 mph on certain highways). The human can look away — but must be ready to take over within roughly 10 seconds when prompted.

Level 4–5: True autonomy

Level 4: fully self-driving within a specific geographic area (geofenced) and operating conditions. Waymo One (robotaxi in San Francisco, Phoenix, Los Angeles) is Level 4. Level 5: fully autonomous in all conditions, everywhere — the "robot chauffeur." No Level 5 vehicles exist commercially. Most experts believe we won't see Level 5 for at least a decade.
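
In practice, the levels boil down to one question: who is responsible for monitoring the road? A minimal sketch of that boundary in Python — the enum values follow SAE J3016, but the helper function and its name are illustrative, not from any real AV codebase:

```python
from enum import IntEnum

class SAELevel(IntEnum):
    """SAE J3016 driving-automation levels."""
    NO_AUTOMATION = 0        # human does everything
    DRIVER_ASSISTANCE = 1    # one assist feature: ACC *or* lane keeping
    PARTIAL_AUTOMATION = 2   # speed + steering assist, human supervises
    CONDITIONAL = 3          # car drives; human must answer takeover requests
    HIGH_AUTOMATION = 4      # fully autonomous inside a geofence
    FULL_AUTOMATION = 5      # fully autonomous everywhere

def human_must_supervise(level: SAELevel) -> bool:
    # Through Level 2 the human monitors continuously; at Level 3 they
    # may look away but must respond when the system requests a takeover.
    return level <= SAELevel.PARTIAL_AUTOMATION

print(human_must_supervise(SAELevel.PARTIAL_AUTOMATION))  # True
print(human_must_supervise(SAELevel.CONDITIONAL))         # False
```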

The Sensor Stack

Every autonomous vehicle combines multiple complementary sensors to build a complete picture of its surroundings.

LiDAR — the 3D mapper

Rotating LiDAR units (Waymo, Zoox) or solid-state LiDARs (newer vehicles) produce dense 3D point clouds at 10–20 Hz. Objects are detected by clustering point cloud returns and tracking them across frames. LiDAR gives precise depth regardless of lighting but is expensive and doesn't read signs or lane markings well.
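
The clustering step is easy to sketch. Below is a toy version using DBSCAN on a synthetic point cloud — real stacks first remove ground returns and increasingly use learned 3D detectors instead, so treat this as an illustration of the idea, not production practice:

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Synthetic LiDAR frame: two "objects" plus scattered noise (x, y, z in meters).
rng = np.random.default_rng(0)
car = rng.normal(loc=[10.0, 2.0, 0.5], scale=0.4, size=(200, 3))
pedestrian = rng.normal(loc=[6.0, -1.5, 0.9], scale=0.15, size=(60, 3))
noise = rng.uniform(low=[-20, -20, 0], high=[20, 20, 3], size=(30, 3))
points = np.vstack([car, pedestrian, noise])

# Euclidean clustering: returns within 0.7 m of each other form one object.
labels = DBSCAN(eps=0.7, min_samples=10).fit_predict(points)

for label in sorted(set(labels) - {-1}):   # -1 is DBSCAN's noise label
    cluster = points[labels == label]
    print(f"object {label}: {len(cluster)} points, "
          f"centroid {cluster.mean(axis=0).round(2)}")
```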

Cameras — the classification engine

8–12 cameras cover all angles around the vehicle. They read traffic lights, infer pedestrian intent, and detect lane markings — things LiDAR can't do. Tesla's camera-only approach (no LiDAR) rests on the argument that humans drive with eyes alone, so cameras are sufficient. Most other AV companies disagree, arguing that LiDAR provides safety redundancy cameras can't match.

Radar — the weather warrior

Radar penetrates rain, fog, and snow — where cameras and LiDAR degrade. It measures relative velocity of objects very accurately via Doppler effect. Cheap and reliable, but low spatial resolution (it can't determine the shape of a detected object). Used as a safety layer alongside cameras and LiDAR in almost all production AV stacks.
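
The Doppler relationship itself is one line of math. A quick worked sketch, assuming a 77 GHz automotive radar (the standard band; the shift value in the example is made up):

```python
C = 299_792_458.0    # speed of light, m/s
CARRIER_HZ = 77e9    # typical automotive radar carrier frequency

def radial_velocity(doppler_shift_hz: float) -> float:
    """Relative (radial) speed of a target from its measured Doppler shift.

    The factor of 2 accounts for the round trip: the wave is shifted once
    on the way to the target and again on the way back.
    """
    return doppler_shift_hz * C / (2 * CARRIER_HZ)

# A 10 kHz Doppler shift at 77 GHz is roughly 19.5 m/s (~70 km/h) of closing speed.
print(f"{radial_velocity(10e3):.1f} m/s")
```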

The Software Stack

Perception — what's around me?

Sensor fusion combines LiDAR, camera, and radar data into a unified 3D world model. Deep learning detects and classifies objects (cars, pedestrians, cyclists, construction zones). Tracking algorithms maintain consistent IDs across frames. The output: a list of objects with positions, velocities, and classifications, updated at 10–20 Hz.
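
The ID-maintenance step can be sketched as data association: match this frame's detections to last frame's tracks. Below is a greedy nearest-neighbor toy version — production trackers use Kalman filters for motion prediction and Hungarian matching for assignment, and every name here is illustrative:

```python
import numpy as np

MAX_MATCH_DIST = 2.0  # meters; beyond this, a detection starts a new track

def associate(tracks: dict[int, np.ndarray],
              detections: list[np.ndarray],
              next_id: int) -> tuple[dict[int, np.ndarray], int]:
    """Greedily match detections to existing tracks by distance."""
    updated: dict[int, np.ndarray] = {}
    unclaimed = dict(tracks)
    for det in detections:
        if unclaimed:
            tid, pos = min(unclaimed.items(),
                           key=lambda kv: np.linalg.norm(kv[1] - det))
            if np.linalg.norm(pos - det) < MAX_MATCH_DIST:
                updated[tid] = det       # same object: the ID carries over
                del unclaimed[tid]
                continue
        updated[next_id] = det           # unmatched detection: new track
        next_id += 1
    return updated, next_id

tracks = {0: np.array([10.0, 2.0]), 1: np.array([6.0, -1.5])}
detections = [np.array([10.5, 2.1]), np.array([30.0, 0.0])]
tracks, next_id = associate(tracks, detections, next_id=2)
print(tracks)  # track 0 persists; the far detection becomes new track 2
```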

Prediction — what will they do?

Human behavior is uncertain. Will that pedestrian step into the road? Will that car merge? Prediction models forecast the likely future trajectories of all nearby agents — often outputting multiple hypotheses with probabilities. Transformer-based prediction models (such as the Motion Transformer, a top performer on Waymo's motion-prediction benchmark) have dramatically improved prediction accuracy in recent years.
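
The output format is worth seeing concretely. The sketch below fakes the learned model with three hand-rolled maneuvers (constant velocity, braking, drifting left) so the multi-hypothesis structure — several trajectories, each with a probability — is visible; a real system would produce both from a neural network:

```python
import numpy as np

def predict_hypotheses(pos, vel, horizon_s=3.0, dt=0.5):
    """Return (name, probability, trajectory) tuples for one agent."""
    steps = int(horizon_s / dt)
    # Hand-picked maneuvers and probabilities stand in for a learned model.
    maneuvers = {
        "constant_velocity": (0.6, np.array([0.0, 0.0])),
        "braking":           (0.3, -0.5 * vel / max(np.linalg.norm(vel), 1e-6)),
        "drift_left":        (0.1, np.array([0.0, 0.8])),
    }
    hypotheses = []
    for name, (prob, accel) in maneuvers.items():
        traj = np.array([pos + vel * (k * dt) + 0.5 * accel * (k * dt) ** 2
                         for k in range(1, steps + 1)])
        hypotheses.append((name, prob, traj))
    return hypotheses

agent_pos, agent_vel = np.array([0.0, 0.0]), np.array([10.0, 0.0])
for name, prob, traj in predict_hypotheses(agent_pos, agent_vel):
    print(f"{name}: p={prob}, 3 s endpoint = {traj[-1].round(1)}")
```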

Planning — what should I do?

The planner takes the world model + predictions and computes a safe, comfortable trajectory for the vehicle. It balances: staying in lane, obeying traffic laws, maintaining safe following distance, completing the route efficiently, and responding to edge cases. Most planners use a combination of rule-based logic (obey traffic signals) and optimization (find the trajectory that minimizes a cost function).
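
The optimization half can be shown in miniature: score a handful of candidate lateral offsets with a weighted cost and pick the minimum. The terms and weights below are invented for illustration; real planners optimize full trajectories over time, not a single offset:

```python
import numpy as np

# Invented weights: lane-center deviation, obstacle proximity, steering effort.
W_LANE, W_OBSTACLE, W_COMFORT = 1.0, 5.0, 0.5
obstacle_lateral = 1.2  # predicted obstacle's offset from lane center (m)

def cost(lateral_offset: float) -> float:
    lane_term = lateral_offset ** 2                    # stay near lane center
    gap = abs(lateral_offset - obstacle_lateral)
    obstacle_term = 1.0 / max(gap, 0.1)                # spikes near the obstacle
    comfort_term = abs(lateral_offset)                 # proxy for steering effort
    return W_LANE * lane_term + W_OBSTACLE * obstacle_term + W_COMFORT * comfort_term

candidates = np.linspace(-1.5, 1.5, 13)  # candidate offsets within the lane
best = min(candidates, key=cost)
print(f"chosen lateral offset: {best:.2f} m")  # nudges away from the obstacle
```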

HD Maps

High-definition (HD) maps contain centimeter-precise lane geometry, traffic signal locations, speed limits, and road topology. The AV localizes itself against the HD map to know exactly which lane it's in — GPS alone isn't precise enough. Building and maintaining HD maps for all roads is expensive and scales poorly — a key limitation of current AV approaches, and the reason Tesla's map-free approach (using only real-time perception) is strategically important.
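
A toy sketch of the map-relative step: given a fused pose estimate, find the closest lane centerline in a tiny, hypothetical HD map (real maps store far more — boundaries, signals, topology — and match against segments, not just vertices):

```python
import numpy as np

# A hypothetical two-lane HD map: centerlines as polyline vertices (meters).
HD_MAP = {
    "lane_1": np.array([[0.0, 0.0], [50.0, 0.0], [100.0, 0.0]]),
    "lane_2": np.array([[0.0, 3.5], [50.0, 3.5], [100.0, 3.5]]),
}

def nearest_lane(position: np.ndarray) -> tuple[str, float]:
    """Return the lane whose centerline vertex is closest to `position`."""
    best_lane, best_dist = None, float("inf")
    for lane_id, centerline in HD_MAP.items():
        dist = np.linalg.norm(centerline - position, axis=1).min()
        if dist < best_dist:
            best_lane, best_dist = lane_id, dist
    return best_lane, best_dist

# A fused pose of (48.0, 3.1) resolves to lane_2, even though raw GPS
# error of a few meters could span both lanes.
print(nearest_lane(np.array([48.0, 3.1])))
```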

Frequently Asked Questions

Why is full self-driving so hard?

The "long tail" problem: 99.9% of driving is routine and easy. The remaining 0.1% — unusual edge cases, unpredictable human behavior, novel road configurations, bad weather — is where autonomous systems fail. There are effectively infinite possible scenarios, and a system must handle all of them safely. Reaching the last 0.01% of safety is exponentially harder than the first 99.9%.

LiDAR vs. camera-only: who is right?

Both approaches have genuine arguments. Camera-only (Tesla): cheaper, scalable to millions of vehicles, sufficient if deep learning is good enough. LiDAR (Waymo, Cruise): more reliable depth sensing, better in edge cases, but expensive and geofenced. The camera-only approach requires far better computer vision than LiDAR approaches — the debate reflects different bets on AI progress vs. sensor cost reduction.

Is Waymo actually safe?

By the available data: yes, significantly safer than average human drivers. Waymo published a study showing its robotaxis had a 6.7x lower rate of injury-causing crashes per mile than human drivers in comparable conditions. This doesn't mean perfect — there are still incidents — but the safety case grows stronger as the miles accumulate.

How do I get into autonomous vehicle engineering?

The core skills: strong C++ (for performance-critical perception and planning code), deep learning (PyTorch, computer vision), sensor fusion, and motion planning. ROS experience is valuable, as many AV companies use ROS-based tooling internally. Competitions like the Indy Autonomous Challenge (full-scale racing) and software challenges built on public datasets such as nuPlan and Lyft Level 5 provide entry points. Internships at Waymo, Motional, and Aurora are highly competitive but transformative.
