Production Perception Systems: What Actually Works in the Real World

Overview

This report analyzes perception systems deployed in production autonomous vehicles -- not research prototypes, but systems carrying real passengers and cargo on public roads, airport aprons, and managed sites. The goal is to extract concrete perception lessons that transfer across AV domains, with airside autonomy treated as one reference ODD rather than the default deployment lens.

Related deep dives: LiDAR Artifact Removal Techniques, Weather Robustness Datasets, and LiDAR Artifact Removal Validation.


1. Waymo Sensor Suite

5th Generation (Deployed 2020-2025)

The 5th-generation Waymo Driver established the template for production L4 perception:

Sensor Type | Count | Key Details
Cameras | 29 | 16 on roof structure, additional forward-facing; high-dynamic-range with thermal stability
LiDAR | 5 | 1x 360-degree rooftop (>300m range), 4x perimeter ("Honeycomb" Laser Bear) at front, rear, and over front tires
Radar | Multiple | Custom imaging radar with redesigned architecture and signal processing

The perimeter LiDARs ("Honeycomb" / Laser Bear) are notable: 95-degree vertical FOV, 360-degree horizontal FOV, and zero minimum range -- meaning they can detect objects directly against the sensor housing. This eliminates the classic LiDAR blind spot directly around the vehicle, which is critical for urban driving near pedestrians and cyclists.

The 5th-gen long-range cameras and 360 vision system could identify pedestrians and stop signs at greater than 500 meters.

6th Generation (Deployed February 2026)

The 6th generation represents a major optimization: 42% fewer sensors while improving performance.

Sensor Type | Count | Key Details
Cameras | 13 | Down from 29; uses next-gen 17-megapixel imager
LiDAR | 4 | Custom-designed chips and optics, built in California
Radar | 6 | Imaging radar creating dense temporal maps
Audio | Array | External Audio Receivers (EARs), a microphone array

Why this configuration works:

  • 17MP camera imager: A generation ahead of other automotive cameras in resolution, dynamic range, and low-light sensitivity. By going to higher resolution, Waymo could halve the camera count while maintaining coverage.
  • Custom LiDAR: Designed and fabricated in-house since 2011. The 6th-gen LiDAR specifically improves weather penetration and eliminates point cloud distortion near highly reflective signs (a known failure mode on freeways).
  • Imaging radar: Creates dense temporal maps that track distance, velocity, and size in all lighting and weather conditions. Leverages affordable radar chipsets for cost reduction.
  • Redundancy philosophy: Three complementary modalities (camera, LiDAR, radar) with overlapping fields of view up to 500 meters. No single sensor failure can blind the system.
  • Modular cleaning: Waymo can swap sensor cleaning components per deployment city (e.g., different cleaning systems for cold climates vs. desert).

Cost: Waymo has significantly reduced per-vehicle sensor cost through custom silicon, strategic placement, and industry-wide LiDAR cost reduction (from $75K per unit in early 2010s to $1K-5K today).

Key Waymo Insight

Waymo has been doing autonomous driving longer than anyone and concluded they need all three sensing modalities. Their trajectory has been toward fewer, better sensors rather than more sensors -- but they have never dropped a modality entirely.


2. Cruise Sensor Suite

Configuration (Chevy Bolt-Based Vehicles)

Sensor Type | Count | Key Details
Cameras | 16 | Video cameras distributed around the vehicle
LiDAR | 5+ | Roof-mounted navigation package; Cruise acquired Strobe (LiDAR company) in 2017
Radar | 21 | Extensive radar coverage

Cruise used the densest sensor configuration of any production robotaxi -- 40+ sensors total, with 40% of hardware custom for self-driving. They believed LiDAR was essential for urban driving and pedestrian detection, leading to their acquisition of Strobe to bring LiDAR development in-house.

The October 2023 Incident and Lessons

On October 2, 2023, a pedestrian was struck by a human-driven vehicle and thrown into the path of a Cruise robotaxi. What followed exposed critical perception system gaps:

  1. Sensors worked correctly: The post-incident investigation found no issues with sensor hardware or maintenance. The sensors detected the initial collision event.
  2. Software misclassification: The perception software classified the impact as a "non-critical side collision" and triggered a "minimal risk condition" -- pulling to the curb rather than emergency stopping.
  3. Undercarriage blind spot: Once the pedestrian was beneath the vehicle, no sensors could detect their presence. The vehicle dragged the pedestrian 6 meters (20 feet) at 7 mph.
  4. Wheel speed anomaly as final signal: A wheel speed sensor detected one wheel spinning on the pedestrian's body, which finally halted the vehicle -- an unintended safety mechanism.
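The decisive failure was policy, not sensing. The safer default can be sketched as a post-impact handler that escalates any ambiguous contact to an immediate stop; the thresholds and type names here are hypothetical illustrations, not Cruise's actual software:

```python
from dataclasses import dataclass

@dataclass
class CollisionEvent:
    severity: str                     # "none" | "minor" | "major" | "unknown"
    classification_confidence: float  # 0.0 - 1.0
    occupant_risk: bool

def post_collision_action(event: CollisionEvent) -> str:
    """Hypothetical policy sketch: default to stopping in place unless the
    system is highly confident the contact was trivial. The Cruise failure
    mode was the opposite default -- relocating after an ambiguous impact."""
    if event.severity == "none" and event.classification_confidence > 0.95:
        return "continue"
    if event.classification_confidence < 0.9 or event.severity == "unknown":
        return "emergency_stop"       # uncertain contact: stop, alert remote ops
    return "emergency_stop" if event.severity == "major" else "minimal_risk_pullover"
```

The design point: "relocate" should require high-confidence evidence that nothing is wrong, rather than being the fallback when classification is uncertain.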

Lessons for the Industry

  • Sensor placement must account for all scenarios: Roof-mounted sensors cannot see directly beneath the vehicle. The Cruise incident proved that undercarriage perception is not optional.
  • Classification confidence thresholds matter enormously: The system's decision to "relocate" rather than "stop" after any unusual contact was a software policy failure, not a sensor failure.
  • Organizational transparency: Cruise's delayed disclosure to regulators led to permit revocation and ultimately contributed to GM shutting down Cruise entirely in December 2024, writing off approximately $10 billion in investment.
  • Production is a marathon: Regulators must be treated as partners. The industry learned that rushing deployment and hiding problems is existentially dangerous for an AV program.

3. Zoox Sensor Suite

Purpose-Built Vehicle Design

Zoox is unique among robotaxi companies: they designed their vehicle from scratch for autonomous driving, rather than retrofitting an existing car. This allowed optimal sensor placement.

Sensor Type | Details
Cameras | Visual cameras at all four corners
LiDAR | Multiple Hesai LiDARs (partnership announced 2024)
Radar | Integrated at each corner pod
Infrared | Long-wave infrared (LWIR) cameras -- FLIR thermal imaging
Audio | Microphone arrays

Key design features:

  • Bidirectional symmetry: The vehicle has no "front" or "back." Two identical sensor configurations at each end, plus four-wheel steering, allow the vehicle to operate equally well in both directions.
  • 270-degree FOV per corner: Each of the four corners achieves 270-degree coverage with overlapping camera, LiDAR, and radar. Combined, the vehicle achieves true 360-degree coverage with no blind spots.
  • LWIR thermal cameras: Zoox is the only production robotaxi using long-wave infrared sensors. These detect heat signatures and work in complete darkness, fog, and smoke -- conditions where cameras and LiDAR degrade.
  • 150-meter all-direction range: The sensor suite can perceive the environment 150 meters in every direction.
  • Compute: Multiple NVIDIA GPUs process the full sensor suite in real time.

Why Purpose-Built Matters

By not retrofitting, Zoox avoids the compromises other companies face: sensors don't need to fit around existing body panels, mirror housings, or windshield structures. The symmetric design eliminates the "rear blind spot" problem that plagues every retrofitted vehicle. The inclusion of LWIR is a direct lesson -- thermal imaging catches what cameras and LiDAR miss (pedestrians in dark clothing at night, animals, overheating vehicle components).


4. Tesla Vision-Only Approach

Current Production System (Hardware 3, ~4M+ Vehicles)

Component | Details
Cameras | 8 cameras, 1.2 megapixels each, 360-degree coverage up to 250 meters
Radar | Removed from production vehicles starting 2021
Ultrasonic | Removed starting October 2022
Compute | FSD Computer (HW3): custom SoC, 8 GB RAM, 64 GB storage

Camera placement (HW3):

  • 3x forward-facing (main, wide, narrow/telephoto)
  • 2x side-facing (B-pillar, forward-looking)
  • 2x rear-quarter (above rear wheel wells)
  • 1x rear-facing (above license plate)

Hardware 4 (Rolling Out in New Vehicles)

Component | Details
Cameras | Up to 12 camera ports; 5.4-megapixel sensors (2896x1876), a 4.5x resolution increase
Radar | HD radar ("Phoenix") reintroduced
Compute | FSD Computer 2: Samsung 7nm SoC, 16 GB RAM, 256 GB storage

HW4 adds new camera positions: B-pillar cameras that see sideways and slightly forward, C-pillar cameras that see sideways and slightly backward, and a wider front-facing field of view.

How Camera-Only Perception Works

Tesla's FSD v12 represents a fundamental architectural shift: replacing 300,000 lines of hand-coded rules with end-to-end neural networks.

Processing pipeline:

  1. Image rectification: Corrects for camera calibration distortions
  2. Feature extraction: RegNet (residual network) backbone processes images at multiple scales
  3. Transformer fusion: Multi-scale features are fused via transformer attention into a unified "Vector Space"
  4. Bird's Eye View (BEV) transformation: 2D camera images are transformed into a 3D BEV representation
  5. Occupancy Networks: 3D voxel grid determines which cells in space are occupied -- replacing traditional bounding-box detection with volumetric understanding
  6. End-to-end output: The neural network directly outputs steering, acceleration, and braking commands
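The camera-to-BEV step (stage 4) can be illustrated with classical inverse-perspective mapping: for each cell on a ground-plane grid, project into the camera and sample the feature map. Tesla's production system learns this transform with attention rather than fixed geometry, so treat this as a geometric toy, not their implementation:

```python
import numpy as np

def features_to_bev(feat, K, cam_T_ground, bev_size=20.0, cells=40):
    """Sample per-pixel image features onto a ground-plane BEV grid.
    feat: (H, W, C) image feature map; K: 3x3 camera intrinsics;
    cam_T_ground: 4x4 transform from ground frame to camera frame."""
    H, W, C = feat.shape
    bev = np.zeros((cells, cells, C))
    xs = np.linspace(0.5, bev_size, cells)                # forward distance (m)
    ys = np.linspace(-bev_size / 2, bev_size / 2, cells)  # lateral offset (m)
    for i, x in enumerate(xs):
        for j, y in enumerate(ys):
            p_ground = np.array([x, y, 0.0, 1.0])         # point on road surface
            p_cam = cam_T_ground @ p_ground
            if p_cam[2] <= 0:                             # behind the camera
                continue
            uvw = K @ p_cam[:3]                           # pinhole projection
            u, v = uvw[0] / uvw[2], uvw[1] / uvw[2]
            if 0 <= u < W and 0 <= v < H:
                bev[i, j] = feat[int(v), int(u)]          # nearest-neighbour sample
    return bev
```

The flat-ground assumption is exactly what learned BEV transforms avoid; attention lets the network pull features from the right pixels even on slopes and curbs.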

Key technical innovations:

  • 48 neural networks running in concert
  • Occupancy Networks running at 100+ FPS, classifying voxels as occupied/free, static/moving
  • Self-supervised depth estimation from monocular cameras -- the network learns depth without LiDAR ground truth
  • Imitation learning from 10 million driving video clips, trained on 70,000 GPU hours per cycle

Performance vs. LiDAR Approaches

Advantages of vision-only:

  • Dramatically lower per-vehicle sensor cost (~$200-300 for all cameras vs. $5K-50K for LiDAR suites)
  • Scales to millions of vehicles = massive training data advantage (4M+ vehicles collecting data)
  • No mechanical LiDAR failure modes (motor bearing wear, laser degradation)
  • Camera resolution improves rapidly with commodity semiconductor advances

Disadvantages vs. LiDAR:

  • No direct distance measurement -- depth must be inferred, introducing uncertainty
  • Degraded performance in direct sunlight, glare, and low-light conditions
  • Cannot match LiDAR's centimeter-level range accuracy for nearby objects
  • Tesla FSD remains SAE Level 2 (supervised) -- no production deployment without driver supervision
  • Radar was reintroduced in HW4, suggesting pure vision has practical limits

Critical observation: No company operating fully driverless (L4) vehicles uses a camera-only approach. Tesla's massive data advantage has not yet produced a system that operates without human supervision. This is the strongest evidence that cameras alone may be insufficient for L4 autonomy.


5. comma.ai Sensor Suite

Hardware Configuration

Generation | Cameras | Other Sensors | Processor | Price
comma 3X | 3 (OmniVision OX03C10) | Uses vehicle's existing radar | Snapdragon 845 | $1,250
comma four | 3 (same triple-camera system) | Uses vehicle's existing radar | Snapdragon 845 MAX | ~$1,000

Camera layout:

  • 2x forward-facing road cameras (paired wide and narrow fields of view)
  • 1x interior-facing (driver monitoring)

The comma four mounts behind the rearview mirror and is one-fifth the size of the comma 3X (which itself fits in a small windshield-mounted unit).

How comma.ai Achieves Performance with Minimal Sensors

Supercombo model: A single neural network trained on 10+ million miles of real-world driving data from the comma.ai user fleet. Unlike traditional AV stacks with separate perception, prediction, and planning modules, openpilot uses a system-level end-to-end design that predicts the car's trajectory directly from camera images.

Radar fusion: When the host vehicle has radar (many newer cars do), openpilot fuses neural network estimates with radar data for lead vehicle detection, improving longitudinal (speed) control.
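This kind of fusion can be sketched as inverse-variance weighting of the two range estimates, assuming roughly independent Gaussian errors; an illustration of the principle, not openpilot's actual code:

```python
def fuse_lead_distance(vision_d, vision_std, radar_d, radar_std):
    """Inverse-variance fusion of a vision depth estimate with a radar range
    measurement for the lead vehicle. Radar range is typically much
    lower-variance, so it dominates the fused estimate."""
    wv = 1.0 / vision_std**2
    wr = 1.0 / radar_std**2
    fused = (wv * vision_d + wr * radar_d) / (wv + wr)
    fused_std = (1.0 / (wv + wr)) ** 0.5   # fused estimate is tighter than either input
    return fused, fused_std
```

With a vision estimate of 48 m (sigma 5 m) and a radar return of 50 m (sigma 0.5 m), the fused distance lands within centimeters of the radar value -- which is why radar fusion mainly improves longitudinal control.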

Key insight: comma.ai proves that a well-trained neural network with a single good camera can achieve L2 driving assistance comparable to manufacturer ADAS systems. However, like Tesla, this is supervised L2 -- the driver must remain attentive. The lesson is not that minimal sensors suffice for L4, but that sensor count is less important than software quality for L2.

Supported Vehicles

openpilot supports 300+ vehicle models, leveraging each vehicle's existing CAN bus interface and (when available) factory-installed radar.


6. TractEasy / reference airside AV stack -- Production Airside Vehicles

TractEasy (EasyMile)

TractEasy is the most widely deployed autonomous airside tow tractor, operating at airports including Changi (Singapore), Kansai (Japan), and Frankfurt (Lufthansa Cargo).

Sensor Type | Details
LiDAR | Multiple units (likely Velodyne, based on CES 2022 partnership); 3D mapping
Cameras | HD stereo cameras
Radar | Integrated
GPS/GNSS | Satellite navigation with RTK correction
IMU | Inertial measurement unit for dead reckoning
Wheel encoders | Odometry
V2X | Vehicle-to-infrastructure communication (on-board units)
Connectivity | 3G/4G modem, Wi-Fi

Key specifications:

  • Load capacity: 25 tons
  • Localization accuracy: within 5 centimeters
  • Navigation: centimeter-precise localization at all times
  • Perception: wide-range obstacle detection with predictive control
  • V2X: Communication at intersections and pedestrian crossings

Operational approach: TractEasy operates on pre-mapped routes with centimeter-level localization. The LiDAR and cameras provide obstacle detection, while GPS/IMU/encoders provide localization. V2X communication adds awareness of infrastructure signals (gates, traffic lights, aircraft movement areas).
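The encoder/IMU dead-reckoning loop between GNSS fixes can be sketched as below; real systems fuse these in a full Kalman filter, so this midpoint-odometry update and simple blend are a simplification:

```python
import math

def dead_reckon(pose, wheel_dist, yaw_rate, dt):
    """One odometry step from wheel encoders + IMU yaw rate.
    pose = (x, y, heading_rad); wheel_dist = distance traveled this step (m)."""
    x, y, th = pose
    th_new = th + yaw_rate * dt
    th_mid = th + 0.5 * yaw_rate * dt          # midpoint heading reduces arc error
    return (x + wheel_dist * math.cos(th_mid),
            y + wheel_dist * math.sin(th_mid),
            th_new)

def gnss_correct(pose, fix, alpha=0.8):
    """Blend an RTK fix (cm-level) into the dead-reckoned position.
    alpha is a hypothetical trust weight, not a vendor parameter."""
    x, y, th = pose
    fx, fy = fix
    return (alpha * fx + (1 - alpha) * x,
            alpha * fy + (1 - alpha) * y, th)
```

Dead reckoning drifts without bound; the RTK correction is what keeps the vehicle within the advertised 5 cm envelope.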

reference airside AV stack (autonomous baggage/cargo tug, autonomous cargo vehicle)

reference airside AV stack is deploying autonomous baggage and cargo vehicles at airports including Heathrow, Schiphol, and multiple global trial sites.

Sensor Type | Details
LiDAR | Multiple units on an A-frame structure above the cargo platform
Cameras | 360-degree cameras
GPS/GNSS | High-precision positioning
IMU | Inertial measurement unit

Key operational features:

  • Navigates pre-programmed mapped airport areas
  • New rainfall algorithm differentiates rain drops from obstacles (critical for LiDAR in rain)
  • Operates in up to 50mm/hour precipitation with specialized sensor housing
  • Custom weather-hardened LiDAR enclosures protect sensors in intense conditions
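The rain-vs-obstacle idea can be sketched as a point-cloud filter: raindrop returns tend to be low-intensity and spatially isolated, while real obstacles produce clustered, higher-intensity returns. This is a generic heuristic with made-up thresholds, not the vendor's proprietary algorithm:

```python
import numpy as np

def filter_rain_points(points, intensity, min_intensity=0.15,
                       radius=0.3, min_neighbors=2):
    """Keep a point if it is bright enough, or if it has enough close
    neighbours to look like a solid surface rather than a raindrop.
    points: (N, 3) array of xyz returns; intensity: (N,) normalized 0-1."""
    keep = np.zeros(len(points), dtype=bool)
    for i, p in enumerate(points):
        if intensity[i] >= min_intensity:
            keep[i] = True                     # bright return: trust it
            continue
        d = np.linalg.norm(points - p, axis=1)
        if (d < radius).sum() - 1 >= min_neighbors:
            keep[i] = True                     # dim but clustered: likely real
    return keep
```

Production variants add temporal consistency (rain returns rarely persist across scans at the same location), which this single-scan sketch omits.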

Lessons from Airside Deployments

  1. Pre-mapping is standard: Both TractEasy and reference airside AV stack operate on pre-mapped routes, not in unknown environments. This dramatically simplifies the perception problem -- the vehicle knows what the world should look like and only needs to detect deviations (obstacles, vehicles, people).
  2. Weather is the primary challenge: Both companies have invested heavily in wet-weather algorithms. reference airside AV stack specifically developed new software to distinguish rain from obstacles in LiDAR data.
  3. Lower speeds reduce sensor requirements: Airside vehicles typically operate at 10-25 km/h, requiring much shorter perception ranges than highway vehicles. This means less expensive, shorter-range sensors can suffice.
  4. V2X is critical for airside: Unlike road vehicles, airside vehicles can leverage infrastructure communication for coordination at intersections, gate areas, and aircraft stands.
  5. Testing has been extensive: Hundreds of hours of testing across different weather conditions without weather-related failures, though extreme conditions remain challenging.

7. LiDAR in Production: Brands, Deployments, and Reliability

Which LiDAR Brands Are Actually Deployed

Company | Technology | Key Customers | Status
Hesai | Solid-state (AT128, OT128) | Zoox, Baidu Apollo, EasyMile, SAIC, plus 9 of the top 10 L4 companies | Market leader; 380K+ AT128 units shipped by Q2 2024
Waymo (in-house) | Custom (Laser Bear, Honeycomb, 6th-gen) | Waymo only | In-house since 2011; not sold externally since 2021
Luminar | 1550nm Iris | Volvo EX90 (standard), Volvo ES90, Polestar 3 | First automotive-grade LiDAR in a mass-production passenger car (2024)
RoboSense | MEMS solid-state (M1, M1 Plus) | Chinese OEMs, North American automakers | ISO 26262 ASIL-B certified; sub-100ms fault detection
Ouster (merged with Velodyne) | Digital spinning (OS0, OS1, OS2) | Robotics, mapping, industrial; some AV programs | IP68/IP69K rated; passed GM automotive reliability standards
Innoviz | Solid-state | BMW, Volkswagen | Automotive-grade solid-state
Aeva | FMCW (4D: position + velocity) | Daimler Truck/Torc (production 2026-2027), May Mobility | First automotive-grade FMCW LiDAR; 500m range, 4M points/sec

Cost Evolution

Period | Cost per Unit | Notes
Early 2010s | $75,000 - $100,000 | Velodyne HDL-64E era
Mid-2010s | $10,000 - $20,000 | Solid-state emergence
2020-2025 | $1,000 - $5,000 | Automotive-grade viability; Luminar targeting $500-1,000
2026-2030 (projected) | $300 - $500 | With weather-hardening improvements
Post-2030 (projected) | $200 - $300 | At mass-production scale

Reliability Data

Ouster OS1:

  • IP68 and IP69K rated (submersible and high-pressure wash resistant)
  • Passed GMW3172 (General Motors automotive reliability standard)
  • Extreme shock and vibration tested
  • Specific MTBF figures not publicly disclosed

RoboSense M1:

  • ISO 26262 compliant (ASIL-B, SIL-2)
  • 97% diagnostic coverage with sub-100ms fault self-detection
  • Power consumption under 15W
  • Physical size: 108mm x 110mm x 45mm (world's thinnest automotive LiDAR at launch)

Hesai AT128:

  • 380,000+ units shipped without major reliability issues reported
  • "Quality and reliability have been highly praised by automotive OEMs"
  • Market-proven in production vehicles

Luminar Iris:

  • 1550nm wavelength (eye-safe at higher power = longer range)
  • 600-meter max range, 120-degree FOV, 26-degree dynamic vertical FOV
  • First truly mass-production passenger car LiDAR (Volvo EX90)
  • Vehicles with Luminar outperform camera+radar vehicles by ~40% in safety studies

FMCW LiDAR: The Next Generation

Aeva's FMCW (Frequency Modulated Continuous Wave) technology represents a paradigm shift:

  • Measures instant velocity per point (not just position) -- the "4th D" in 4D LiDAR
  • Uses continuous low-power laser beam, measuring frequency shift (Doppler effect)
  • Up to 4 million points per second at 500-meter range
  • Detects pedestrians beyond 350 meters, ground hazards beyond 200 meters
  • Better rain/fog performance than time-of-flight LiDAR due to continuous wave nature
  • Production deployment: Daimler Truck autonomous Freightliner Cascadia (2026-2027), May Mobility Toyota Sienna (through 2028)
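The velocity measurement falls directly out of the Doppler relationship v = Δf · λ / 2 (the factor of 2 accounts for the round trip). A minimal sketch, assuming a 1550 nm operating wavelength, which is typical for FMCW LiDAR but not a confirmed Aeva parameter:

```python
def doppler_velocity(freq_shift_hz, laser_wavelength_m=1.55e-6):
    """Radial velocity of a target from the measured Doppler frequency shift.
    v = delta_f * lambda / 2; a ~12.9 MHz shift at 1550 nm is ~10 m/s."""
    return freq_shift_hz * laser_wavelength_m / 2.0
```

This is why every FMCW point carries velocity "for free": the shift is measured in the same beat-frequency processing that yields range, with no frame-to-frame differencing required.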

Key LiDAR Finding

No known fully driverless fleet operates without LiDAR. This is the single most important data point. Waymo, Zoox, Baidu Apollo, Cruise (when operational) -- all use LiDAR. The technology is proven for L4 autonomy. The question is no longer "does LiDAR work?" but "which LiDAR is most reliable and cost-effective for the specific deployment environment?"


8. 4D Imaging Radar in Production

Continental ARS548 RDI

The most widely referenced 4D imaging radar in production/near-production:

Specification | Value
Frequency | 77 GHz
Maximum detection range | Up to 300 m (extended configurations to 1500 m)
Minimum range | 0.2 m
Scan rate | 20 Hz (20 scans per second)
Output modes | Detection (point cloud, similar to LiDAR) and Object (classified targets)
Object classification | 120+ cluster classifications
Interface | BroadR-Reach Ethernet, 100 Mbit/s
Modulation | Pulse compression with advanced frequency modulation
Measurements | Range, azimuth, elevation, and velocity simultaneously

Continental has produced 200 million radar sensors overall; 4D imaging radar orders worth approximately 1.5 billion euros were secured in Q1 2024 alone, with production rollouts scheduled for 2026-2027.

ZF Imaging Radar

Specification | Value
Channels | 192 (16x the resolution of typical automotive radar)
Frequency | 77 GHz FMCW
Aperture angle | 120 degrees
Range | 350 meters
Dimensions | 4D: range, velocity, azimuth, elevation

ZF's imaging radar is in production with SAIC Motor's R-Series vehicles in China -- one of the first series-production deployments of 4D imaging radar.

What 4D Radar Adds

Traditional automotive radar provides range and velocity but poor angular resolution (cannot distinguish between adjacent objects). 4D imaging radar adds:

  1. Elevation measurement: Can distinguish between an overpass and a vehicle ahead, or between a speed bump and an object to avoid -- a classic failure mode of traditional radar.
  2. Point cloud output: Generates LiDAR-like point clouds, enabling 3D object detection algorithms to run on radar data.
  3. All-weather performance: Unlike LiDAR and cameras, radar penetrates rain, fog, snow, and dust with minimal degradation. Detection range reduction in severe rain is ~45%, compared to 56%+ for LiDAR.
  4. Instant velocity: Every point includes Doppler velocity measurement, making moving object detection trivial.
  5. Low cost: Automotive radar has already achieved mass-production economics at ~$50-200 per unit.

4D Radar Market Trajectory

By 2025, 4D radar had reached approximately 11.4% of the automotive radar market, transitioning from niche to mainstream. This is significant for airside vehicles: 4D radar could be the most cost-effective way to add robust all-weather perception as a complement to LiDAR and cameras.


9. Sensor Cleaning Systems

Why Cleaning Matters

A dirty sensor is a blind sensor. In production, sensor cleaning is not optional -- it is a safety-critical system. Every production L4 vehicle has integrated cleaning mechanisms.

Waymo's Approach

Waymo uses a multi-layered cleaning strategy:

  1. Passive aerodynamic deflection: Small particles (fog droplets, fine dust) are deflected away from sensor apertures through aerodynamic body design. Waymo conducts computational fluid dynamics (CFD) simulations to optimize airflow around sensors before deployment.
  2. Active wipers: Mechanical wipers for camera lenses and LiDAR windows, similar to miniature windshield wipers.
  3. Air puffers: Compressed air jets blast debris off sensor surfaces, effective for mud, spray, and insects.
  4. Bandpass filters: Cameras are placed behind glass with optical bandpass filters that reject certain wavelengths of stray light.
  5. Heating elements: Keep sensors clear of ice and condensation in cold climates.
  6. Adaptive cleaning frequency: The Waymo Driver's LiDAR data helps understand weather conditions (fog density, precipitation level) and adjusts cleaning frequency automatically.
  7. City-specific configurations: Waymo swaps cleaning components based on deployment city -- different systems for Phoenix (dust) vs. San Francisco (fog/rain) vs. cold-weather cities (ice/snow).

Industry-Wide Cleaning Technologies

The sensor cleaning industry has grown to 230+ companies. Key technologies include:

Technology | How It Works | Best For
Telescopic nozzles | Extend, spray fluid, retract | Camera lenses, LiDAR windows
Ring nozzles | Surround the sensor with a spray ring | Cylindrical LiDAR housings
Mini wipers | Curvature-adaptive mechanical wipe | Flat and curved camera surfaces
Air jets / puffers | Compressed-air blast | Dust, light debris, drying after a wash
Ultra-high-pressure pumps | Same cleanliness in half the time with 30% less fluid | Production-scale cleaning efficiency
Defrosters / heaters | Resistive heating elements | Ice, frost, condensation
Hydrophobic coatings | Self-cleaning nano-coatings | Reducing cleaning frequency
Rotating covers | Physical covers rotate debris away | Heavy-contamination environments

Key supplier: Tensor (now a major AV sensor cleaning specialist) equips autonomous driving cameras and LiDARs with integrated mini wipers, retractable nozzles, and defrosters as standard.

Implications for Airside

Airport environments present unique cleaning challenges: jet exhaust residue, de-icing fluid overspray, rubber tire dust, hydraulic fluid mist, and bird droppings. Any airside AV must have robust sensor cleaning -- passive aerodynamic design plus active washing and heating. This is a non-negotiable production requirement.


10. Sensor Calibration in Production

The Three Types of Calibration

Type | What It Calibrates | Stability
Intrinsic | Internal sensor parameters (focal length, distortion coefficients, beam patterns) | Mostly stable; auto-calibrated by modern sensors
Extrinsic | Spatial relationship between sensors (rotation, translation relative to the vehicle body) | Drifts due to vibration, thermal cycling, impacts
Temporal | Time synchronization between sensors operating at different rates | Critical for fusion; typically handled by hardware sync signals

Why Extrinsic Calibration Drifts

Production vehicles experience continuous forces that shift sensor positions:

  • Mechanical vibration: Road surface, engine/motor vibration, curb impacts
  • Thermal expansion: Metal mounting brackets expand/contract with temperature (can be 20-30 degree C swing over a day)
  • Minor impacts: Speed bumps, potholes, minor contact events
  • Long-term fatigue: Bracket fasteners gradually loosen over thousands of operating hours

Even a fraction of a degree of rotation error in a camera-to-LiDAR extrinsic calibration can cause meter-level projection errors at 100 meters -- enough to misplace a pedestrian into the wrong lane.
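The arithmetic behind that claim is a small-angle projection:

```python
import math

def lateral_error(rotation_err_deg, range_m):
    """Lateral displacement of a projected target caused by an extrinsic
    rotation error: err = range * tan(theta). At 100 m, 0.5 degrees of yaw
    error displaces a target by roughly 0.87 m -- about a quarter of a lane."""
    return range_m * math.tan(math.radians(rotation_err_deg))
```

The error grows linearly with range, which is why extrinsic drift that is invisible at parking-lot distances becomes safety-relevant at highway perception ranges.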

Production Calibration Strategies

Factory calibration: Initial calibration in a controlled environment using target boards, calibration patterns, and precise fixtures. Establishes the baseline extrinsic parameters.

Service calibration: Periodic re-calibration at service centers using target-based methods. Most OEMs want to minimize this -- customers don't want to bring vehicles in for recalibration.

Online (runtime) calibration: The holy grail. Systems that continuously monitor and correct extrinsic drift during normal operation:

  • CalibNet and similar approaches: Neural networks that estimate extrinsic corrections from photometric and geometric consistency between camera and LiDAR data
  • Weighted moving average with outlier rejection: Smoothly updates calibration parameters while rejecting transient errors
  • Motion-based self-calibration: Uses vehicle ego-motion (from IMU/GPS) to cross-validate sensor alignment
  • Feature matching: Detects natural features (lane markings, signs, building edges) visible to both cameras and LiDAR; if they don't align, calibration has drifted

Production Reality

  • Intrinsic calibration: Largely solved. Modern LiDARs and cameras auto-compensate for day-to-day intrinsic variation.
  • Extrinsic calibration: Remains a legitimate industry concern. Online calibration works but is not yet drift-free in all conditions. Production systems use a combination of factory baseline + runtime correction + periodic service checks.
  • Design mitigation: Better sensor housings that dampen vibration, thermally stable mounting materials, and rigid mounting structures reduce the drift that calibration must compensate for.

Temporal Synchronization

Camera frames and LiDAR scans must be precisely time-aligned for fusion. In production:

  • Hardware PPS (pulse-per-second) signals from GPS provide a common time reference
  • LiDAR scan timestamps are matched to the nearest camera frame
  • Misalignment of even 10ms at 60 mph means objects have moved ~27cm between the camera and LiDAR views -- enough to cause fusion artifacts
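The displacement arithmetic is straightforward:

```python
def sync_offset_m(speed_mph, dt_ms):
    """Relative displacement between two sensor views caused by a timestamp
    offset: 10 ms at 60 mph (26.8 m/s) is ~0.27 m of apparent motion."""
    speed_ms = speed_mph * 0.44704     # mph -> m/s
    return speed_ms * dt_ms / 1000.0
```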

11. Perception Pipeline in Production

Typical Production Architecture

Raw Sensor Data (Camera/LiDAR/Radar)
    |
    v
[Sensor Preprocessing] (10-50ms)
  - Image rectification, point cloud filtering, radar clustering
    |
    v
[Object Detection & Classification] (30-100ms)
  - 2D/3D bounding boxes, semantic labels
  - BEV feature extraction
    |
    v
[Sensor Fusion] (15-45ms)
  - Multi-modal feature fusion (camera + LiDAR + radar)
  - BEV-space alignment
    |
    v
[Multi-Object Tracking] (5-15ms)
  - Track-by-detection, data association
  - Track lifecycle management (creation, update, deletion)
    |
    v
[Trajectory Prediction] (20-60ms)
  - Future trajectory forecasting for all tracked agents
  - Multi-modal predictions (multiple possible futures)
    |
    v
[Planning Interface] (5-10ms)
  - Perceived world state handed to motion planner

Total perception latency budget: 85-280ms, with production systems targeting the lower end.
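The budget is just the sum of the stage ranges in the diagram above, which can be kept as an explicit table in code so the total stays consistent when a stage budget changes (stage names here simply mirror the diagram):

```python
STAGE_BUDGETS_MS = {                   # (min, max) per stage, from the pipeline above
    "preprocessing":      (10, 50),
    "detection":          (30, 100),
    "fusion":             (15, 45),
    "tracking":           (5, 15),
    "prediction":         (20, 60),
    "planning_interface": (5, 10),
}

def total_budget_ms(budgets=STAGE_BUDGETS_MS):
    """Sum per-stage (min, max) latency budgets into a pipeline total."""
    lo = sum(b[0] for b in budgets.values())
    hi = sum(b[1] for b in budgets.values())
    return lo, hi
```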

What Actually Runs in Production

Waymo:

  • Custom multi-task neural networks; most run in under 10ms inference time
  • Published research using PointPillars-like architectures for 3D detection from LiDAR
  • Pillar-based methods achieve 62-100 Hz in 3D detection (10-16ms per frame)
  • Multi-sensor early and late fusion with camera-LiDAR-radar
  • Separate prediction module forecasts actor trajectories, informing the planner
  • HD map integration provides prior road knowledge

Tesla (FSD v12):

  • End-to-end neural network replacing 300K lines of code
  • 48 neural networks running in concert
  • Occupancy Networks for volumetric scene understanding (100+ FPS)
  • BEV Transformer architecture fusing 8 camera views
  • Direct output of control commands (steering, acceleration, braking)
  • No explicit separate tracking or prediction modules -- the network learns these implicitly
  • Training: 10M video clips, 70,000 GPU hours per training cycle, 1.5 PB of data

Industry trend -- BEV + Transformer: The dominant production architecture is converging on:

  1. Extract features from each sensor modality
  2. Project all features into a common BEV (Bird's Eye View) representation
  3. Use transformer attention to fuse features across modalities and time
  4. Run detection, tracking, and prediction heads on the unified BEV features

This architecture (exemplified by BEVFormer, BEVFusion, TransFusion) unifies perception into a single representation, eliminating the cascading errors of traditional pipeline approaches.
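The core fusion step (step 3) can be sketched as plain cross-attention from BEV grid queries to flattened sensor features. This is a single-head toy with no learned projections; BEVFormer-style models add learned query/key/value projections, multiple heads, and deformable attention:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # stable softmax
    return e / e.sum(axis=axis, keepdims=True)

def bev_cross_attention(bev_queries, sensor_feats):
    """Each BEV cell (query) attends over all flattened sensor features
    (keys/values) and pulls back a weighted mixture.
    bev_queries: (Nq, D); sensor_feats: (Nk, D) -> fused (Nq, D)."""
    d = bev_queries.shape[1]
    scores = bev_queries @ sensor_feats.T / np.sqrt(d)
    attn = softmax(scores, axis=-1)      # rows sum to 1: convex mixtures
    return attn @ sensor_feats
```

Because each output row is a convex combination of sensor features, the fused BEV cell stays within the range of its inputs; the learning happens in how the attention weights are shaped.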

Compute Requirements

Production perception requires a minimum of approximately 100 TOPS (Trillion Operations Per Second) of computing capacity. Emergency braking must execute within 100ms from detection to action at highway speeds.


12. Production Failure Modes

Camera Failure Modes

Failure Mode | Cause | Impact | Mitigation
Glare / saturation | Direct sunlight, oncoming headlights | Complete loss of scene information in affected regions | HDR imaging, multiple-exposure fusion
Rain / water droplets | Precipitation on the lens | Blurred, distorted images | Wipers, hydrophobic coatings, air jets
Fog | Atmospheric scattering | Reduced contrast and range | Image dehazing algorithms, sensor-fusion fallback
Condensation | Temperature differential (cold exterior, warm vehicle) | Fogged lens | Heaters, defoggers
Low light | Night, tunnels, heavy overcast | Noise, reduced detection range | High-sensitivity imagers (Waymo's 17MP), near-IR illumination
Dirty lens | Road spray, dust, insects | Partial or complete occlusion | Active cleaning systems

LiDAR Failure Modes

Failure Mode | Cause | Impact | Mitigation
Rain attenuation | Raindrops scatter/absorb laser pulses | 56% point-count (NPC) reduction at 30-40 mm/h; non-detection beyond 30m for some materials | Multi-return filtering, FMCW LiDAR
Fog attenuation | Atmospheric scattering | 59% NPC reduction at visibility <= 50m | 1550nm wavelength (less fog scatter than 905nm), FMCW
Reflective surfaces | Mirrors, polished metal, glass facades | Complete reflection = no range data; phantom points | DBSCAN clustering to identify mirror reflections
Snow accumulation | Snow on the sensor window | Gradual occlusion | Heating elements, wipers
Road spray | Following vehicles at speed | Dense false returns near the vehicle | Waymo 6th-gen improvements targeting freeway spray
Solar interference | Direct sunlight at 905nm | False detections | 1550nm wavelength, bandpass filtering
Mechanical wear | Spinning-LiDAR motor bearings | Degraded scan quality over time | Solid-state LiDAR (no moving parts)

Quantitative degradation data (from empirical road testing):

| Condition | Point Cloud Reduction | Intensity Reduction | Notes |
| --- | --- | --- | --- |
| Light rain (10-20 mm/h) | Minimal | Minimal | Generally acceptable |
| Intense rain (30-40 mm/h) | Up to 56% | Up to 73% | Significant; detection range drops below 30 m for low-reflectivity targets |
| Weak fog (100-150 m visibility) | Minimal | Minimal | Generally acceptable |
| Thick fog (visibility <= 50 m) | Up to 59% | Up to 73% | Severe; similar impact to intense rain |
| Retroreflective targets | 26% NPC decrease at 50 m in fog | Less affected | Best-case scenario in bad weather |
| Non-reflective metal | 81-86% NPC decrease at 20 m in fog | Severe | Near-complete loss for metal objects |
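To make these percentages concrete, the worst-case reduction factors from the table can be applied to a clear-weather point count to estimate how many returns survive on a target. The 200-point pedestrian figure below is an illustrative assumption, not a measurement:

```python
# Worst-case NPC reduction factors taken from the table above.
REDUCTION = {
    "clear": 0.0,
    "intense_rain": 0.56,             # 30-40 mm/h
    "thick_fog": 0.59,                # visibility <= 50 m
    "fog_nonreflective_metal": 0.86,  # 81-86% at 20 m
}

def remaining_points(clear_count: int, condition: str) -> int:
    """Estimated point-cloud returns left on a target under a condition."""
    return round(clear_count * (1.0 - REDUCTION[condition]))

# A target returning 200 points in clear weather (assumed figure):
print(remaining_points(200, "thick_fog"))  # 82
```

The sketch shows why detectors tuned on clear-weather point densities can miss objects entirely in fog: the surviving returns may fall below clustering thresholds.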

Radar Failure Modes

| Failure Mode | Cause | Impact | Mitigation |
| --- | --- | --- | --- |
| Multipath / ghost targets | Reflections off guardrails, bridges, tunnels | False positive detections | Temporal filtering, multi-frame consistency |
| Frequency interference | Multiple FMCW radars operating nearby | Noise floor elevation | Cognitive waveform adaptation |
| Angular ambiguity | Resolution limits | Targets merged or mislocated | 4D imaging radar (192+ channels) |
| Rain attenuation | Severe rainfall | Up to 45% range reduction | More resilient than LiDAR/camera; primary all-weather sensor |
| Static object blindness | Traditional radar filters out stationary clutter | Cannot detect parked vehicles, barriers | 4D imaging radar with point cloud mode |
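The "temporal filtering, multi-frame consistency" mitigation for ghost targets can be sketched as a simple track confirmation rule: a detection is only trusted after it reappears in several consecutive frames within a position gate. The gate size and frame count below are illustrative assumptions, not production values:

```python
import math

GATE_M = 2.0        # max position jump between frames (assumed)
CONFIRM_FRAMES = 3  # consecutive hits required before a track is trusted

class GhostFilter:
    """Promotes radar detections to confirmed tracks only after
    CONFIRM_FRAMES consecutive frames of gated association."""

    def __init__(self):
        self.tracks = []  # list of (x, y, hit_count)

    def update(self, detections):
        """detections: list of (x, y). Returns confirmed (x, y) pairs."""
        new_tracks = []
        for dx, dy in detections:
            hits = 1
            # Associate with the nearest prior track inside the gate.
            for tx, ty, tc in self.tracks:
                if math.hypot(dx - tx, dy - ty) <= GATE_M:
                    hits = tc + 1
                    break
            new_tracks.append((dx, dy, hits))
        self.tracks = new_tracks
        return [(x, y) for x, y, c in self.tracks if c >= CONFIRM_FRAMES]
```

A multipath ghost that flickers for one or two frames never reaches the confirmation threshold, while a real vehicle tracked across frames is promoted quickly.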

How Production Systems Handle Degradation

Graceful degradation architecture:

  1. Confidence-weighted fusion: Each sensor modality contributes to the fused perception proportional to its current confidence. If cameras are fogged, their weight drops; LiDAR and radar weights increase.
  2. Cross-validation: Detections must be confirmed by at least two modalities before being acted upon. Single-modality detections are flagged as lower confidence.
  3. Operational domain restriction: If sensor degradation exceeds thresholds, the system reduces its operational capability -- slowing down, refusing lane changes, or pulling over (Minimum Risk Condition).
  4. Redundancy by design: Overlapping fields of view from different sensor types ensure no single failure creates a blind spot. Waymo's philosophy: every critical area is covered by at least two sensing modalities.
  5. Simulation-validated fallbacks: Before deployment, degradation scenarios are tested in simulation (Software-in-the-Loop). The autonomy stack is validated against progressive sensor failure to ensure safe behavior.
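Points 1-3 above can be sketched as a single fusion function: degraded sensors are dropped, detections need at least two supporting modalities, and the fused score weights healthier sensors more heavily. All thresholds and the confidence model here are illustrative assumptions, not any production system's values:

```python
MIN_MODALITIES = 2        # cross-validation rule (point 2)
DEGRADED_THRESHOLD = 0.3  # below this, a sensor is treated as unavailable

def fuse_confidence(per_sensor: dict) -> tuple:
    """per_sensor maps modality name -> detection confidence in [0, 1].
    Returns (fused_confidence, mode)."""
    active = {k: v for k, v in per_sensor.items() if v >= DEGRADED_THRESHOLD}
    if len(active) < MIN_MODALITIES:
        # Single-modality (or no) support: flag low confidence and
        # trigger operational restriction (point 3: slow down / MRC).
        return (max(active.values(), default=0.0), "RESTRICT")
    # Confidence-weighted average (point 1): degraded sensors that remain
    # above threshold still contribute proportionally less.
    weight_sum = sum(active.values())
    fused = sum(v * v for v in active.values()) / weight_sum
    return (fused, "NORMAL")

# Fogged cameras drop below threshold; LiDAR + radar still cross-validate:
print(fuse_confidence({"camera": 0.2, "lidar": 0.9, "radar": 0.7}))
# (0.8125, 'NORMAL')
```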

13. Lessons for Airside Autonomous Vehicles

The Proven Sensor Configuration

Based on every production L4 deployment analyzed, the minimum viable sensor suite for a driverless airside vehicle is:

| Sensor | Quantity | Purpose | Proven By |
| --- | --- | --- | --- |
| LiDAR | 2-4 units | Primary 3D perception, obstacle detection, localization | Waymo, Zoox, Cruise, TractEasy, reference airside AV stack, Baidu |
| Cameras | 4-8 | Object classification, sign/marking reading, visual context | All production systems |
| Radar | 2-4 (preferably 4D imaging) | All-weather perception, velocity measurement | Waymo, ZF/SAIC, Continental |
| GPS/GNSS + IMU | 1 each | Localization, dead reckoning | All production systems |
| V2X | 1+ | Infrastructure communication (gates, traffic, aircraft) | TractEasy, airside-specific |

What Is Proven vs. What Is Risky

Proven (low risk):

  • LiDAR + camera + radar sensor fusion for L4 autonomy
  • Hesai and RoboSense LiDAR for production reliability
  • Pre-mapped route operation with obstacle detection (TractEasy/reference airside AV stack model)
  • Active sensor cleaning systems (wipers, air jets, heaters)
  • 4D imaging radar as a weather-robust complement to LiDAR
  • GPS/IMU/encoder fusion for centimeter-level localization

Emerging (moderate risk):

  • FMCW 4D LiDAR (Aeva) -- better in weather but not yet mass-deployed
  • End-to-end neural network perception (Tesla FSD v12 architecture) -- powerful but unproven for L4
  • BEV Transformer fusion architectures -- rapidly maturing, industry converging on this
  • Online extrinsic calibration -- works but not perfectly drift-free
  • LWIR thermal cameras (Zoox approach) -- excellent for pedestrian detection at night but limited supplier ecosystem

Risky (high risk for L4 airside):

  • Camera-only perception without LiDAR -- no production L4 system has achieved this
  • Minimal sensor configurations (comma.ai model) -- suitable only for L2 with human supervision
  • Operation without active sensor cleaning -- will fail in real airport environments
  • Single-sensor-modality dependence -- any modality can fail in specific conditions

Airside-Specific Recommendations

  1. Start with the TractEasy/reference airside AV stack model: Pre-mapped routes, LiDAR + camera + radar + GPS/IMU. This is the most conservative, most proven approach for airside operations. Both companies have demonstrated successful airport deployments.

  2. Add 4D imaging radar early: Continental ARS548 or ZF imaging radar at $200-500/unit adds robust all-weather capability. Rain is the most common sensor-degrading condition at airports, and 4D radar is the most weather-resistant perception sensor available.

  3. Invest in sensor cleaning from day one: Airport environments are harsh -- jet exhaust, de-icing chemicals, rubber dust, bird debris. Integrated cleaning (wipers + air jets + heaters) is non-negotiable. Budget for this as a first-class system component, not an afterthought.

  4. Consider LWIR thermal cameras: Airside operations occur at night, in fog, and around hot equipment (engines, APUs). FLIR/LWIR cameras detect people by body heat regardless of lighting or weather -- a significant safety advantage that only Zoox currently deploys in production.

  5. Plan for calibration maintenance: At airside operating speeds (10-25 km/h), calibration drift is less critical than at highway speeds. But vibration from rough apron surfaces and thermal cycling (hot tarmac to cold nights) will still cause drift. Implement online calibration monitoring with periodic factory re-calibration.

  6. Design for graceful degradation: The Cruise incident proved that failure mode design is as important as normal-mode performance. For airside: if any perception subsystem degrades below threshold, the vehicle must stop and request human intervention. Never implement a "relocate" behavior when perception is uncertain.

  7. LiDAR selection: For airside, prioritize reliability and weather performance over maximum range. A 150m solid-state LiDAR (Hesai AT128 or RoboSense M1) is more than sufficient for 25 km/h operations and avoids the mechanical reliability concerns of spinning LiDAR. Consider Aeva FMCW if budget allows -- the velocity-per-point data dramatically simplifies tracking of people and vehicles on the apron.

  8. Perception latency budget: At 25 km/h, the vehicle travels ~7 meters per second. With a 200ms total perception-to-action latency, the vehicle travels 1.4 meters before reacting. This is acceptable for airside but should be validated against stopping distance requirements for the specific vehicle mass and braking capability.
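The latency-budget arithmetic in point 8 extends naturally to a full stopping-distance check. The deceleration value below is a placeholder assumption; as the text says, it must be validated against the actual vehicle's mass and braking capability:

```python
def stopping_distance(speed_kmh: float, latency_s: float,
                      decel_mps2: float) -> float:
    """Total distance from detection to standstill (meters)."""
    v = speed_kmh / 3.6                   # 25 km/h -> ~6.94 m/s
    reaction = v * latency_s              # distance covered before braking
    braking = v * v / (2.0 * decel_mps2)  # kinematic braking distance
    return reaction + braking

# 25 km/h, 200 ms perception-to-action, assumed 3 m/s^2 deceleration:
print(f"{stopping_distance(25, 0.2, 3.0):.2f} m")  # 9.43 m
```

Under these assumptions the vehicle needs roughly 9.4 m to stop, of which only 1.4 m is perception latency; the braking term dominates, which is why apron clearances matter more than shaving milliseconds off the perception stack at these speeds.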

The Bottom Line

The autonomous vehicle industry has converged on a clear answer: multi-modal sensor fusion (LiDAR + cameras + radar) is required for fully driverless operation. No shortcuts have worked. The debate is not about which modalities to include but about which specific sensors, how many, and how to fuse them optimally. For airside operations, the lower speeds and pre-mapped environments reduce the difficulty -- but the safety stakes (proximity to aircraft, fuel, and people) demand the same sensor redundancy philosophy that Waymo has proven over millions of miles.

Public research notes collected from public sources.