Occupancy Flow and 4D Occupancy Benchmarks

Last updated: 2026-05-09

Why It Matters

3D occupancy tells a planner where space is free, occupied, or unknown. Occupancy flow and 4D occupancy add time: where occupied space is moving now and where it may be in future frames. This matters for airside autonomy because hazards are not limited to tracked road vehicles: moving ground crew, baggage carts, tugs, aircraft during pushback, belt-loader conveyors, and unknown obstacles all require both spatial and temporal reasoning.
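
Concretely, a BEV occupancy-flow prediction of the kind the Waymo benchmark below evaluates can be stored as a time-indexed grid of occupancy probabilities plus per-cell 2D flow vectors. The following is a minimal sketch assuming numpy arrays; the shapes, field names, and backward-flow convention are illustrative assumptions, not any benchmark's actual API.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class OccupancyFlowGrid:
    """Minimal BEV occupancy-flow container (illustrative, not a benchmark API).

    occupancy:   (T, H, W) probability that each BEV cell is occupied at each
                 future step t = 1..T.
    flow:        (T, H, W, 2) per-cell 2D motion vector in grid cells per step,
                 pointing from a cell's occupant back toward its origin
                 (a backward-flow convention; other conventions exist).
    cell_size_m: edge length of one BEV cell in meters.
    dt_s:        time between consecutive future steps in seconds.
    """
    occupancy: np.ndarray
    flow: np.ndarray
    cell_size_m: float
    dt_s: float

    def speed_mps(self, t: int) -> np.ndarray:
        """Per-cell speed in m/s at future step t, derived from the flow field."""
        return np.linalg.norm(self.flow[t], axis=-1) * self.cell_size_m / self.dt_s
```

A full 4D occupancy representation extends the same idea with a height axis, turning each (H, W) slice into an (H, W, Z) voxel block.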

The benchmarks below cover BEV occupancy flow, semantic 3D occupancy, dense nuScenes-derived occupancy, multi-camera volumetric occupancy, and camera-only 4D forecasting.

Dataset/Benchmark Table

| Dataset / benchmark | Source URL | Domain and representation | Labels / task | Best use | Main transfer risk |
| --- | --- | --- | --- | --- | --- |
| Waymo Occupancy Flow / Occupancy Flow Fields | https://waymo.com/research/occupancy-flow-fields-for-motion-forecasting-in-autonomous-driving/ | Spatio-temporal BEV grid with occupancy probabilities and 2D flow vectors | Predict occupied grid cells, flow, and agent identity recovery for observed and speculative agents | Motion forecasting when trajectories are too object-centric; BEV occupancy/flow planning interface | Agent-centric road forecasting does not cover aircraft/GSE geometry, 3D height, or small FOD |
| Waymo Occupancy and Flow Prediction Challenge | https://waymo.com/open/challenges/ | Waymo Open Dataset challenge track | BEV roadway occupancy and motion flow for observed and occluded vehicles | Leaderboard benchmarking and reproducible challenge protocol | Vehicle-focused roadway domain, not apron equipment or 3D semantic occupancy |
| Occ3D | https://github.com/Tsinghua-MARS-Lab/Occ3D | Occ3D-Waymo and Occ3D-nuScenes semantic 3D voxel occupancy | Label generation from accumulated LiDAR and human annotations; evaluation uses camera-visible masks and mIoU | Standard semantic occupancy prediction benchmark over nuScenes/Waymo | Current-frame semantic occupancy, not full future occupancy flow; road classes differ from airside |
| OpenOccupancy | https://github.com/JeffWang987/OpenOccupancy | nuScenes-Occupancy dense semantic occupancy benchmark | Dense semantic occupancy annotations and CONet baseline | High-resolution surrounding occupancy perception and camera/LiDAR fusion evaluation | nuScenes urban driving does not capture airport-apron object taxonomy or operational map layers |
| SurroundOcc | https://github.com/weiyithu/SurroundOcc | Multi-camera 3D occupancy prediction with dense labels generated from sparse LiDAR and semantic/detection labels | Volumetric occupancy prediction and dense occupancy ground-truth generation pipeline | Multi-camera occupancy lifting, dense label generation, and private-data adaptation | Camera-only/multi-camera road scenes may fail under apron glare, night lighting, reflective fuselage, and weather |
| Cam4DOcc | https://github.com/haomo-ai/Cam4DOcc | Camera-only 4D occupancy forecasting based on nuScenes, nuScenes-Occupancy, and Lyft-Level5 | Present and future occupancy, 3D backward centripetal flow, 3 observation frames, 4 future frames, 0.2 m voxels in current release | Future occupancy forecasting benchmark and baseline for 4D occupancy research | Camera-only forecasting is difficult for occluded apron hazards and small/low objects; labels remain road-data-derived |

Metrics

| Metric | Applies to | What to report |
| --- | --- | --- |
| Occupancy IoU / semantic mIoU | Occ3D, OpenOccupancy, SurroundOcc, Cam4DOcc | Per-class IoU, mean IoU, free/occupied confusion, and mask policy for unknown/unobserved voxels |
| Future occupancy IoU | Cam4DOcc and airside 4D occupancy | IoU at each future horizon, not only averaged over time |
| Flow endpoint error | Waymo occupancy flow, Cam4DOcc flow, voxel-flow models | Per-cell or per-voxel motion vector error, split by actor class and speed |
| Occupancy-flow consistency | Occupancy flow fields | Whether predicted flow transports occupied cells into future occupied cells |
| Observed vs occluded occupancy | Waymo-style prediction | Separate metrics for currently visible actors and speculative or occluded future occupancy |
| Calibration under uncertainty | Safety validation | Reliability diagrams, expected calibration error, and false-free-space rate for occupied/unknown voxels |
| Runtime and memory | Deployment | Mean/P95/P99 latency, voxel resolution, active voxel count, memory, and target hardware |
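
The masked-IoU and occupancy-flow-consistency rows can be made concrete with a small amount of code. The sketch below assumes boolean numpy grids and a benchmark-supplied validity mask (in the spirit of Occ3D's camera-visible masks); the function names, the backward-flow convention, and the rounding-based warp are illustrative assumptions, not any benchmark's reference implementation.

```python
import numpy as np


def occupancy_iou(pred_occ: np.ndarray, gt_occ: np.ndarray,
                  valid_mask: np.ndarray) -> float:
    """IoU over occupied cells, scored only where the benchmark marks voxels
    as evaluable (e.g. Occ3D-style camera-visible masks)."""
    inter = (pred_occ & gt_occ & valid_mask).sum()
    union = ((pred_occ | gt_occ) & valid_mask).sum()
    return float(inter / union) if union > 0 else 1.0


def flow_transport_iou(occ_t: np.ndarray, occ_t1: np.ndarray,
                       backward_flow_t1: np.ndarray,
                       valid_mask: np.ndarray) -> float:
    """Occupancy-flow consistency check: trace each occupied future cell back
    along its predicted backward flow and test whether it lands on occupied
    space at the earlier step. A high IoU means the flow field actually
    transports occupied cells into future occupied cells."""
    h, w = occ_t1.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # Backward flow points from a cell at t+1 toward its origin at t.
    src_y = np.clip(np.rint(ys + backward_flow_t1[..., 0]).astype(int), 0, h - 1)
    src_x = np.clip(np.rint(xs + backward_flow_t1[..., 1]).astype(int), 0, w - 1)
    traced = np.zeros_like(occ_t)
    traced[src_y[occ_t1], src_x[occ_t1]] = True  # origins of future occupancy
    return occupancy_iou(traced, occ_t, valid_mask)
```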

Airside/Indoor/Outdoor Transfer

| Transfer path | Useful signal | Airside gap |
| --- | --- | --- |
| Road occupancy to apron occupancy | Voxel/grid representations, mIoU evaluation, unknown-space masking, camera-visible masks | Aircraft, belt loaders, tugs, dollies, cones, chocks, ground crew postures, reflective markings, and FOD |
| BEV occupancy flow to airside planning | Grid-cell motion scoring and speculative occluded actors | 3D height, wing/tail sweep, pushback articulation, and small debris are not captured by 2D BEV alone |
| Camera-only 4D forecasting to airside | Temporal occupancy forecasting and future horizon reporting | Night glare, backlight, jet bridge shadows, rain/fog, and camera occlusion require LiDAR/radar fallback |
| Dense semantic occupancy to map operations | Static/dynamic spatial layers and map QA visualization | Production maps need persistence, provenance, and versioning beyond per-frame occupancy |
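
The last row implies a data-management layer that none of the per-frame benchmarks provide. As a minimal sketch of what persistence, provenance, and versioning could mean in practice, the record below is a hypothetical schema; every field name is an assumption, not part of any cited benchmark.

```python
from dataclasses import dataclass
import datetime


@dataclass(frozen=True)
class OccupancyLayerRecord:
    """Hypothetical metadata for promoting fused occupancy into a versioned
    map layer; all field names are illustrative assumptions."""
    layer_id: str                    # e.g. "stand_42/static_obstacles" (made up)
    map_version: str                 # version of the published layer, e.g. "2.3.0"
    source_log_ids: tuple[str, ...]  # sensor logs the voxels were fused from
    labeling_pipeline: str           # tool name + version that produced the voxels
    created_utc: datetime.datetime   # when the layer was generated
    voxel_size_m: float              # resolution of the stored occupancy
    review_status: str               # QA gate, e.g. "unreviewed" / "approved"
```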

Validation Guidance

  1. Start with Occ3D/OpenOccupancy/SurroundOcc to validate current-frame occupancy before adding future flow.
  2. Evaluate future horizons separately, for example 0.5 s, 1 s, 2 s, and 5 s. Airside speeds are low, but interactions can last longer than road cut-ins.
  3. Report the false-free-space rate: for safety, a voxel predicted free when it is actually occupied is more dangerous than a voxel left unknown. The sketch after this list reports this rate alongside per-horizon IoU.
  4. Add airside semantic classes before production claims: aircraft, tug, belt loader, baggage cart, dolly, cone, chock, barrier, personnel, FOD, jet bridge, and stand marking.
  5. Validate camera-only and LiDAR-first variants separately under night, wet apron, glare, fog, and de-icing mist.
  6. Couple occupancy-flow metrics to planner outcomes: collision margin, hard-brake rate, stop/yield correctness, and deadlock behavior near stands.
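
The following sketch combines items 2 and 3 into one reporting loop. It assumes boolean occupancy grids and validity masks per future horizon; the function names and the horizon list are illustrative assumptions.

```python
import numpy as np


def false_free_space_rate(pred_occ: np.ndarray, gt_occ: np.ndarray,
                          valid_mask: np.ndarray) -> float:
    """Fraction of truly occupied, evaluable voxels that the model calls free:
    the error mode a planner can least afford (item 3)."""
    occupied = gt_occ & valid_mask
    n = occupied.sum()
    return float((~pred_occ & occupied).sum() / n) if n > 0 else 0.0


def report_per_horizon(preds, gts, masks, horizons_s=(0.5, 1.0, 2.0, 5.0)):
    """Report IoU and false-free-space rate at each future horizon instead of
    a single time-averaged score (item 2)."""
    for h, pred, gt, m in zip(horizons_s, preds, gts, masks):
        inter = (pred & gt & m).sum()
        union = ((pred | gt) & m).sum()
        iou = float(inter / union) if union > 0 else 1.0
        ffs = false_free_space_rate(pred, gt, m)
        print(f"t+{h:.1f}s  occupancy IoU={iou:.3f}  false-free-space={ffs:.3f}")
```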

Sources

Compiled from the public project pages and challenge documentation linked in the tables above.