Uncertainty-Aware BEV Fusion

What It Covers

  • BEV fusion systems often output a fused representation without exposing how uncertain the depth estimates, modality alignment, or feature fusion were.
  • Uncertainty-aware BEV fusion keeps uncertainty close to the representation used by detection, occupancy, and planning.
  • This page focuses on two recent directions: depth uncertainty for camera BEV lifting and feature-level uncertainty for multimodal fusion.
  • Representative methods are GaussianLSS and HyperDUM.
  • It complements the broader Uncertainty Quantification and Calibration page.

Why It Matters

  • Camera BEV methods depend on depth estimation, and depth ambiguity is highest exactly where planning risk grows: far range, occlusion, glare, and weak texture.
  • Multimodal fusion can look more confident after combining sensors even when the sensors disagree.
  • Feature-level uncertainty matters because many failures originate upstream of the final detection or occupancy head.
  • In airside environments, uncertainty should drive speed limits, buffers, teleoperation requests, and data collection.
  • A fused BEV tensor without uncertainty is difficult to audit after a near miss.

Core Technical Ideas

  • GaussianLSS revisits Lift-Splat-Shoot and models depth uncertainty during camera-to-BEV lifting.
  • It uses Gaussian splatting-style rasterization, so lifted image features carry depth uncertainty into BEV rather than collapsing to a single hard depth (a minimal lifting sketch follows this list).
  • The result is an uncertainty-aware BEV representation for camera perception.
  • HyperDUM is a deterministic uncertainty method that estimates feature-level epistemic uncertainty for multimodal fusion.
  • It uses hyperdimensional computing to avoid the training and inference cost of heavy Bayesian or ensemble approaches (a hedged hypervector sketch follows this list).
  • The common principle is to quantify uncertainty where fusion happens, not only at the final class score.
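
To make the first direction concrete, below is a minimal NumPy sketch of depth-uncertainty-aware lifting. It is not the GaussianLSS implementation (which rasterizes Gaussians into BEV); it only shows the core contrast with hard-depth lifting: each pixel's feature is spread across depth bins by a predicted Gaussian depth belief instead of being placed at a single argmax depth. All names are illustrative.

    import numpy as np

    def lift_with_depth_uncertainty(feat, depth_mu, depth_sigma, depth_bins):
        # feat: (H, W, C) image features; depth_mu / depth_sigma: (H, W)
        # predicted depth mean and std in meters; depth_bins: (D,) bin centers.
        # Returns (H, W, D, C) lifted features and the (H, W, D) depth weights.
        diff = depth_bins[None, None, :] - depth_mu[..., None]     # (H, W, D)
        w = np.exp(-0.5 * (diff / depth_sigma[..., None]) ** 2)    # Gaussian belief
        w /= w.sum(axis=-1, keepdims=True) + 1e-8                  # normalize over depth
        # Uncertain pixels smear feature mass across range instead of
        # committing everything to one (possibly wrong) depth cell.
        return w[..., None] * feat[:, :, None, :], w

    # Toy usage: pixels believed to sit at ~20 m with +/- 5 m ambiguity.
    rng = np.random.default_rng(0)
    feat = rng.normal(size=(2, 3, 4)).astype(np.float32)
    mu, sigma = np.full((2, 3), 20.0), np.full((2, 3), 5.0)
    bins = np.linspace(5.0, 45.0, 5)
    lifted, w = lift_with_depth_uncertainty(feat, mu, sigma, bins)
    print(w[0, 0])  # soft weights over depth bins, not a one-hot choice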
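
For the second direction, the hedged sketch below illustrates only the general hyperdimensional-computing mechanism, not HyperDUM's actual algorithm: encode features as high-dimensional bipolar hypervectors with a fixed random projection, bundle training encodings into a prototype, and read uncertainty off as dissimilarity to that prototype. Everything stays deterministic and single-pass, which is what makes this family cheap relative to ensembles. The class and its parameters are assumptions for illustration.

    import numpy as np

    class HDNoveltyScorer:
        # Illustrative, not HyperDUM: scores novelty of a fused feature
        # vector in hyperdimensional space with one matrix multiply.
        def __init__(self, feat_dim, hd_dim=10000, seed=0):
            rng = np.random.default_rng(seed)
            # Fixed random bipolar projection encodes features as hypervectors.
            self.proj = rng.choice([-1.0, 1.0], size=(feat_dim, hd_dim))
            self.prototype = None

        def encode(self, x):
            return np.sign(x @ self.proj)  # bipolar hypervector(s)

        def fit(self, train_feats):
            # Bundle training hypervectors by elementwise majority vote.
            self.prototype = np.sign(self.encode(train_feats).sum(axis=0))

        def uncertainty(self, x):
            # Cosine dissimilarity to the prototype: ~0 familiar, ~1 novel.
            hv = self.encode(x)
            sim = hv @ self.prototype / (np.linalg.norm(hv) * np.linalg.norm(self.prototype))
            return 0.5 * (1.0 - sim)

    rng = np.random.default_rng(1)
    scorer = HDNoveltyScorer(feat_dim=64)
    scorer.fit(rng.normal(loc=2.0, size=(512, 64)))            # in-distribution features
    print(scorer.uncertainty(rng.normal(loc=2.0, size=64)))    # low: familiar
    print(scorer.uncertainty(rng.normal(loc=-2.0, size=64)))   # high: shifted distribution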

Inputs and Outputs

  • Input: multi-view camera images and camera calibration for GaussianLSS-style depth lifting.
  • Input: multimodal features, such as camera, LiDAR, and radar features, for HyperDUM-style fusion uncertainty.
  • Optional input: sensor-health masks, calibration covariance, modality dropout masks, and weather or visibility signals.
  • Output: BEV feature map with depth or fusion uncertainty.
  • Output: uncertainty score or map that can be attached to detections, occupancy cells, or planning corridors.
  • Monitoring output: per-modality feature uncertainty, depth uncertainty, and fused-feature confidence (an illustrative output contract follows this list).
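
One way to keep that contract explicit is to carry uncertainty in the same structure planning consumes. The sketch below is a hypothetical interface, not a standard; all field names are assumptions.

    from dataclasses import dataclass, field
    import numpy as np

    @dataclass
    class UncertainBEV:
        # Hypothetical fused-BEV output that keeps uncertainty channels
        # next to the features, on the grid the planner already uses.
        bev_features: np.ndarray        # (C, H, W) fused BEV feature map
        depth_uncertainty: np.ndarray   # (H, W) camera depth ambiguity per cell
        fusion_uncertainty: np.ndarray  # (H, W) feature-level epistemic score per cell
        modality_uncertainty: dict = field(default_factory=dict)
        # e.g. {"camera": (H, W) array, "lidar": ..., "radar": ...}
        grid_resolution_m: float = 0.5  # meters per BEV cell, shared with planning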

Benchmark Signals

  • GaussianLSS was accepted at CVPR 2025 under the title "Toward Real-world BEV Perception: Depth Uncertainty Estimation via Gaussian Splatting."
  • The paper positions GaussianLSS as an uncertainty-aware alternative to standard LSS-style unprojection for BEV perception.
  • HyperDUM was accepted at CVPR 2025 and targets feature-level epistemic uncertainty in multimodal autonomous-vehicle perception.
  • HyperDUM emphasizes deployability by avoiding expensive Bayesian approximations while estimating uncertainty at fusion level.
  • Benchmark reporting should cover task accuracy, calibration, usefulness of the uncertainty under corruption, and runtime overhead.
  • A method that improves calibration but hides high-uncertainty regions from planning is not deployment-ready.

Deployment Risks

  • Depth uncertainty can be underestimated if the depth network is overconfident on out-of-distribution scenes.
  • Gaussian-style lifting can still smear features to the wrong depths if camera calibration or ego pose is wrong.
  • Hyperdimensional uncertainty scores need calibration against actual perception error, not only internal feature novelty (a recalibration sketch follows this list).
  • Feature-level uncertainty can be hard to translate into object-level or voxel-level safety actions.
  • Planners may ignore uncertainty if it is not represented in the interface contract.
  • Overly conservative uncertainty thresholds can cause unnecessary stops and teleoperation load.
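
For the calibration risk above, one standard remedy is a monotone map from raw uncertainty scores to the error rate actually observed on held-out driving data. A minimal sketch using scikit-learn's isotonic regression on synthetic stand-in data:

    import numpy as np
    from sklearn.isotonic import IsotonicRegression

    # Validation pairs: raw uncertainty score vs. whether the perception
    # output was actually wrong (1 = error). Synthetic stand-in data here,
    # where errors merely tend to grow with the raw score.
    rng = np.random.default_rng(0)
    raw_score = rng.uniform(size=2000)
    error = (rng.uniform(size=2000) < 0.05 + 0.6 * raw_score**2).astype(float)

    # Fit a monotone map from the raw internal score to an empirical
    # error probability that downstream consumers can use directly.
    calib = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
    calib.fit(raw_score, error)
    print(calib.predict([0.1, 0.5, 0.9]))  # probability-like, increasing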

Airside AV Fit

  • Strong fit for camera BEV around stands because aircraft, ground markings, wet pavement, and glare create depth ambiguity.
  • Feature-level multimodal uncertainty is useful when cameras, LiDAR, radar, and maps disagree near aircraft or terminal structures.
  • Uncertainty maps should increase clearance around wings, engines, jet bridges, chocks, cones, tow bars, hoses, and personnel.
  • Airside ODDs need slices for night floodlights, wet concrete, rain, fog, de-icing mist, spray, and reflective metal.
  • Treat high BEV uncertainty inside the planned path as a runtime health signal, not just a model diagnostic (see the corridor check after this list).
  • Use uncertainty-triggered data upload to capture repeated site-specific failures.
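
A sketch of that corridor check; the threshold is a placeholder that would come from site-specific calibration, and the same flag can gate uncertainty-triggered data upload.

    import numpy as np

    def corridor_health(uncertainty_map, corridor_mask, flag_threshold=0.7):
        # uncertainty_map: (H, W) calibrated per-cell uncertainty in [0, 1]
        # corridor_mask:   (H, W) boolean mask of the planned path corridor
        vals = uncertainty_map[corridor_mask]
        if vals.size == 0:
            return {"max": 0.0, "frac_high": 0.0, "unhealthy": False}
        stats = {
            "max": float(vals.max()),
            "frac_high": float((vals > flag_threshold).mean()),
        }
        stats["unhealthy"] = stats["max"] > flag_threshold
        return stats

    u = np.random.default_rng(0).uniform(size=(200, 200))
    mask = np.zeros((200, 200), dtype=bool)
    mask[90:110, :] = True  # straight-ahead corridor
    print(corridor_health(u, mask))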

Implementation Guidance

  • Keep uncertainty channels aligned with the same BEV grid or voxel grid used by planning.
  • Calibrate uncertainty by range, sensor configuration, lighting, weather, and object class.
  • Add corruption tests for camera blackout, LiDAR sparsity, radar ghosting, and calibration perturbation.
  • Convert uncertainty into planner actions: larger buffers, slower speed, re-observe, teleop request, or minimum-risk stop (an action-mapping sketch follows this list).
  • Log uncertainty before and after fusion so incident review can see whether fusion amplified or reduced risk.
  • Measure uncertainty usefulness with error-detection AUROC/AUPRC, expected calibration error, and the false-alarm rate of uncertainty flags inside the planned path corridor (an ECE sketch also follows this list).
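
A sketch of the uncertainty-to-action mapping; the thresholds and the action set are illustrative and would need per-site validation against teleoperation load.

    def plan_response(max_corridor_uncertainty):
        # Graded planner response to calibrated corridor uncertainty.
        u = max_corridor_uncertainty
        if u > 0.9:
            return {"action": "minimum_risk_stop", "teleop_request": True}
        if u > 0.7:
            return {"action": "re_observe", "speed_scale": 0.3, "buffer_scale": 2.0}
        if u > 0.5:
            return {"action": "proceed", "speed_scale": 0.6, "buffer_scale": 1.5}
        return {"action": "proceed", "speed_scale": 1.0, "buffer_scale": 1.0}

    print(plan_response(0.8))  # slow down, widen buffers, look again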
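
And a compact reference implementation of binned expected calibration error, one of the metrics above:

    import numpy as np

    def expected_calibration_error(confidence, correct, n_bins=10):
        # Binned ECE: occupancy-weighted average gap between mean confidence
        # and empirical accuracy within each confidence bin.
        edges = np.linspace(0.0, 1.0, n_bins + 1)
        ece = 0.0
        for lo, hi in zip(edges[:-1], edges[1:]):
            in_bin = (confidence > lo) & (confidence <= hi)
            if in_bin.any():
                ece += in_bin.mean() * abs(correct[in_bin].mean() - confidence[in_bin].mean())
        return ece

    rng = np.random.default_rng(0)
    conf = rng.uniform(size=5000)
    hit = (rng.uniform(size=5000) < conf).astype(float)  # perfectly calibrated toy case
    print(expected_calibration_error(conf, hit))         # close to 0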

Sources

Compiled from publicly available research papers and notes.