MultiCorrupt

What It Is

  • MultiCorrupt is a multi-modal robustness dataset and benchmark for LiDAR-camera 3D object detection.
  • The full paper title is "MultiCorrupt: A Multi-Modal Robustness Dataset and Benchmark of LiDAR-Camera Fusion for 3D Object Detection."
  • It was presented at IEEE Intelligent Vehicles Symposium 2024.
  • The benchmark is built around corrupted nuScenes data generation.
  • It tests how fusion detectors respond to weather, sensor loss, sampling issues, and alignment errors.
  • The official repository includes generation code, benchmark material, and dataset instructions.

Core Technical Idea

  • Start from clean nuScenes samples.
  • Generate repeatable camera and LiDAR corruptions at three severity levels.
  • Evaluate strong camera-LiDAR fusion detectors under each corruption.
  • Analyze which fusion strategies remain robust to which perturbation types.
  • Cover both sensor content corruption and sensor relationship corruption.
  • Make the generation pipeline open so teams can compile corrupted datasets locally.
  • Use the benchmark to expose brittle dependence on dense LiDAR, complete camera coverage, and precise calibration.
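The pipeline above can be sketched as a small loop: pick a corruption, pick a severity, seed the randomness, and transform every clean sample. This is a minimal illustration, not the official implementation; the function names and drop ratios are invented for the sketch, and the real repository implements each corruption as a separate converter script.

```python
import random

# Illustrative corruption: randomly drop a severity-dependent
# fraction of LiDAR points (ratios here are made up for the sketch).
def reduce_points(points, severity, rng):
    keep = {1: 0.75, 2: 0.50, 3: 0.25}[severity]
    return [p for p in points if rng.random() < keep]

def corrupt_dataset(samples, corruption_fn, severity, seed=0):
    """Apply one corruption at one severity to every sample, reproducibly."""
    rng = random.Random(seed)
    return [corruption_fn(s, severity, rng) for s in samples]

# Toy "point cloud": one sample with 1000 (x, y, z) points.
clean = [[(float(i), float(i), 0.0) for i in range(1000)]]
corrupted = corrupt_dataset(clean, reduce_points, severity=3, seed=42)
```

Fixing the seed is what makes the corrupted dataset repeatable across runs and across teams, which is the property the benchmark depends on.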

Inputs and Outputs

  • Inputs are clean nuScenes data, selected corruption type, severity level, and random seed where applicable.
  • Camera corruptions modify image data.
  • LiDAR corruptions modify point cloud data and sweeps.
  • Outputs are corrupted nuScenes-format datasets.
  • Benchmark outputs are detection scores such as mAP and nuScenes Detection Score.
  • The dataset supports model-by-model robustness comparisons.
  • It does not output a new perception architecture; it is an evaluation and data-generation resource.
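The input side of the contract above can be captured as a small job description. This is a hypothetical structure for bookkeeping, with invented field names; the official scripts take the equivalent information as command-line arguments instead.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class CorruptionJob:
    """Hypothetical record of one corrupted-dataset generation run."""
    nuscenes_root: str          # path to the clean nuScenes dataset
    corruption: str             # e.g. "fog", "beams_reducing", "dark"
    severity: int               # MultiCorrupt defines levels 1, 2, 3
    seed: Optional[int] = None  # only stochastic corruptions use it

    def __post_init__(self):
        if self.severity not in (1, 2, 3):
            raise ValueError("MultiCorrupt defines severities 1-3 only")

job = CorruptionJob("/data/nuscenes", "fog", severity=2, seed=7)
```

Freezing the record keeps an evaluation run's provenance immutable once it is logged.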

Architecture or Benchmark Protocol

  • MultiCorrupt defines ten distinct corruption types.
  • The repository lists missing camera, motion blur, points reducing, snow, temporal misalignment, spatial misalignment, beams reducing, brightness, dark, and fog.
  • Each corruption is provided at severity levels 1, 2, and 3.
  • Image corruptions can be generated with img_converter.py.
  • LiDAR corruptions can be generated with corresponding point cloud conversion scripts.
  • Snow simulation uses LiDAR snow simulation resources.
  • The repository supports Hugging Face dataset download and local compilation from nuScenes.
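The ten corruption types at three severities imply a 30-variant evaluation grid per model. The enumeration below uses the repository's listed names (with my own underscore spellings, which may differ from the repo's identifiers):

```python
# The ten corruption types listed in the repository, at severities 1-3.
CORRUPTIONS = [
    "missing_camera", "motion_blur", "points_reducing", "snow",
    "temporal_misalignment", "spatial_misalignment", "beams_reducing",
    "brightness", "dark", "fog",
]
SEVERITIES = [1, 2, 3]

# Full grid: 30 corrupted dataset variants to evaluate per detector.
grid = [(c, s) for c in CORRUPTIONS for s in SEVERITIES]
```

Enumerating the grid up front makes it easy to verify that no corruption-severity pair was silently skipped during a benchmark run.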

Training and Evaluation

  • The paper evaluates five state-of-the-art multi-modal 3D detectors.
  • Models are tested on corrupted validation data to measure robustness by corruption type.
  • The benchmark is intended for evaluation, but the generated corruptions can also be used for robustness training.
  • The analysis compares a resistance-ability metric across detector fusion strategies to quantify robustness.
  • The protocol highlights that robustness is corruption-specific; a model can be strong for one failure and weak for another.
  • nuScenes format compatibility reduces integration friction for existing mmdetection3d and OpenPCDet workflows.
  • Evaluation should also report clean nuScenes scores, so that robustness gained at the cost of unacceptable clean accuracy is not rewarded.
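One common way to summarize corruption-specific robustness is to normalize the mean corrupted score by the clean score. The sketch below is in the spirit of MultiCorrupt's resistance-style metrics but is not the paper's exact definition; the numbers are invented for illustration.

```python
def resistance_ability(corrupted_scores, clean_score):
    """Mean score over the severity levels, normalized by the clean
    score. A value near 1.0 means the corruption barely hurts."""
    if clean_score <= 0:
        raise ValueError("clean score must be positive")
    return sum(corrupted_scores) / len(corrupted_scores) / clean_score

# Example: NDS under fog at severities 1-3 vs. a clean NDS of 0.70.
ra = resistance_ability([0.60, 0.52, 0.40], clean_score=0.70)
```

Reporting this ratio per corruption type makes the corruption-specific nature of robustness explicit: a detector can score near 1.0 for fog and far below it for spatial misalignment.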

Strengths

  • The corruption list is practical and close to common sensor degradation modes.
  • Three severity levels support degradation-curve analysis of model capability as corruption intensity grows.
  • Open generation code is valuable for reproducibility and local adaptation.
  • It covers both spatial and temporal misalignment, which are common silent fusion failures.
  • It is narrower than MSC-Bench but deeper for LiDAR-camera 3D detection.
  • It can be plugged into many existing nuScenes training and evaluation stacks.

Failure Modes

  • It inherits nuScenes road-scene bias and object taxonomy.
  • Simulated weather and sensor faults may not match specific hardware physics.
  • It does not include radar, thermal, or infrastructure sensors.
  • Severity levels are benchmark-defined and may not correspond to physical measurements such as mm/hour rain or lens occlusion percentage.
  • It can underrepresent correlated real failures, for example rain plus glare plus dirty lens plus LiDAR spray.
  • A detector that performs well on MultiCorrupt can still fail on real apron sensor anomalies.
  • Download and dataset compilation require substantial storage and compute.

Airside AV Fit

  • MultiCorrupt is directly useful for first-pass robustness testing of camera-LiDAR airside perception.
  • Missing camera, dark, fog, snow, beam reduction, and temporal misalignment map to apron operations.
  • Points reducing and beams reducing are relevant to spray, dust, de-icing fluid, and partial LiDAR obstruction.
  • Airside adaptation should add aircraft occlusion, reflective metal surfaces, ground markings, cones, chocks, tow bars, and personnel.
  • It should be extended with radar and real airport weather captures for production safety evidence.
  • Severity levels should be calibrated against actual airport sensor logs where possible.

Implementation Notes

  • Use the official repository scripts rather than reimplementing corruptions by hand.
  • Pin the nuScenes version, corruption severity, and random seeds for reproducible regression tests.
  • Store corrupted datasets separately from clean nuScenes to avoid accidental training contamination.
  • Track per-sensor availability and corruption metadata alongside each evaluation sample.
  • Add airside-specific corruption modules only after baseline MultiCorrupt results are reproducible.
  • Use the benchmark as a gate for fusion changes that claim improved robustness.
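The metadata-tracking note above can be implemented as a sidecar record per evaluation sample. The field names and fingerprint scheme here are illustrative assumptions, not part of the official repository:

```python
import hashlib
import json

def corruption_record(sample_token, corruption, severity, seed, nuscenes_version):
    """Hypothetical sidecar record pinning the reproducibility inputs
    recommended above for one corrupted evaluation sample."""
    rec = {
        "sample_token": sample_token,
        "corruption": corruption,
        "severity": severity,
        "seed": seed,
        "nuscenes_version": nuscenes_version,
    }
    # Deterministic content hash over the pinned fields, so regression
    # tests can detect silent drift in the generation pipeline.
    rec["fingerprint"] = hashlib.sha256(
        json.dumps(rec, sort_keys=True).encode()
    ).hexdigest()[:16]
    return rec

rec = corruption_record("abc123", "fog", 2, 7, "v1.0-trainval")
```

Storing the fingerprint next to each detection result lets a later audit confirm exactly which corrupted variant produced a given score.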

Sources

Research notes compiled from public sources, including the MultiCorrupt paper and official repository.