SHIFT Continuous Domain Shift

Last updated: 2026-05-09

SHIFT is a CVPR 2022 synthetic autonomous-driving dataset for continuous multi-task domain adaptation. It is valuable because it exposes gradual changes in weather, illumination, and traffic variables instead of treating robustness as a binary clean-versus-corrupted test.

Related pages: sensor corruption robustness benchmarks, weather robustness datasets, test-time adaptation for airside


Scope

| Item | SHIFT coverage |
| --- | --- |
| Primary domain | Synthetic road driving generated for perception research |
| Shift variables | Cloudiness, rain intensity, fog intensity, time of day, vehicle density, pedestrian density |
| Shift shape | Discrete domains and continuous intra-sequence shifts |
| Main use | Domain adaptation, domain generalization, test-time adaptation, multi-task robustness |
| Release support | Public benchmark toolkit and devkit |

SHIFT should be used when the question is "where does performance start to bend or fall off" rather than "does the model pass in a named weather bucket."


Sensors And Labels

| Asset | Notes |
| --- | --- |
| RGB images/video | Downloadable at image and video frame rates through the devkit |
| LiDAR point clouds | Synthetic point-cloud modality with calibration support |
| Depth | Dense depth supervision for monocular or multi-task learning |
| Optical flow | Motion supervision for temporal perception |
| 2D and 3D boxes | Object detection labels through devkit data groups |
| Semantic segmentation | Dense semantic masks |
| Instance segmentation | Instance-level masks and IDs |
| Multi-view support | Devkit supports selecting views, data groups, splits, frame rate, and shift type |

The devkit exposes discrete shift data and continuous variants, including multiple continuous sampling rates.


Tasks And Metrics

| Task | Practical metric |
| --- | --- |
| Semantic segmentation | mIoU by shift intensity and task combination |
| Object detection | 2D/3D AP by domain variable and severity |
| Depth estimation | AbsRel/RMSE as weather and lighting vary |
| Optical flow | Endpoint error under lighting/weather change |
| Multi-task adaptation | Per-task degradation curves and negative-transfer analysis |
| Test-time adaptation | Online recovery speed, stability, and catastrophic adaptation failures |
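"Online recovery speed" has no single agreed definition, so any number reported under that name needs its definition stated. One possible sketch, assuming a per-frame metric series, a known shift onset index, and a target level chosen by the evaluator: count the frames between onset and the start of the first run in which the metric stays at or above the target. The function name, the `hold` parameter, and the example values are illustrative assumptions, not part of the SHIFT toolkit.

```python
def recovery_frames(metrics, onset, target, hold=3):
    """Frames from `onset` until the metric first stays >= target
    for `hold` consecutive frames. Returns None if it never recovers.

    metrics: per-frame scores where higher is better (assumption).
    onset:   index of the frame where the shift begins.
    """
    run = 0  # length of the current streak of frames at or above target
    for i in range(onset, len(metrics)):
        run = run + 1 if metrics[i] >= target else 0
        if run == hold:
            # The streak started at index i - hold + 1.
            return i - hold + 1 - onset
    return None


# Hypothetical series: the score dips at frame 2, briefly touches the
# target at frame 4, then recovers for good from frame 6 onward.
scores = [0.8, 0.8, 0.5, 0.55, 0.72, 0.6, 0.71, 0.73, 0.75, 0.74]
print(recovery_frames(scores, onset=2, target=0.7, hold=3))
```

Requiring a streak of `hold` frames keeps a single lucky frame from being counted as recovery, which also makes unstable adaptation visible as a late or missing recovery.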

The most useful metric is a degradation curve over the continuous variable, not a single aggregate validation score.
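A degradation curve of this kind can be approximated by binning per-frame scores along the shift variable. The sketch below is a minimal version, assuming each frame already carries a normalized shift value in [0, 1] and a scalar metric. The function name and input layout are hypothetical and do not reflect the SHIFT devkit's data format.

```python
from collections import defaultdict
from statistics import mean


def degradation_curve(samples, num_bins=5, lo=0.0, hi=1.0):
    """Average a per-frame metric inside equal-width bins of a shift variable.

    samples: iterable of (shift_value, metric_value) pairs.
    Returns a list of (bin_center, mean_metric, count) for non-empty bins,
    ordered by increasing shift value.
    """
    if num_bins <= 0 or hi <= lo:
        raise ValueError("need num_bins > 0 and hi > lo")
    width = (hi - lo) / num_bins
    buckets = defaultdict(list)
    for shift, metric in samples:
        if not lo <= shift <= hi:
            continue  # ignore frames outside the plotted range
        # Clamp so that shift == hi falls into the last bin.
        idx = min(int((shift - lo) / width), num_bins - 1)
        buckets[idx].append(metric)
    return [
        (lo + (i + 0.5) * width, mean(vals), len(vals))
        for i, vals in sorted(buckets.items())
    ]


# Hypothetical per-frame scores against fog intensity.
frames = [(0.05, 0.80), (0.15, 0.78), (0.45, 0.70), (0.95, 0.40), (1.0, 0.42)]
for center, score, count in degradation_curve(frames):
    print(f"fog~{center:.1f}  score={score:.3f}  n={count}")
```

Averaging per-frame scores is a simplification. Dataset-level metrics such as mIoU or AP are normally computed from statistics accumulated within each bin, so a careful evaluation would bin the frames first and run the full metric once per bin.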


Best Use

Use SHIFT to:

  • plot operational design domain (ODD) cliff points as rain, fog, darkness, or traffic density increases;
  • compare static training, domain adaptation, and test-time adaptation;
  • test multi-task training tradeoffs between segmentation, depth, flow, and detection;
  • stress temporal perception with slowly changing conditions;
  • create synthetic-to-real hypotheses before spending labeling budget on real airport logs.

SHIFT is especially useful for release-gate design because it encourages thresholds such as "degrade mode starts at this confidence and visibility range" rather than an unstructured adverse-weather bucket.
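One way to turn a degradation curve into a candidate release-gate threshold is to look for the first shift level at which the metric falls below a chosen fraction of its clean-condition value. The sketch below assumes a curve of (shift value, metric) pairs where higher metric values are better. The `retain` fraction and the function name are illustrative assumptions, and the right fraction depends on the safety case.

```python
def find_cliff_point(curve, retain=0.9):
    """Return the lowest shift value whose metric drops below
    retain * baseline, where baseline is the metric at the lowest
    shift value. Returns None if the metric never drops that far.

    curve: list of (shift_value, metric) pairs, higher metric is better.
    """
    if not curve:
        raise ValueError("curve must not be empty")
    ordered = sorted(curve)          # sort by shift value
    baseline = ordered[0][1]         # mildest condition as reference
    threshold = retain * baseline
    for shift, metric in ordered:
        if metric < threshold:
            return shift
    return None


# Hypothetical binned scores against rain intensity.
rain_curve = [(0.1, 0.80), (0.3, 0.78), (0.5, 0.75), (0.7, 0.60), (0.9, 0.40)]
print(find_cliff_point(rain_curve, retain=0.9))
```

The returned value is a simulator parameter, so it identifies where to look on real data rather than a deployable visibility or rainfall limit.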


Airside Transfer

Airside operations also contain continuous shifts: drizzle to heavy rain, dusk to night, low to high apron traffic, dry to wet reflective pavement, and light to heavy fog. SHIFT can prototype:

  • degradation-curve reporting for camera/LiDAR perception;
  • adaptation triggers for airport onboarding;
  • multi-task models that share depth, segmentation, and detection heads;
  • ODD boundary logic that gates autonomous mode before perception collapses.
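The ODD boundary logic in the last item can be prototyped as a small state machine. The sketch below assumes a single normalized severity estimate per frame and uses two thresholds (hysteresis) so the mode does not flicker when conditions hover near the boundary. The class name, thresholds, and mode labels are hypothetical placeholders, not values derived from SHIFT.

```python
from dataclasses import dataclass


@dataclass
class OddGate:
    """Two-threshold gate between autonomous and degraded mode."""
    enter_degraded: float = 0.6   # hypothetical: leave autonomous mode at or above this
    exit_degraded: float = 0.4    # hypothetical: return to autonomous mode at or below this
    degraded: bool = False

    def __post_init__(self):
        if self.exit_degraded >= self.enter_degraded:
            raise ValueError("exit_degraded must be below enter_degraded")

    def update(self, severity: float) -> str:
        """Consume one severity estimate and return the current mode."""
        if self.degraded:
            if severity <= self.exit_degraded:
                self.degraded = False
        elif severity >= self.enter_degraded:
            self.degraded = True
        return "degraded" if self.degraded else "autonomous"


# Hypothetical severity trace: conditions worsen, then slowly clear.
gate = OddGate()
for s in [0.2, 0.55, 0.65, 0.5, 0.45, 0.35, 0.5]:
    print(s, gate.update(s))
```

Setting the entry threshold below the cliff point measured on a degradation curve is what lets the gate act before perception collapses. The gap between the two thresholds trades responsiveness against mode chatter.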

The transfer is methodological only. SHIFT's imagery, object taxonomy, road layout, and sensor physics all come from a synthetic road-driving simulator. Final evidence needs airport logs with aircraft, ground support equipment (GSE), ramp workers, stand markings, cones, chocks, glycol film, and jet-bridge geometry.


Limitations

  • Synthetic data does not reproduce all camera, LiDAR, radar, lens-contamination, or wet-surface artifacts.
  • Road-domain labels do not include airside classes or airport surface rules.
  • Continuous variables are controlled simulator parameters, not direct measurements such as meteorological optical range (MOR), rainfall rate, or sensor blockage.
  • Good adaptation on SHIFT can still fail on real sensor noise and operations-driven distribution shifts.
  • Treat results as robustness-screening evidence, not production acceptance evidence.
