Airside FOD Synthetic And Multimodal Benchmarks

Last updated: 2026-05-09

Foreign object debris (FOD) detection is a high-consequence small-object perception problem. Public benchmarks now include real FOD images, visible/infrared small-target datasets, runway RGB-IR datasets, and synthetic augmentation pipelines, but they remain pre-operational evidence rather than a substitute for site-specific airport validation.

Related pages: FOD and airport apron detection datasets, open-world OOD and anomaly segmentation benchmarks, sensor corruption robustness benchmarks, night operations and thermal fusion, production perception systems


Scope

| Resource | Type | Best use |
|---|---|---|
| FOD-A | Real RGB FOD object-detection dataset | Baseline runway/taxiway FOD detection and environmental slicing. |
| Airport-FOD3S | Synthetic FOD insertion and realism pipeline | Data augmentation, small-object diversity, and sim-to-real ablation. |
| IVFOD / dual-light FOD | Infrared-visible FOD imagery from a dual-light camera setup | Sensor trade study for visible/IR small FOD. |
| RDD5000 | Real visible-infrared runway detection dataset | Runway-region detection and multimodal alignment stress, not FOD object detection by itself. |
| FOD-S2R | Real/synthetic sim-to-real FOD detection dataset for aircraft fuel-tank imagery | Sim-to-real methodology reference; weak direct runway transfer. |
| FAA FOD guidance | Operational definition and program context | Defines what must be managed in airport environments. |

This page focuses on benchmark design and data-engine use for airside perception. It does not replace airport FOD management procedures or certified detection-equipment procurement guidance.


Operational Definition

The FAA defines FOD as any object, living or not, located in an inappropriate place in the airport environment that can injure personnel or damage aircraft. That broad definition is important for perception: the target class is not a closed list of "debris categories." It includes ordinary objects in the wrong place, such as tools, metal fragments, loose pavement, straps, baggage pieces, wildlife, broken light hardware, chocks, and plastic wrap.

For autonomy, the perception task should therefore be framed as:

| Question | Required evidence |
|---|---|
| Is there an unexpected object on a safety-critical surface? | Detection or anomaly localization on runway, taxiway, stand, or vehicle path. |
| Is it hazardous for the planned operation? | Ground-plane position, size/range estimate, persistence, and path intersection. |
| Is it a known legitimate airside object? | Distinguish cones, chocks, belt loaders, dollies, tugs, stairs, and temporary equipment from debris. |
| Can the system avoid false-clear decisions? | Recall and false-clear rate at locked operating thresholds. |
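The decision chain implied by these questions can be sketched as a single record type. This is an illustrative sketch; the class and field names are hypothetical, not drawn from any cited dataset or system.

```python
from dataclasses import dataclass

@dataclass
class FodAssessment:
    """Illustrative record of the evidence needed to act on a candidate FOD alert."""
    detected: bool            # unexpected object localized on a safety-critical surface
    ground_xy_m: tuple        # ground-plane position relative to the vehicle (meters)
    size_m: float             # estimated object size (meters)
    persistence_frames: int   # consecutive frames the track has survived
    in_swept_path: bool       # does it intersect the planned vehicle path?
    known_legitimate: bool    # matched to cones, chocks, GSE, and similar equipment

    def actionable(self, min_persistence: int = 3) -> bool:
        """A detection becomes an actionable alert only with persistence,
        path intersection, and no match to legitimate airside equipment."""
        return (self.detected
                and not self.known_legitimate
                and self.in_swept_path
                and self.persistence_frames >= min_persistence)
```

The point of the structure is that a raw bounding box answers only the first question; the remaining fields must be filled by tracking, ground-plane geometry, and a known-object matcher before an alert is raised.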

Dataset And Task Details

| Dataset / benchmark | Sensors | Labels | Notes |
|---|---|---|---|
| FOD-A | RGB imagery of FOD on runway/taxiway backgrounds | Bounding boxes, object categories, light-level and weather categories | Public baseline dataset for computer-vision FOD detection. |
| Airport-FOD3S | RGB FOD images plus synthesized FOD composites | Detection labels from augmented data | Three-stage framework: scale transformation, seamless blending, and style transfer for realism. |
| IVFOD | Infrared and visible-light camera imagery | Four FOD categories in the cited paper: screw, nut, key, bottle | Captured on concrete/asphalt surfaces at 5 m, 10 m, and 15 m across time-of-day variation. |
| RDD5000 | DJI drone visible camera and infrared camera | Runway salient-object / runway-region style labels | 5,000 visible/IR image pairs from real airport runway scenes; useful for region-of-interest and multimodal robustness. |
| FOD-S2R | Real and synthetic images in a simulated aircraft fuel tank | Object-detection labels for fuel-tank FOD | Sim-to-real benchmark outside open runway/apron geometry. |

FOD-A remains the most direct public runway/taxiway FOD dataset. Airport-FOD3S is best treated as an augmentation and data-engine pipeline around scarce real FOD images. RDD5000 is valuable for RGB-IR runway perception and non-perfect alignment, but it is not a replacement for FOD object labels.


Metrics

| Metric | Why it matters |
|---|---|
| mAP / AP by IoU | Standard detector comparison. |
| AP-small / recall by object size | FOD is often small, low to the surface, and low-contrast. |
| Recall at fixed false positives per image or per kilometer | Operational teams need a manageable alert rate. |
| False-clear rate | Safety-critical: path reported clear while hazardous FOD remains. |
| Range-sliced recall | Determines practical detection distance for braking or stopping. |
| Environmental slices | Light, wet/dry surface, glare, rain, snow, night, thermal contrast. |
| Modality ablation | RGB-only, IR-only, fused, and synthetic-augmented variants. |
| Track persistence | Ensures a one-frame detection becomes a stable hazard, not flicker. |

For airside autonomy, detector mAP should be secondary to a locked-threshold safety report: miss rate by hazard type, false-clear cases, false-alarm burden, and planner handoff quality.
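A locked-threshold safety report can be assembled from per-frame detections with a few counters. The aggregation below is a minimal sketch under assumed inputs (each frame carries scores matched to true FOD plus scores of spurious detections); it is not an official metric definition from any cited benchmark.

```python
def locked_threshold_report(frames, threshold):
    """Aggregate safety-relevant counts at one locked score threshold.

    Each frame dict carries 'scores' (confidences assigned to true FOD,
    empty if the frame is clear) and 'false_scores' (confidences of
    spurious detections). Illustrative aggregation only.
    """
    misses = false_alarms = false_clear_frames = total_hazards = 0
    for f in frames:
        hits = [s for s in f["scores"] if s >= threshold]
        total_hazards += len(f["scores"])
        misses += len(f["scores"]) - len(hits)
        false_alarms += sum(s >= threshold for s in f["false_scores"])
        # False clear: hazardous FOD is present but no true detection survives
        # the locked threshold, so the surface would be reported clear.
        if f["scores"] and not hits:
            false_clear_frames += 1
    return {
        "miss_rate": misses / total_hazards if total_hazards else 0.0,
        "false_alarms_per_frame": false_alarms / len(frames),
        "false_clear_frames": false_clear_frames,
    }
```

Because the threshold is fixed before evaluation, the three numbers move together: lowering it trades false-clear frames for false-alarm burden, which is the trade operational teams actually review.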


Synthetic Data And Sim-to-Real Use

Airport-FOD3S is useful because real FOD examples are expensive, disruptive, and unevenly distributed. Its realism pipeline explicitly addresses common synthetic-data failures: wrong object scale, obvious paste artifacts, and style mismatch between the inserted object and pavement/weather context.

Recommended uses:

  1. Expand rare FOD categories and long-tail appearances before target-airport collection is complete.
  2. Generate controlled slices for object size, material, shadow, wet pavement, glare, and clutter.
  3. Train with synthetic augmentation, but keep a real-only validation set and a locked real acceptance holdout.
  4. Compare real-only, synthetic-only, and mixed training to prove synthetic data helps target-domain recall.
  5. Reject gains that only improve easy large objects while leaving screws, washers, cable ties, and dark debris unchanged.
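The scale-then-blend step of synthetic insertion can be sketched in a few lines of NumPy. This is a hypothetical helper for illustration, not the Airport-FOD3S implementation, which additionally applies seamless blending and style transfer for realism.

```python
import numpy as np

def paste_fod(background, obj_rgba, top_left, scale=1.0):
    """Paste a small RGBA object crop onto a pavement background with
    nearest-neighbor scaling and alpha blending (illustrative sketch)."""
    h, w = obj_rgba.shape[:2]
    nh, nw = max(1, int(h * scale)), max(1, int(w * scale))
    ys = (np.arange(nh) * h / nh).astype(int)
    xs = (np.arange(nw) * w / nw).astype(int)
    obj = obj_rgba[ys][:, xs]                      # nearest-neighbor resize
    y0, x0 = top_left
    region = background[y0:y0 + nh, x0:x0 + nw].astype(float)
    alpha = obj[..., 3:4].astype(float) / 255.0    # soft alpha avoids hard paste edges
    blended = alpha * obj[..., :3] + (1 - alpha) * region
    out = background.copy()
    out[y0:y0 + nh, x0:x0 + nw] = blended.astype(background.dtype)
    return out
```

Even this toy version shows why naive pasting fails: without a correct `scale` from camera geometry and a feathered alpha mask, the detector can learn the paste boundary rather than the object.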

Synthetic data is a data-engine lever, not final safety evidence. A high score on augmented FOD-A can still fail on site-specific pavement texture, rubber deposits, painted markings, jet-blast debris patterns, de-icing residue, and apron lighting.


Failure Modes

  • Small metallic or dark objects disappear after resizing, compression, denoising, or aggressive augmentation.
  • Synthetic pasted objects can create shortcut artifacts that detectors learn instead of FOD geometry.
  • Visible and infrared frames may be misaligned; late fusion must tolerate imperfect correspondence.
  • Hot pavement, reflections, standing water, rubber marks, cracks, leaves, and embedded hardware cause false positives.
  • FOD datasets often lack hard negative apron clutter such as chocks, cones, cable covers, wheel stops, locks, straps, and service equipment.
  • Bounding boxes do not prove ground contact, 3D size, or whether the object lies inside the vehicle's swept path.
  • Fuel-tank FOD sim-to-real results do not automatically transfer to runway or apron scenes.
  • Public datasets rarely encode operational consequences such as runway closure threshold, inspection response, or vehicle stop distance.

AV, Indoor, Outdoor, And Airside Relevance

| Environment | Fit | Notes |
|---|---|---|
| Airport runway/taxiway | Strong | FOD-A, Airport-FOD3S, IVFOD, and FAA guidance are directly relevant. |
| Airport apron | Moderate | Needs more GSE, aircraft-adjacent clutter, jet bridge, stand, and ground-handling labels. |
| Public-road AV | Moderate | Small road debris and unknown-obstacle lessons transfer, but runway surfaces are simpler than roads. |
| Indoor industrial | Moderate | FOD-S2R and small-object sim-to-real lessons transfer to inspection workflows. |
| General outdoor robotics | Moderate | Useful for small unexpected objects, but environmental and operational constraints differ. |

Airside relevance is strongest when these datasets are used together: FOD-A for base detection, IVFOD/RDD5000 for multimodal sensing, Airport-FOD3S for augmentation, and site-specific data for acceptance.


Validation And Data-Engine Use

  1. Build a source matrix that records dataset, sensor, object class, surface, lighting, weather, and whether each sample is real or synthetic.
  2. Keep synthetic and real samples separable through training, validation, and reporting.
  3. Use hard-negative mining from empty runways, painted markings, rubber deposits, puddles, shadows, runway lights, drains, and normal hardware.
  4. Evaluate on original resolution and deployment-like resolution; do not let resize settings hide sub-10 cm failures.
  5. Add 3D localization checks for any AV use: camera-only boxes must become ground-plane hazard polygons or be confirmed by LiDAR/radar.
  6. Maintain a target-airport holdout set with local debris, pavement, aircraft types, lighting, and apron operations.
  7. In safety reports, separate "detected object" from "actionable FOD alert" and from "vehicle stopped or rerouted correctly."
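Step 1's source matrix can be as simple as a list of per-sample records that stays queryable through training, validation, and reporting. The field and function names below are illustrative, not a fixed schema from any cited dataset.

```python
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class SampleRecord:
    dataset: str       # e.g. "FOD-A", "Airport-FOD3S", "IVFOD"
    sensor: str        # "rgb", "ir", "rgb-ir"
    object_class: str  # e.g. "screw"; "none" for hard-negative frames
    surface: str       # "asphalt", "concrete", ...
    lighting: str      # "day", "night", "glare", ...
    weather: str       # "dry", "wet", "snow", ...
    synthetic: bool    # keeps real/synthetic separable end to end

def slice_counts(records, **filters):
    """Count samples matching the given field values, e.g.
    slice_counts(matrix, synthetic=False, lighting="night")."""
    return sum(all(asdict(r).get(k) == v for k, v in filters.items())
               for r in records)
```

Keeping `synthetic` as a first-class field is what makes steps 2 and 4 cheap: any reported number can be re-sliced into real-only, synthetic-only, and mixed views without re-labeling data.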

Sources

Compiled from publicly available research papers, dataset releases, and FAA guidance documents.