
V2X Large-Range And Sequential Datasets

Last updated: 2026-05-09

Large-range and sequential V2X datasets extend cooperative perception from isolated frames at a single intersection to multi-agent, multi-sensor, temporally consistent scenes. V2XScenes, V2XPnP Sequential, UrbanIng-V2X, V2X-Real, and related datasets are especially relevant to airport autonomy because fixed infrastructure can observe around occlusions and beyond the ego vehicle's range.

Related pages: infrastructure cooperative perception, RCP-Bench cooperative corruption robustness, sensor fusion architectures, RCooper, V2X-Radar, CoopTrack


Scope

| Dataset | Domain | Main contribution |
| --- | --- | --- |
| V2XScenes | Large-range vehicle-infrastructure collaborative perception under challenging traffic conditions | Sequential 3D boxes and tracking IDs across seven roadside layouts and condition-labeled scenes. |
| V2XPnP Sequential | Real-world multi-agent V2X sequential perception and prediction | Two vehicles, two infrastructure agents, all collaboration modes, trajectories, maps, and benchmarks. |
| UrbanIng-V2X | Multi-vehicle, multi-infrastructure data across multiple intersections | Larger diversity across intersections with vehicle cameras/LiDAR and infrastructure thermal cameras/LiDAR. |
| V2X-Real | Real-world multi-modal V2X cooperative perception | 2 vehicles + 2 infrastructures, 1.2M annotated 3D boxes, four collaboration modes. |
| CoopScenes | Ego-infrastructure collective perception scenes in Germany | 10 Hz synchronized data, multi-scene registration, anonymization, and a development kit. |
| V2X-Seq | Sequential vehicle-infrastructure cooperative perception and forecasting | Sequential perception, trajectories, maps, and forecasting labels across many intersections. |

This page covers dataset fit and validation use; it does not cover V2X communication standards.


Dataset And Task Definitions

| Task | What is evaluated |
| --- | --- |
| Cooperative 3D object detection | Whether fusing vehicle and infrastructure sensors improves 3D boxes. |
| Cooperative tracking | Whether object IDs remain consistent across frames and sensor handoffs. |
| Large-range roadside perception | Whether infrastructure extends range beyond the ego vehicle and through occlusion. |
| Sequential perception | Whether models use temporal context rather than independent frames. |
| Prediction / forecasting | Whether trajectories can be forecast from cooperative history and map context. |
| Collaboration-mode comparison | Vehicle-centric, infrastructure-centric, V2V, V2I, I2I, and full V2X modes. |

V2XScenes emphasizes large-range, challenging-condition vehicle-infrastructure perception. V2XPnP emphasizes spatio-temporal fusion for perception and prediction. UrbanIng-V2X emphasizes multi-intersection diversity. V2X-Real emphasizes real-world multi-agent breadth across collaboration modes.


Sensors And Labels

| Dataset | Sensors | Labels / metadata |
| --- | --- | --- |
| V2XScenes | Mechanical LiDAR, solid-state LiDAR, blind-repair LiDAR, 4D radar, cameras | Large-range sequential 3D boxes, unique tracking IDs, condition descriptions, detection/tracking benchmarks. |
| V2XPnP Sequential | 40K LiDAR frames and 208K camera frames from two vehicles and two infrastructure agents | Object trajectories, 136 objects/scene, 10 object types, PCD map, vector map, 100 scenarios, 24 intersections. |
| UrbanIng-V2X | 12 vehicle RGB cameras, 2 vehicle LiDARs, 17 infrastructure thermal cameras, 12 infrastructure LiDARs | Cooperative perception data, HD map/digital twin, OpenCOOD conversion, sequence/frame API. |
| V2X-Real | Multi-view camera and LiDAR streams from two vehicles and two infrastructure nodes | 1.2M annotated 3D boxes, 10 categories, 33K LiDAR frames, 171K camera frames, four sub-datasets. |
| CoopScenes | Ego vehicle six-camera/three-LiDAR setup plus infrastructure towers | 104 minutes at 10 Hz, 62K frames, synchronized/registered/anonymized scenes. |
| V2X-Seq | Vehicle and infrastructure sequential data | Sequential perception frames, trajectories, vector maps, traffic lights, forecasting scenarios. |

The most useful datasets for airside design are those with fixed infrastructure and sequence-level identity, because aircraft stands and apron roads are dominated by occlusion, long-range approach, and handoff between viewpoints.


Metrics

| Metric | Use |
| --- | --- |
| 3D mAP / AP by IoU | Detection quality for cooperative boxes. |
| NDS-style detection metrics | Balanced detection score where used by the benchmark. |
| AMOTA / AMOTP or tracking-metric variants | Tracking accuracy and localization over time. |
| ID switches / fragmentation | Whether cooperative handoff breaks object identity. |
| Range-sliced AP | Measures infrastructure value at long range. |
| Occlusion-sliced AP | Measures whether cooperation actually solves blind spots. |
| Mode delta | Difference between vehicle-only, infrastructure-only, V2V, V2I, I2I, and full fusion. |
| Latency/bandwidth-aware score | Needed when transmitted features or boxes have deployment constraints. |
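As a concrete illustration of range slicing and mode delta, the sketch below computes per-range-bin recall (a simpler stand-in for AP, which additionally requires confidence-ranked precision) for vehicle-only versus fused detections. The bin edges, 2.0 m match radius, and all coordinates are illustrative assumptions, not values from any benchmark.

```python
# Minimal sketch: range-sliced recall and a vehicle-vs-fusion "mode delta".
# All names, thresholds, and coordinates are illustrative.
from math import hypot

def recall_by_range(gt_boxes, detections,
                    bins=((0, 30), (30, 60), (60, 100)), match_radius=2.0):
    """gt_boxes/detections: lists of (x, y) centers in a common frame.
    Returns recall per range bin, or None for empty bins."""
    out = {}
    for lo, hi in bins:
        gts = [g for g in gt_boxes if lo <= hypot(*g) < hi]
        if not gts:
            out[(lo, hi)] = None
            continue
        matched = sum(
            1 for g in gts
            if any(hypot(g[0] - d[0], g[1] - d[1]) <= match_radius
                   for d in detections)
        )
        out[(lo, hi)] = matched / len(gts)
    return out

gt = [(10, 0), (45, 5), (80, -3)]
vehicle_only = [(10.2, 0.1)]                       # vehicle misses the far actors
fused = [(10.2, 0.1), (44.8, 5.2), (79.5, -2.8)]   # infrastructure fills them in

r_vehicle = recall_by_range(gt, vehicle_only)
r_fused = recall_by_range(gt, fused)
mode_delta = {k: r_fused[k] - r_vehicle[k] for k in r_fused}
```

Slicing the same comparison by occlusion flag instead of range gives the occlusion-sliced variant from the table.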

For airside work, add clearance-specific metrics: missed actor inside the swept path, missed actor under aircraft envelope, handoff delay at stand entry, and false-clear in occluded pushback zones.
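One of these clearance metrics, missed actor inside the swept path, can be sketched with nothing more than a point-in-polygon test. The swept-path polygon, actor positions, and 2.0 m association radius below are hypothetical; a real check would use timestamped tracks and the actual planned path.

```python
# Illustrative sketch of one clearance metric: actors inside the planned
# swept path that the fused perception output failed to detect.
from math import hypot

def point_in_polygon(pt, poly):
    """Standard ray-casting test; poly is a list of (x, y) vertices."""
    x, y = pt
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def missed_in_swept_path(actors, detections, swept_path, match_radius=2.0):
    """Actors inside the swept path with no detection within match_radius."""
    return [
        a for a in actors
        if point_in_polygon(a, swept_path)
        and not any(hypot(a[0] - d[0], a[1] - d[1]) <= match_radius
                    for d in detections)
    ]

# Hypothetical rectangular swept path, 4 m wide, along x from 0 to 20 m.
swept = [(0, -2), (20, -2), (20, 2), (0, 2)]
actors = [(5, 0), (15, 1), (30, 0)]   # (30, 0) is outside the path
detections = [(5.1, 0.2)]             # only the near actor was detected
missed = missed_in_swept_path(actors, detections, swept)
```

Counting these misses per stand entry, rather than averaging them into a global AP, keeps the safety-relevant failures visible.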


Failure Modes

  • Calibration errors between fixed infrastructure and vehicle sensors create ghost boxes or shifted tracks.
  • Time synchronization errors are amplified in multi-agent fusion, especially for moving vehicles and pedestrians.
  • Infrastructure-heavy datasets can overstate performance if the deployment cannot install equivalent sensor coverage.
  • Sequential labels may hide dropped-frame or packet-delay behavior unless communication is simulated separately.
  • Large-range detection may improve vehicle recall while still missing low-profile objects close to the ego vehicle.
  • Tracking IDs can break when an object moves between infrastructure fields of view.
  • Public-road V2X classes do not cover aircraft, tugs, belt loaders, dollies, ULDs, cones, chocks, hoses, and FOD.
  • Weather and night coverage varies widely by dataset and should not be assumed from "real-world" alone.
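The synchronization bullet above can be made quantitative: for a moving object, a clock offset between agents appears as an along-track position error of roughly speed times offset, which matters once it exceeds the fusion association gate. The 0.5 m gate below is an illustrative assumption, not a dataset parameter.

```python
# Back-of-envelope sketch: apparent position error from a clock offset
# between agents, and whether it exceeds a (hypothetical) association gate,
# at which point fusion can produce ghost boxes or shifted tracks.
ASSOC_GATE_M = 0.5  # illustrative matching gate, not from any benchmark

def sync_position_error(speed_mps, clock_offset_s):
    """Worst-case along-track displacement of a moving object's fused state."""
    return speed_mps * clock_offset_s

def causes_ghost(speed_mps, clock_offset_s, gate_m=ASSOC_GATE_M):
    """True when the offset alone can push a detection outside the match gate."""
    return sync_position_error(speed_mps, clock_offset_s) > gate_m

# A tug at 5 m/s sits at the gate with 100 ms offset; an apron vehicle
# at 15 m/s blows through it.
for speed, offset_s in ((5.0, 0.1), (15.0, 0.1)):
    err = sync_position_error(speed, offset_s)
    print(f"{speed} m/s, {offset_s * 1000:.0f} ms -> "
          f"{err:.2f} m, ghost risk: {causes_ghost(speed, offset_s)}")
```

The same arithmetic explains why pedestrians (slow) tolerate looser sync budgets than vehicles at apron speed.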

AV, Indoor, Outdoor, And Airside Relevance

| Environment | Fit | Notes |
| --- | --- | --- |
| Public-road AV | Strong | These are built for cooperative driving perception and prediction. |
| Airport airside | Strong architecture proxy | Fixed infrastructure, controlled network, and repeated routes make V2X especially practical. |
| Indoor logistics | Moderate | Multi-agent and fixed-camera principles transfer, but sensors/maps differ. |
| Outdoor industrial sites | Strong | Similar intersections, yards, and occluded equipment corridors. |
| Long-range safety monitoring | Strong | Infrastructure can observe beyond ego line of sight. |

Airports may be easier than public roads in one respect: the operator can decide where to place infrastructure sensors and can mandate network participation. They are harder in another respect: aircraft geometry, stand equipment, GSE, reflective surfaces, and operating rules are outside public-road datasets.


Validation And Data-Engine Use

  1. Use V2X-Real and V2XPnP to compare collaboration modes before designing an airport sensor topology.
  2. Use V2XScenes to study large-range handoff, condition labels, and cooperative tracking under challenging traffic.
  3. Use UrbanIng-V2X and CoopScenes to stress multi-site calibration, synchronization, and OpenCOOD-style data conversion.
  4. Build an airport dataset with fixed stand cameras/LiDAR/radar, vehicle sensors, synchronized clocks, maps, and unique IDs across stands.
  5. Keep vehicle-only, infrastructure-only, and fused outputs in logs; the data engine needs to know which source found each object.
  6. Mine disagreement cases: infrastructure sees an object the vehicle misses, vehicle sees an object infrastructure misses, and fusion suppresses a correct single-agent detection.
  7. Add communication replay with latency, packet drop, stale transforms, and degraded collaborator trust before using V2X outputs in a safety argument.
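Step 7 can be prototyped offline before any real radio is involved. The sketch below pushes recorded collaborator messages through a hypothetical lossy, delayed channel and fuses only messages that have arrived and are still fresh; all function names and parameter values are illustrative assumptions.

```python
# Minimal communication-replay sketch (illustrative, not a dataset API):
# apply fixed latency and random packet drop to recorded collaborator
# messages, then fuse only messages that have arrived and are fresh enough.
import random

def replay_channel(messages, latency_s=0.05, drop_prob=0.1, seed=0):
    """messages: list of (sent_time_s, payload). Returns surviving messages
    as (arrival_time_s, sent_time_s, payload), sorted by arrival time."""
    rng = random.Random(seed)
    arrived = [
        (t + latency_s, t, payload)
        for t, payload in messages
        if rng.random() >= drop_prob
    ]
    return sorted(arrived)

def fuse_at(query_time_s, arrived, max_staleness_s=0.2):
    """Pick the newest message that has arrived and is not stale; else None,
    which is how a stale-transform failure should surface in the logs."""
    usable = [
        m for m in arrived
        if m[0] <= query_time_s and query_time_s - m[1] <= max_staleness_s
    ]
    return usable[-1][2] if usable else None

msgs = [(0.0, "boxes@0.0"), (0.1, "boxes@0.1"), (0.2, "boxes@0.2")]
arrived = replay_channel(msgs, latency_s=0.05, drop_prob=0.0)  # no drops: deterministic
```

Sweeping `latency_s`, `drop_prob`, and `max_staleness_s` over a logged scenario shows how quickly cooperative gains decay under realistic network behavior.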

Sources

Compiled from publicly available research notes.