Cross-Architecture Knowledge Base Gap Fill 2026-05-09

This report records the 2026-05-09 multi-agent web sweep and second-pass research fill for knowledge-base gaps outside the dedicated perception and SLAM method audits.

It complements the End-to-End AV Knowledge Gap Backlog, Continuous Research Loop, and Perception and SLAM Gap Fill. Its purpose is to turn broad backlog rows into source-backed, atomic-file candidates with enough evidence detail for the next writing wave.

Agent Workflow

| Stage | Agents | Output |
| --- | --- | --- |
| Web gap scouts | 6 | Non-duplicate gap candidates across foundations, autonomy, platform, runtime/cloud, safety, operations, and industry intelligence. |
| Fill-in researchers | 6 | Source-backed briefs for six disjoint clusters. |
| Integration | 1 | This synthesis report and promotion queue. |

The scout pass treated a topic as a stronger gap when it was absent as an atomic file, present only as a backlog row, or covered only indirectly in broad overview pages.

Highest-Leverage Gaps

| Priority | Gap Cluster | Why It Matters |
| --- | --- | --- |
| P0 | Safety standards to evidence | ISO/TS 5083, ISO 34504/34505, HARA/STPA/SOTIF, PLd/SIL, ML assurance, and FAA AGVS approval need assessor-ready artifact models, not scattered references. |
| P0 | Runtime data and model governance | Model releases, data lineage, replay-to-simulation, runtime config, observability, and label budgets must be traceable to one release and one deployed fleet state. |
| P0 | Platform integration reliability | Zonal wiring, middleware, certifiable compute, thermal design, data recorders, calibration bays, clocks, and service infrastructure define whether autonomy can operate repeatedly. |
| P0 | Foundational assurance methods | Real-time scheduling, rare-event statistics, hazard analysis, and ODD/scenario coverage are reusable first-principles layers that explain the safety argument. |
| P0 | Closed-loop autonomy evaluation | VLA/VLM, scenario libraries, risk forecasting, real-to-sim, and safety critics need closed-loop evidence before they can support airside decisions. |
| P0 | Cross-domain deployment evidence | The repo needs a neutral deployment ledger and regulatory map to distinguish deployed autonomy from vendor claims. |

30-Page Promotion Wave

The follow-up writing wave promoted 30 source-backed atomic pages from this report and the latest scout pass.

| Track | New pages |
| --- | --- |
| Safety standards and certification | Airside AGVS Regulatory Approval Playbook, Safety Functions PLd/SIL Validation, ML Assurance Data Governance, ISO/TS 5083 ADS Safety V&V, and IATA AHM 908 Autonomous GSE Standards. |
| Runtime, data, and release governance | Model Governance Release Evidence, Data Catalog, Lineage, and Quality Ops, Replay Scenario Mining Ops, Edge Runtime Supervision and Config Management, and Active Labeling Budget Ops. |
| Platform integration | Zonal E/E Harness and Connectors, Vehicle Middleware DDS/SOME-IP/Zenoh, Safety-Certified Runtime Compute, Vehicle Thermal Management, and AV Data Recorder and DSSAD Hardware. |
| Autonomy evaluation | Safety-Critical Scenario Libraries, Closed-Loop VLM/VLA Evaluation, Risk Forecasting for Long-Tail Planning, Real-to-Sim Closed-Loop Benchmarks, and Natural-Language Cooperative Autonomy. |
| Regulation and deployment evidence | 2024-2026 Autonomy Deployment Index, Cross-Domain Autonomy Regulatory Map, Airside AGVS FAA/CAAS Regulatory Map, US Road ADS Approval and Reporting, and EU ADS Type Approval 2022/1426 and 2026/481. |
| Company deployment evidence | Fox Robotics Production Deployment, Locus Robotics Production Deployment, ISEE Production Deployment, Outrider Tech Stack, and Kalmar Port Automation Tech Stack. |

Safety Standards and Certification Evidence

The safety section is already broad, but several high-value standards are not yet represented as first-class evidence packages. The missing layer is a set of practical files that turn standards into artifacts, traceability, and acceptance criteria.

| Priority | Proposed file | Gap Filled | Evidence Artifacts |
| --- | --- | --- | --- |
| P0 | 60-safety-validation/standards-certification/iso-ts-5083-ads-safety-vv.md | ADS safety design and V&V anchor for airside autonomy, adapted from road ADS scope. | ADS item definition, ODD, safety objectives, safety-by-design argument, V&V matrix, residual-risk owners, field-monitor triggers. |
| P0 | 60-safety-validation/verification-validation/iso-3450x-airside-scenario-evidence.md | Scenario categorization and test-case generation for assessor-ready airside evidence. | Scenario tag ontology, criticality scoring, OpenSCENARIO assets, test objectives, inputs, expected results, coverage dashboard. |
| P0 | 60-safety-validation/safety-case/airside-av-hara-stpa-sotif-analysis.md | Integrated airside item definition, HARA, STPA, FMEA, and SOTIF trigger catalogue. | Control structure, unsafe control actions, safety goals, constraints, scenario tests, runtime monitors, minimum-risk condition. |
| P0 | 60-safety-validation/standards-certification/safety-functions-pld-sil-validation.md | Machinery-style safety-function evidence for autonomous GSE. | PLr/SIL allocation, ISO 13849 blocks, IEC 62061/PFHd calculations, stop-distance tests, proof tests, fault injection. |
| P0 | 60-safety-validation/standards-certification/ml-assurance-data-governance.md | ML safety management and data governance release model. | AI safety plan, model/data cards, dataset lineage, ODD balance, leakage checks, change-impact analysis, rollback plan, AIMS records. |
| P0 | 60-safety-validation/standards-certification/airside-agvs-regulatory-approval-playbook.md | Airport sponsor and FAA AGVS approval workflow. | CONOPS, route map, human-monitor role, stakeholder notification, RF/aeronautical study, emergency procedures, signage, insurance, NOTAM checklist. |

Acceptance rule: every safety objective should trace to a design requirement, verification method, validation result, residual-risk owner, runtime monitor, and field trigger. A standard summary alone is not enough.
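
The acceptance rule can be checked mechanically once objectives are stored as structured records. The following is a minimal sketch, assuming a hypothetical `SafetyObjective` record whose `links` dictionary maps each link type to a list of artifact IDs; the class, field names, and ID formats are illustrative and not taken from any existing tool.

```python
from dataclasses import dataclass, field

# Link types named in the acceptance rule. The identifiers are hypothetical.
REQUIRED_LINKS = (
    "design_requirement",
    "verification_method",
    "validation_result",
    "residual_risk_owner",
    "runtime_monitor",
    "field_trigger",
)


@dataclass
class SafetyObjective:
    objective_id: str
    # Maps a link type to the IDs of the artifacts that satisfy it.
    links: dict[str, list[str]] = field(default_factory=dict)


def missing_links(objective: SafetyObjective) -> list[str]:
    """Return the required link types that have no referenced artifact."""
    return [name for name in REQUIRED_LINKS if not objective.links.get(name)]


def audit(objectives: list[SafetyObjective]) -> dict[str, list[str]]:
    """Map each incomplete objective ID to its missing link types."""
    report = {}
    for objective in objectives:
        gaps = missing_links(objective)
        if gaps:
            report[objective.objective_id] = gaps
    return report
```

An empty report from `audit` means every objective has at least one artifact per link type. It does not judge whether the linked artifacts are adequate; that remains an assessor decision.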

Runtime, Cloud, Data, and MLOps Operations

The current runtime and cloud sections cover telemetry, OTA, fleet SRE, map ops, and data pipelines. The missing layer is a release-evidence operating model that connects data, model, scenario, config, fleet state, and deployed behavior.

| Priority | Proposed file or update | Gap Filled | Operating Evidence |
| --- | --- | --- | --- |
| P0 | 50-cloud-fleet/mlops/model-governance-release-evidence.md or promote 50-cloud-fleet/mlops/model-lifecycle-governance.md | Model release evidence registry. | One release ID tying model registry version, dataset snapshot, training run, eval bundle, model card, approval, canary criteria, deployed fleet manifest, rollback trigger, and post-release logs. |
| P0 | 50-cloud-fleet/data-platform/data-catalog-lineage-quality-ops.md | AV data catalog, lineage, and quality operations. | Raw logs, extracted streams, labels, scenes, embeddings, splits, synthetic assets, deletion propagation, schema versions, quality gates, replay reproducibility. |
| P0 | 50-cloud-fleet/data-platform/replay-scenario-mining-ops.md | Operational pipeline from field events to replay, simulation, and regression tests. | Event mining, OpenLABEL/OpenSCENARIO assets, scenario tags, regression queues, ownership states, simulator compatibility records, acceptance metrics. |
| P0 | 40-runtime-systems/software-operations/edge-runtime-supervision-config-management.md | Runtime config control plane. | Signed config bundles, feature flags, schema validation, watchdog policy, local fallback, offline operation, compatibility matrix, last-known-safe rollback. |
| P1 | 50-cloud-fleet/data-platform/active-labeling-budget-ops.md or update selective upload | Fleet upload and labeling budget optimization. | Selected/not-selected audit trail, bandwidth/storage constraints, uncertainty and diversity scores, label ROI, bias checks. |
| P1 | 50-cloud-fleet/observability/ml-observability-trace-correlation.md | Model-aware trace correlation. | Model/config version, route/site/weather, input slices, inference latency, GPU pressure, drift, safety events, operator interventions, incident IDs. |
| P1 | Update 50-cloud-fleet/data-platform/synthetic-data-generation.md | Synthetic and world-model data governance. | Generator version, prompt/control provenance, real/synthetic mix policy, contamination checks, synthetic-to-real validation, evidence that synthetic data improved release metrics. |
| P1 | 50-cloud-fleet/cloud-operations/finops-capacity-planning.md | GPU and AI FinOps. | Cost per training run, replay batch, labeling batch, simulation hour, synthetic generation batch, queue policy, capacity-block decision records, utilization, idle burn. |

The highest-value file to write first is the model-governance release evidence page, because it can cross-link OTA, SUMS, data lineage, validation, safety case, and deployed fleet manifest content.
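
To make the "one release ID" idea concrete, the registry entry can be modeled as a single immutable record whose fields mirror the operating-evidence list in the table above. This is a minimal sketch: the `ReleaseEvidence` class, its field names, and the use of plain string references are assumptions for illustration, not an existing schema.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ReleaseEvidence:
    """One release ID tying together every evidence reference (hypothetical schema)."""
    release_id: str
    model_registry_version: str
    dataset_snapshot: str
    training_run: str
    eval_bundle: str
    model_card: str
    approval: str
    canary_criteria: str
    deployed_fleet_manifest: str
    rollback_trigger: str
    post_release_logs: str


def unresolved_fields(record: ReleaseEvidence) -> list[str]:
    """Return the names of fields that are still empty or whitespace-only."""
    return [f.name for f in fields(record) if not getattr(record, f.name).strip()]
```

A release could be treated as auditable only when `unresolved_fields` returns an empty list. In practice some fields, such as `post_release_logs`, can only be filled after deployment, so a real gate would likely distinguish pre-release from post-release completeness.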

Platform Hardware and Physical Integration

The platform tree is strong on sensors, compute, drive-by-wire, TSN, power, ruggedization, and diagnostics. The remaining gaps are integration artifacts: how physical wiring, middleware, timing, compute safety, heat, recorders, calibration, and service infrastructure are specified and verified as one vehicle.

| Priority | Proposed file or update | Gap Filled | Architecture Decisions |
| --- | --- | --- | --- |
| P0 | 20-av-platform/networking-connectivity/zonal-ee-harness-connectors.md | Zonal E/E, Single Pair Ethernet, sensor SerDes, connectors, serviceability. | Network classes, connector ratings, harness service loops, EMI controls, failure isolation, inspection and replacement evidence. |
| P0 | 40-runtime-systems/middleware/vehicle-middleware-dds-someip-zenoh.md | DDS, SOME/IP, Zenoh, ROS 2 RMW, AUTOSAR bridge policy. | Transport, discovery scope, QoS, deadline/liveliness, zero-copy path, serialization, bridge owner, monitor hooks. |
| P0 | 20-av-platform/compute/safety-certified-runtime-compute.md | Safety-certified OS, hypervisor, RTOS/Linux partitioning, safety island. | Safe-stop ownership, watchdog/reset tree, mixed-criticality scheduling, lockstep MCU handoff, autonomy-supervised-not-trusted boundary. |
| P0 | 20-av-platform/thermal/vehicle-thermal-management.md | Whole-vehicle thermal budget. | Worst-case ambient, solar load, sealed enclosure, GPU saturation, SSD sustained write, battery/charger heat, sensor heaters, derating policy. |
| P1 | 40-runtime-systems/data-logging/av-data-recorder-dssad-hardware.md | AV recorder and DSSAD/EDR evidence hardware. | Tamper-evident local retention, secure time, config/model versions, command state, operator state, sensor health, removable media chain of custody. |
| P1 | 20-av-platform/sensors/calibration-bay-fixtures.md and 50-cloud-fleet/operations/fleet-calibration-operations.md | Depot calibration fixtures and fleet calibration workflow. | Fixture ID, target geometry, floor conditions, serials, firmware, extrinsics, residuals, thresholds, work order, technician, drift closure. |
| P1 | Extend 20-av-platform/networking-connectivity/deterministic-networking-tsn.md or add whole-vehicle-timebase.md | Clock-domain and timestamp provenance policy. | Grandmaster ID, offset, drift, holdover state, timestamp origin, timestamp uncertainty, replay/DSSAD traceability. |
| P1 | 20-av-platform/power-electrical/autonomous-charging-service-infrastructure.md | Autonomous service bay and depot infrastructure. | Charging, cleaning, inspection, data sync, queue orchestration, human exclusion zones, fault recovery, emergency access, airport sponsor approval. |

The runtime rule should be simple: the safety island or an independent safety controller owns the permissive path to motion and safe stop; Linux/GPU autonomy provides proposals and evidence, not final safety authority.
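
The authority split in the runtime rule can be illustrated with a small arbitration function. This is a sketch under stated assumptions: `MotionProposal`, `SafetyState`, and `arbitrate` are hypothetical names, speed is the only commanded quantity, and a real safety controller would run on independent hardware rather than as Python code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MotionProposal:
    """Speed request from the Linux/GPU autonomy stack (advisory only)."""
    speed_mps: float


@dataclass(frozen=True)
class SafetyState:
    """State owned by the safety island or independent safety controller."""
    permissive: bool         # True only while motion is allowed
    speed_limit_mps: float   # cap imposed by the safety controller


def arbitrate(proposal: Optional[MotionProposal], safety: SafetyState) -> float:
    """Return the commanded speed; the safety state always has the final say."""
    # A missing proposal or a withdrawn permissive results in a stop command.
    if proposal is None or not safety.permissive:
        return 0.0
    # Autonomy may request less than the cap, never more, and never reverse here.
    return max(0.0, min(proposal.speed_mps, safety.speed_limit_mps))
```

The point of the design is that no value of `proposal` can produce motion when `safety.permissive` is false, so the autonomy stack stays supervised rather than trusted.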

Foundational Assurance Methods

The knowledge base has many strong estimation, optimization, geometry, probability, and machine-learning primers. The remaining foundations are not another perception or SLAM method; they are the systems-assurance methods that make runtime and safety arguments defensible.

| Priority | Proposed file | What It Should Teach | Airside Evidence Artifact |
| --- | --- | --- | --- |
| P0 | 10-knowledge-base/systems-engineering/real-time-scheduling-wcet-mixed-criticality.md | Task model (C, T, D, J), RM, deadline-monotonic, EDF, response-time analysis, WCET evidence tiers, mixed-criticality mode switch. | Timing budget for perception, planning, brake command, watchdog, safety monitor, teleop link, and minimum-risk condition. |
| P0 | 10-knowledge-base/probability-statistics/rare-event-statistics-safety-validation.md | Why zero failures only bounds risk, importance sampling, subset simulation, adaptive rare-event sampling, EVT tail modeling, SPRT gates. | Accelerated validation plan for aircraft clearance breach, pedestrian occlusion, FOD, towbar mismatch, stand-entry conflict, low-visibility docking. |
| P0 | 10-knowledge-base/systems-engineering/fault-trees-stpa-hazard-analysis.md | FMEA, FTA, event trees, minimal cut sets, STPA unsafe control actions, control-loop flaws, human/automation interactions. | Hazard-to-safety-goal trace that links component failures, unsafe control actions, SOTIF triggers, requirements, tests, and GSN claims. |
| P0 | 10-knowledge-base/systems-engineering/odd-scenario-ontology-coverage.md | ODD taxonomy, scenario categories, logical/concrete scenario generation, marginal/pairwise/t-wise/risk-weighted coverage. | Scenario coverage dashboard that maps hazards and ODD attributes to replay, simulation, HIL, VIL, and field evidence. |
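
As an illustration of what the real-time scheduling page could teach, the sketch below runs classical fixed-priority response-time analysis on the (C, T, D, J) task model with deadline-monotonic priorities. It assumes independent periodic tasks on one core, constrained deadlines (D <= T), no blocking, and integer time units; the task names and numbers in the usage note are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Task:
    name: str
    C: int      # worst-case execution time
    T: int      # period (minimum inter-arrival time)
    D: int      # relative deadline, assumed D <= T
    J: int = 0  # release jitter


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-a // b)


def response_time(task: Task, higher: List[Task]) -> Optional[int]:
    """Worst-case response time, or None if the deadline would be missed.

    Iterates w = C + sum(ceil((w + J_j) / T_j) * C_j) over higher-priority
    tasks j until w stops changing; the response time is then w + J.
    """
    w = task.C
    while True:
        interference = sum(ceil_div(w + hp.J, hp.T) * hp.C for hp in higher)
        w_next = task.C + interference
        if w_next + task.J > task.D:
            return None  # deadline exceeded, so the task is not schedulable
        if w_next == w:
            return w + task.J  # fixed point reached
        w = w_next


def analyze(tasks: List[Task]) -> Dict[str, Optional[int]]:
    """Assign deadline-monotonic priorities and analyze every task."""
    ordered = sorted(tasks, key=lambda t: t.D)  # shorter deadline = higher priority
    return {t.name: response_time(t, ordered[:i]) for i, t in enumerate(ordered)}
```

For example, `analyze([Task("brake_command", 1, 4, 4), Task("safety_monitor", 2, 6, 6), Task("planning", 3, 13, 13)])` returns `{"brake_command": 1, "safety_monitor": 3, "planning": 10}`, so all three hypothetical tasks meet their deadlines. A timing budget built this way is only as good as the WCET values fed into it, which is why the proposed page pairs the analysis with WCET evidence tiers.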

These are 10-knowledge-base pages, so promotion should include the visual/taxonomy pipeline in the same change: curated visual block, SVG asset, and taxonomy assignment.
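
For the rare-event statistics page proposed in the table above, the claim that zero failures only bounds risk can be shown with a short calculation. The sketch assumes independent, identically distributed trials with a constant per-trial failure probability, which field operations rarely satisfy exactly; the function names are hypothetical.

```python
import math


def zero_failure_upper_bound(n_trials: int, confidence: float = 0.95) -> float:
    """One-sided upper bound on per-trial failure probability after
    n_trials with zero observed failures (assumes i.i.d. trials)."""
    if n_trials <= 0:
        raise ValueError("n_trials must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be between 0 and 1")
    alpha = 1.0 - confidence
    # Solve (1 - p)^n = alpha for p.
    return 1.0 - alpha ** (1.0 / n_trials)


def trials_needed(target_rate: float, confidence: float = 0.95) -> int:
    """Smallest number of failure-free trials that bounds the failure
    probability at or below target_rate with the given confidence."""
    if not 0.0 < target_rate < 1.0:
        raise ValueError("target_rate must be between 0 and 1")
    alpha = 1.0 - confidence
    return math.ceil(math.log(alpha) / math.log(1.0 - target_rate))
```

At 95% confidence the bound is close to 3/n, the familiar "rule of three": 300 failure-free trials only show that the failure probability is below roughly 1 in 100. Demonstrating very low rates this way needs impractically many trials, which is the motivation for the importance sampling, subset simulation, and tail-modeling methods listed for that page.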

Closed-Loop Autonomy Evaluation

The autonomy stack already covers VLA/VLM reliability, end-to-end driving, planning, world models, and simulation. The current gap is a closed-loop evaluation protocol that ties language/action models, scenario libraries, risk forecasting, real-to-sim, and field-event regression into one release gate.

| Priority | Proposed update | Gap Filled | Evaluation Content |
| --- | --- | --- | --- |
| P0 | Update 30-autonomy-stack/vla-vlm/vlm-vla-reliability-benchmarks.md | Bench2ADVLM, SafeVL, DSBench, DriveAction. | Closed-loop VLM/VLA evaluation, safety-critic scoring, action-level checks, hallucination and prompt-failure slices. |
| P0 | Update 30-autonomy-stack/end-to-end-driving/airside-autonomy-benchmark-spec.md | Airside closed-loop VLA/VLM protocol. | Static QA, replay QA, CARLA/AWSIM closed loop, real-to-sim from airport logs, physical low-speed cart tests. |
| P0 | 60-safety-validation/verification-validation/safety-critical-scenario-libraries.md | Safety2Drive-style scenario library. | Regulatory scenario tags, corruptions, adversarial variants, hazard-to-scenario mapping, scenario-to-validation workflow. |
| P1 | 30-autonomy-stack/planning/risk-forecasting-long-tail.md or update motion prediction | RiskNet-style future risk fields. | Interaction risk for converging GSE, personnel, wingtip clearance, service-road merges, stand conflicts. |
| P1 | Update airside and neural simulation pages | RealEngine and DriveE2E real-to-sim closed-loop benchmark pattern. | Field-event mining, scene reconstruction, counterfactual actors/weather/lighting, simulator acceptance metrics. |
| P1 | Update 30-autonomy-stack/multi-agent-v2x/v2x-cooperative-planning.md | LangCoop as a bounded side-channel. | Natural-language intent summaries only with structured schema, provenance, expiry, contradiction handling, and safety limits. |
| P1 | Update 60-safety-validation/safety-case/safety-case-evidence-traceability.md | Driver-Simulator-Critic flywheel. | Field event to scenario to critic to fix to regression evidence to release gate. |

Release gate: no VLA/VLM update passes unless it improves or preserves closed-loop collision rate, intervention rate, rule violations, false-clear rate, hallucination rate, and scenario-family coverage. Safety critics should remain advisory unless backed by independent geometric and runtime assurance monitors.
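
The release gate reduces to a no-regression comparison between a baseline and a candidate evaluation. The following is a minimal sketch, assuming each evaluation is a flat dictionary of closed-loop metrics; the metric keys and the `tolerance` parameter are hypothetical, and the comparison ignores statistical uncertainty, which a real gate would need to address.

```python
from typing import Dict, List

# Metric keys are hypothetical names for the quantities in the release gate.
LOWER_IS_BETTER = (
    "collision_rate",
    "intervention_rate",
    "rule_violations",
    "false_clear_rate",
    "hallucination_rate",
)
HIGHER_IS_BETTER = ("scenario_family_coverage",)


def regressions(
    baseline: Dict[str, float],
    candidate: Dict[str, float],
    tolerance: float = 0.0,
) -> List[str]:
    """Return the metrics on which the candidate is worse than the baseline.

    A missing metric raises KeyError, so incomplete evaluations fail closed.
    """
    failed = []
    for name in LOWER_IS_BETTER:
        if candidate[name] > baseline[name] + tolerance:
            failed.append(name)
    for name in HIGHER_IS_BETTER:
        if candidate[name] < baseline[name] - tolerance:
            failed.append(name)
    return failed


def passes_release_gate(baseline: Dict[str, float], candidate: Dict[str, float]) -> bool:
    """True only if every gated metric is improved or preserved."""
    return not regressions(baseline, candidate)
```

Returning the list of regressed metrics, rather than a bare boolean, gives the release record a reason for each rejection. Safety-critic scores are deliberately absent from the gated metrics here, consistent with keeping critics advisory.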

Cross-Domain Operations, Regulation, and Deployment Evidence

The operations domain now covers airside, warehouse, yard, port, mining, agriculture, construction, trucking, robotaxi, and delivery robots. The remaining gap is cross-domain evidence: a regulatory map and source-dated deployment ledger that let readers compare maturity without relying on vendor narratives.

| Priority | Proposed file | Gap Filled | Required Fields |
| --- | --- | --- | --- |
| P0 | 80-industry-intel/regulations/cross-domain-autonomy-regulatory-map.md | Approval matrix across airside, warehouse, yard, port, mining, agriculture/construction, and sidewalk robots. | Domain, equipment class, site authority, standards, human supervision, reporting duty, approval artifacts, safety-function evidence. |
| P0 | 80-industry-intel/deployments/2024-2026-autonomy-deployment-index.md | Neutral deployment ledger. | Operator, site, ODD, vehicle type, scale, safety-driver status, approval basis, metrics, source date, caveats. |
| P0 | 60-safety-validation/standards-certification/airside-agvs-regulatory-approval-playbook.md | Airside deployment approval workflow. | Controlled tests, closed movement-area rules, human monitor, RF/aeronautical study, airport sponsor signoff, tenant coordination. |
| P1 | 80-industry-intel/companies/<company>/tech-stack.md | Deployment-proof company profiles. | ISEE, Outrider, Fox Robotics, Locus, Kalmar, Komatsu, Caterpillar, John Deere, Serve Robotics, Starship. |
| P1 | 70-operations-domains/cross-domain/mapping-operations/indoor-outdoor-map-ops-playbook.md | Cross-domain mapping operations. | Site survey, map owner, route approval, geofence release, map-change control, WMS/YMS/TOS/AODB integration. |
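
The required fields for the deployment ledger in the table above map naturally onto a typed record. This is a sketch only: `DeploymentEntry`, `is_stale`, and the 365-day default are assumptions chosen for illustration, and the free-text fields would likely become controlled vocabularies in a real ledger.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DeploymentEntry:
    """One ledger row; field names follow the required-fields list (hypothetical schema)."""
    operator: str
    site: str
    odd: str
    vehicle_type: str
    scale: str
    safety_driver_status: str
    approval_basis: str
    metrics: str
    source_date: date
    caveats: str = ""


def is_stale(entry: DeploymentEntry, as_of: date, max_age_days: int = 365) -> bool:
    """Flag entries whose source is older than max_age_days at the as_of date."""
    return (as_of - entry.source_date).days > max_age_days
```

Making `source_date` a mandatory typed field means an undated vendor claim cannot be entered at all, and `is_stale` gives reviewers a simple way to find rows that need a fresh source before they are cited as current.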

Transfer lessons:

  • Airside AGVS approval is staged: sponsor risk acceptance, route markings, remote supervision, worker redesign, and regulator coordination matter as much as vehicle autonomy.
  • Yard autonomy is the closest airside analog: mixed traffic, precise coupling/backing, YMS/WMS/TMS integration, factory acceptance, and site acceptance.
  • Warehouse and port autonomy show that repeatable brownfield workflows and control-system integration beat general autonomy claims.
  • Mining autonomy shows the value of site design, dispatch discipline, exclusion zones, maintenance workflows, and support tooling.
  • Sidewalk robots show that public or shared-space autonomy can fail politically before it fails technically; airside programs have analogous tenant, union, insurer, and airport-ops acceptance gates.

Promotion Queue

Use this order for the next atomic-file wave:

  1. Create the safety standards and evidence package: ISO/TS 5083, ISO 3450x scenario evidence, airside HARA/STPA/SOTIF, PLd/SIL safety functions, ML assurance governance, and airside AGVS approval.
  2. Create the runtime release-evidence package: model governance release evidence, data catalog/lineage/quality ops, replay-scenario mining ops, and edge runtime supervision/config management.
  3. Create the platform integration package: zonal E/E harness/connectors, vehicle middleware decision guide, safety-certified runtime compute, vehicle thermal management, and DSSAD recorder hardware.
  4. Create the four foundational assurance pages with the 10-knowledge-base visual/taxonomy pipeline.
  5. Update VLA/VLM reliability and airside benchmark specs with closed-loop evaluation, safety-critic, scenario-library, risk-forecasting, and real-to-sim gates.
  6. Add the cross-domain regulatory map and deployment index before expanding individual company profiles, so company pages can inherit a shared evidence schema.

Do Not Over-Promote Yet

Keep these as watchlist or advisory patterns until stronger evidence exists:

  • Direct VLA actuation without independent monitors.
  • Natural-language V2X as sole authority.
  • Synthetic/world-model data gains without contamination and holdout checks.
  • Company-scale claims without dated deployment-source ledgers.
  • Standards pages that summarize rules without producing artifact-level acceptance criteria.

Public research notes collected from public sources.