Model Governance and Release Evidence

Last updated: 2026-05-09

Why It Matters

An autonomy model release is not just a better checkpoint. It is a controlled change to vehicle behavior, data assumptions, safety evidence, runtime compatibility, and rollback posture. The release system must prove which model version is approved, what data and tests support it, where it is allowed to run, and how the fleet can return to the previous safe version.

Use this page for model release evidence. It does not replace OTA controls, software supply-chain evidence, or the safety case; it is the MLOps evidence packet that those systems consume.

Operating Model

  1. Register every deployable model in a model registry before release review. Use immutable model versions and mutable aliases such as candidate, shadow, champion, and rollback for deployment routing.
  2. Attach release metadata to the model version: training run ID, code commit, dataset snapshots, label schema, feature schema, calibration package, runtime container, hardware target, and operational design domain (ODD) scope.
  3. Treat release approval as a claims-and-evidence review. The release claim states what improved, what did not regress, which ODD is covered, and which operational risk is being reduced.
  4. Require named approval from the model owner, data owner, runtime owner, safety owner, and fleet operations owner before moving the champion alias.
  5. Move through the gates in order: offline metrics, scenario replay, shadow execution, limited canary, fleet expansion. Each gate promotes, holds, or rejects the exact model version under review.
  6. Keep rollback executable. The rollback model must be compatible with the active runtime, map schema, calibration schema, and config bundle.
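
The gate sequence and the approval rule for the champion alias can be sketched in a few lines of Python. This is a minimal in-memory illustration, not a registry integration: the `Candidate` class, the `move_alias` function, and the role identifiers are hypothetical names chosen for this example, and a real system would persist this state in the model registry.

```python
from dataclasses import dataclass, field
from enum import Enum

# Gate order and approver roles are taken from the operating model above.
GATES = ("offline_metrics", "scenario_replay", "shadow_execution",
         "limited_canary", "fleet_expansion")
REQUIRED_ROLES = frozenset({"model_owner", "data_owner", "runtime_owner",
                            "safety_owner", "fleet_operations_owner"})


class Decision(Enum):
    PROMOTE = "promote"
    HOLD = "hold"
    REJECT = "reject"


@dataclass
class Candidate:
    """Hypothetical record for one immutable model version."""
    version: str
    next_gate: int = 0      # index into GATES of the gate awaiting a decision
    rejected: bool = False
    approvals: dict = field(default_factory=dict)  # role -> named approver

    def decide(self, decision: Decision) -> str:
        """Apply a gate decision and return "<gate>:<decision>"."""
        if self.rejected:
            raise ValueError(f"{self.version} was rejected; register a new version")
        if self.next_gate >= len(GATES):
            raise ValueError(f"{self.version} has already cleared every gate")
        gate = GATES[self.next_gate]
        if decision is Decision.PROMOTE:
            self.next_gate += 1
        elif decision is Decision.REJECT:
            self.rejected = True
        # HOLD changes nothing: the same version waits at the same gate.
        return f"{gate}:{decision.value}"


def move_alias(aliases: dict, alias: str, candidate: Candidate) -> None:
    """Point a mutable alias at a version; champion needs every named approval."""
    if candidate.rejected:
        raise PermissionError("cannot route to a rejected version")
    if alias == "champion":
        missing = REQUIRED_ROLES - set(candidate.approvals)
        if missing:
            raise PermissionError(f"missing approvals: {sorted(missing)}")
    aliases[alias] = candidate.version
```

Because a decision applies to one `Candidate`, a retrained model has to enter as a new version and start again at the first gate, which keeps each promotion tied to the exact artifact that was reviewed.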

Evidence Artifacts

| Artifact | Minimum contents | Owner |
| --- | --- | --- |
| Model registry record | Registered model, immutable version, aliases, tags, release notes | MLOps |
| Training provenance | Run ID, code commit, dependency lock, training config, random seeds, hardware | Model owner |
| Dataset manifest | Iceberg/DVC snapshot IDs, label schema, excluded data, leakage checks | Data owner |
| Evaluation report | Primary metrics, calibration, uncertainty, class slices, airport and weather slices | Model owner |
| Scenario replay report | Required scenario suite, new mined scenarios, failures, waivers | Safety validation |
| Shadow-mode report | Disagreement with champion, intervention correlation, latency and resource use | Fleet operations |
| Safety case link | Claim IDs supported by this release and evidence IDs attached to each claim | Safety owner |
| Release decision record | Approvers, residual risks, rollout plan, rollback trigger, expiry date | Release manager |
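
A release review can start with a mechanical completeness check on the packet before anyone reads the contents. The sketch below assumes the packet is a plain dictionary keyed by artifact name; the key spellings and the `missing_artifacts` function are illustrative assumptions, while the artifact-to-owner mapping mirrors the table above.

```python
# Artifact keys are hypothetical identifiers; owners come from the table above.
REQUIRED_ARTIFACTS = {
    "model_registry_record": "MLOps",
    "training_provenance": "Model owner",
    "dataset_manifest": "Data owner",
    "evaluation_report": "Model owner",
    "scenario_replay_report": "Safety validation",
    "shadow_mode_report": "Fleet operations",
    "safety_case_link": "Safety owner",
    "release_decision_record": "Release manager",
}


def missing_artifacts(packet: dict) -> dict:
    """Return {artifact: owner} for every required artifact that is absent or empty."""
    return {
        name: owner
        for name, owner in REQUIRED_ARTIFACTS.items()
        if not packet.get(name)
    }


if __name__ == "__main__":
    # Placeholder values; a real packet would hold links or document IDs.
    packet = {"model_registry_record": "example-record", "dataset_manifest": ""}
    for name, owner in missing_artifacts(packet).items():
        print(f"{name}: ask {owner}")
```

Returning the owner alongside each gap tells the release manager whom to ask. An empty value counts as missing so that a placeholder entry cannot pass the check.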

Acceptance Checks

  • The model can be loaded by registry alias and by immutable version.
  • The model version has dataset, code, config, and runtime provenance sufficient to rebuild or explain the release.
  • The evaluation report includes both aggregate metrics and operational slices for airport zone, lighting, weather, vehicle platform, and object class.
  • No critical scenario replay regression is open without an approved safety waiver and an explicit operational mitigation.
  • Shadow-mode evidence covers the same ODD requested for release.
  • The release packet states which previous model version is the rollback target and verifies runtime compatibility.
  • The deployment decision references the relevant safety case claims and technical documentation record.
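
Two of these checks lend themselves to automation: whether shadow-mode evidence covers the requested ODD, and whether the rollback target is compatible with what the fleet currently runs. The sketch below is one possible encoding under stated assumptions. It models an ODD as three independent sets (airports, weather, vehicle platforms), which is a simplification because real evidence may not cover every combination. The `OddScope` class, the `rollback_mismatches` function, and all field names are hypothetical.

```python
from dataclasses import dataclass

# Compatibility dimensions named in the operating model for the rollback target.
COMPAT_KEYS = ("runtime", "map_schema", "calibration_schema", "config_bundle")


@dataclass(frozen=True)
class OddScope:
    """Simplified ODD: each dimension is an independent set of allowed values."""
    airports: frozenset
    weather: frozenset
    vehicle_platforms: frozenset

    def covers(self, requested: "OddScope") -> bool:
        """True when every requested value appears in this scope."""
        return (
            requested.airports <= self.airports
            and requested.weather <= self.weather
            and requested.vehicle_platforms <= self.vehicle_platforms
        )


def rollback_mismatches(active: dict, rollback_supports: dict) -> list:
    """List the dimensions where the rollback model does not support the active version.

    `active` maps each dimension to the version deployed on the fleet;
    `rollback_supports` maps each dimension to the set of versions the
    rollback model was validated against. Missing entries count as mismatches.
    """
    return [
        key for key in COMPAT_KEYS
        if active.get(key) not in rollback_supports.get(key, ())
    ]
```

An empty list from `rollback_mismatches` together with `shadow_scope.covers(requested_scope)` returning `True` would satisfy the fifth and sixth checks; any other result should hold the release.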

Failure Modes

| Failure mode | Consequence | Control |
| --- | --- | --- |
| Alias moved without evidence | Fleet runs a model that was not reviewed | Require signed release decision before alias mutation |
| Metric-only approval | Model improves averages while regressing rare safety cases | Gate on scenario replay and ODD slices |
| Dataset snapshot missing | Release cannot be reproduced or audited | Block release unless dataset IDs are immutable |
| Shadow evidence from a different ODD | Approval does not support target deployment | Tie evidence to airport, route, weather, and vehicle class |
| Runtime incompatibility | Model passes offline tests but fails on vehicle | Validate TensorRT/ONNX/runtime bundle before canary |
| Rollback model not executable | Recovery depends on a manual hotfix | Keep rollback alias and compatible artifact bundle current |
| Approval expires silently | Old evidence is reused after data or ODD drift | Require evidence expiry and periodic revalidation |
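
The expiry control in the last row can be enforced with a small date check run before any alias move. The sketch assumes each evidence item carries an explicit expiry date and fails closed: an item with no recorded expiry is treated as expired. The `expired_evidence` function and the evidence IDs are hypothetical.

```python
from datetime import date


def expired_evidence(evidence: dict, today: date) -> list:
    """Return the sorted IDs of evidence that is past its expiry date.

    `evidence` maps an evidence ID to its expiry date, or to None when no
    expiry was recorded. None is treated as expired (an assumption of this
    sketch) so that undated evidence cannot be reused indefinitely.
    An item is still valid on its expiry date itself.
    """
    return sorted(
        evidence_id
        for evidence_id, expires in evidence.items()
        if expires is None or expires < today
    )
```

A non-empty result would send the listed items back for revalidation instead of letting the release proceed on stale evidence.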

Related Pages

  • 40-runtime-systems/ml-deployment/production-ml-deployment.md
  • 40-runtime-systems/ml-deployment/av-cicd-devops-pipeline.md
  • 50-cloud-fleet/mlops/data-flywheel-airside.md
  • 50-cloud-fleet/ota/software-update-management-system-ops.md
  • 60-safety-validation/safety-case/safety-case-evidence-traceability.md
  • 60-safety-validation/verification-validation/testing-validation-methodology.md
  • 60-safety-validation/standards-certification/eu-ai-act-machinery-compliance-dossier.md

Sources

Public research notes collected from public sources.