Augmenting a Frenet-Frame Trajectory Planner with Learned World Model Cost Functions

Executive Summary

This report details a practical architecture for augmenting a classical Frenet-frame trajectory planner (Werling et al., 2010) with cost functions derived from a learned world model. The core idea: keep the proven, certifiable sampling-based planner that generates candidate trajectories, but add neural cost terms — occupancy collision cost, hazard proximity cost, predicted smoothness — evaluated by running each candidate through a world model. The classical planner remains the safety backbone. The world model provides richer scene understanding that hand-tuned heuristics cannot capture. This document covers the full stack from mathematical foundations through GPU inference engineering, ROS2 integration, safety constraints, and the eventual migration path toward fully learned planning.


1. Frenet Planner Fundamentals

1.1 The Werling (2010) Formulation

The seminal paper "Optimal Trajectory Generation for Dynamic Street Scenarios in a Frenet Frame" (Werling, Ziegler, Kammel, Thrun — ICRA 2010) introduced the dominant sampling-based planning paradigm for structured roads. The key insight: decompose vehicle motion into a longitudinal component along a reference path (arc-length coordinate s) and a lateral component perpendicular to it (offset coordinate d). In this curvilinear Frenet frame, highway-like planning problems become two independent 1D optimization problems.

Coordinate system. Given a reference path (road centerline), any position can be described by (s, d) where s is the distance traveled along the reference and d is the signed perpendicular offset. The Frenet-Serret frame provides local tangent and normal vectors at each point along the reference, enabling smooth coordinate transforms even on curved roads.

Why Frenet works for structured environments. Airport airside roads, taxiways, and service lanes are highly structured — well-defined lanes, known geometry, predictable curvature. This is the ideal operating domain for Frenet-frame planning, as the road structure provides natural reference paths and the operational design domain constrains the lateral offset range.

1.2 Trajectory Generation via Quintic Polynomials

Each candidate trajectory is generated by connecting the current vehicle state to a sampled terminal state using polynomial functions that minimize jerk (the derivative of acceleration).

Lateral trajectory generation. For high-speed mode, lateral motion d(t) is represented as a quintic (5th-order) polynomial:

d(t) = a0 + a1*t + a2*t^2 + a3*t^3 + a4*t^4 + a5*t^5

The six coefficients are determined by six boundary conditions:

  • Initial: d(0) = d0, d'(0) = d0_dot, d''(0) = d0_ddot
  • Terminal: d(T) = d1, d'(T) = 0, d''(T) = 0

Terminal velocity and acceleration are set to zero (smooth lane centering). This formulation inherently minimizes the integral of squared jerk, producing comfortable trajectories.
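The six boundary conditions reduce to a small linear solve for the top three coefficients. A minimal sketch with NumPy (the function name and argument layout are illustrative, not from a specific library):

```python
import numpy as np

def quintic_coeffs(d0, d0_dot, d0_ddot, d1, T):
    """Coefficients of d(t) = a0 + a1*t + ... + a5*t^5 from the six
    boundary conditions above (terminal velocity/acceleration are zero)."""
    a0, a1, a2 = d0, d0_dot, d0_ddot / 2.0   # initial conditions, directly
    # Terminal conditions d(T) = d1, d'(T) = 0, d''(T) = 0 as a 3x3 system
    A = np.array([
        [T**3,     T**4,     T**5],     # d(T)
        [3*T**2,   4*T**3,   5*T**4],   # d'(T)
        [6*T,      12*T**2,  20*T**3],  # d''(T)
    ])
    b = np.array([
        d1 - (a0 + a1*T + a2*T**2),
        -(a1 + 2*a2*T),
        -2*a2,
    ])
    a3, a4, a5 = np.linalg.solve(A, b)
    return np.array([a0, a1, a2, a3, a4, a5])
```

The quartic longitudinal case below is analogous, with a 2x2 system for the terminal velocity and acceleration conditions.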

Longitudinal trajectory generation. For velocity-keeping mode, longitudinal motion s(t) uses quartic (4th-order) polynomials:

s(t) = b0 + b1*t + b2*t^2 + b3*t^3 + b4*t^4

With five boundary conditions:

  • Initial: s(0) = s0, s'(0) = v0, s''(0) = a0
  • Terminal: s'(T) = v_target, s''(T) = 0

For stopping/merging modes, quintic polynomials are used with an explicit terminal position constraint.

1.3 Candidate Sampling Strategy

The planner generates a combinatorial set of candidates by sampling across three dimensions:

Parameter          Symbol     Typical Range                        Step Size
Lateral offset     d1         -MAX_ROAD_WIDTH to +MAX_ROAD_WIDTH   0.5–1.0 m
Time horizon       T          MIN_T to MAX_T (e.g. 3.0–5.0 s)      0.2–0.5 s
Target velocity    v_target   v_desired +/- delta_v                1.0–2.0 m/s

For a typical configuration with 7 lateral samples, 5 time samples, and 5 velocity samples, this yields 175 candidate trajectories per planning cycle. Production systems like TUM's Frenetix generate 400–800+ candidates and evaluate them in under 8 ms using C++ acceleration.

Each lateral trajectory is combined with each longitudinal trajectory, transformed from Frenet back to Cartesian coordinates, and checked for feasibility (curvature limits, acceleration bounds).
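The combinatorial sampling can be sketched as a product over the three dimensions; the range constants below are hypothetical values consistent with the table above:

```python
import itertools
import numpy as np

# Hypothetical configuration matching the 7 x 5 x 5 example in the text
MAX_ROAD_WIDTH = 3.0                        # m, lateral sampling half-width
D_STEP = 1.0                                # m
MIN_T, MAX_T, T_STEP = 3.0, 5.0, 0.5        # s
V_DESIRED, DELTA_V, V_STEP = 8.0, 2.0, 1.0  # m/s

def sample_terminal_states():
    """Enumerate (d1, T, v_target) terminal states for candidate generation."""
    d_samples = np.arange(-MAX_ROAD_WIDTH, MAX_ROAD_WIDTH + 1e-9, D_STEP)
    t_samples = np.arange(MIN_T, MAX_T + 1e-9, T_STEP)
    v_samples = np.arange(V_DESIRED - DELTA_V, V_DESIRED + DELTA_V + 1e-9, V_STEP)
    return list(itertools.product(d_samples, t_samples, v_samples))
```

Each tuple then parameterizes one lateral quintic and one longitudinal quartic, giving 7 * 5 * 5 = 175 candidates for this configuration.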

1.4 Standard Cost Functions

The total cost for a candidate trajectory combines lateral and longitudinal components:

C_total = K_LAT * C_lat + K_LON * C_lon

Lateral cost:

C_lat = K_J * J_d + K_T * T + K_D * d1^2

Where:

  • J_d = sum of squared lateral jerk (comfort)
  • T = time horizon (prefer faster maneuvers)
  • d1^2 = squared final lateral offset (prefer lane center)

Longitudinal cost (velocity-keeping mode):

C_lon = K_J * J_s + K_T * T + K_V * (v_target - v_final)^2

Where:

  • J_s = sum of squared longitudinal jerk
  • (v_target - v_final)^2 = velocity deviation penalty

Typical weight values (from PythonRobotics reference implementation):

Weight          Value   Purpose
K_J             0.1     Jerk penalty
K_T             0.1     Time penalty
K_D             1.0     Lateral offset penalty
K_V (K_S_DOT)   1.0     Velocity deviation
K_LAT           1.0     Lateral-longitudinal balance
K_LON           1.0     Lateral-longitudinal balance

After cost evaluation, candidates are filtered by feasibility (max curvature, max acceleration) and collision-checked against known obstacles. The lowest-cost feasible, collision-free trajectory is selected.
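The cost combination above, with the weight table's defaults, can be sketched as follows (the function signature and jerk arrays are illustrative):

```python
import numpy as np

# Defaults from the weight table above
K_J, K_T, K_D, K_V, K_LAT, K_LON = 0.1, 0.1, 1.0, 1.0, 1.0, 1.0

def classical_cost(d_jerk, s_jerk, T, d1, v_target, v_final):
    """C_total = K_LAT * C_lat + K_LON * C_lon for one candidate.

    d_jerk, s_jerk: sampled lateral/longitudinal jerk along the trajectory.
    """
    c_lat = K_J * np.sum(d_jerk**2) + K_T * T + K_D * d1**2
    c_lon = K_J * np.sum(s_jerk**2) + K_T * T + K_V * (v_target - v_final)**2
    return K_LAT * c_lat + K_LON * c_lon
```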

1.5 Path Tracking: Stanley and Pure Pursuit

Once a trajectory is selected, a tracking controller follows it.

Pure Pursuit. A geometric controller that computes the steering angle by chasing a look-ahead point on the reference path at a fixed distance ahead of the rear axle. Steering angle delta = arctan(2 * L * sin(alpha) / L_d), where L is the wheelbase, alpha is the angle to the look-ahead point, and L_d is the look-ahead distance. Simple and stable at low speeds, but it tracks poorly at high speed and cuts corners.

Stanley Controller. Uses the front axle as reference and considers both heading error psi_e and cross-track error e:

delta = psi_e + arctan(k * e / v)

Where k is a gain parameter and v is vehicle speed. Stanley provides tighter tracking with better cross-track error convergence but can produce aggressive steering inputs.

For airside operations (low speeds, 5–25 km/h, tight maneuvers around aircraft), a hybrid approach is practical: Pure Pursuit for straight segments, Stanley for tight curves, with MPC for scenarios requiring precise constraint satisfaction near obstacles.
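Both geometric control laws are one-liners; a sketch of each (the gain default and the low-speed epsilon guard are illustrative additions):

```python
import math

def pure_pursuit_steer(alpha, wheelbase, lookahead):
    """delta = arctan(2 * L * sin(alpha) / L_d), as given above."""
    return math.atan2(2.0 * wheelbase * math.sin(alpha), lookahead)

def stanley_steer(heading_error, cross_track_error, speed, k=1.0, eps=0.1):
    """delta = psi_e + arctan(k * e / v); eps keeps low speeds well-behaved."""
    return heading_error + math.atan2(k * cross_track_error, max(speed, eps))
```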


2. Adding World Model Cost Functions

2.1 Architecture: Classical Planner + Neural Scorer

The core architecture separates trajectory generation from trajectory scoring:

                    ┌─────────────────────────┐
                    │   Frenet Planner Core    │
                    │  (Trajectory Generation) │
                    │                          │
                    │  Sample N candidates     │
                    │  Compute classical costs │
                    │  Filter infeasible       │
                    └──────────┬──────────────┘

                    N candidates (positions, velocities, accelerations)

              ┌────────────────┼────────────────┐
              │                │                │
              v                v                v
    ┌─────────────┐  ┌─────────────┐  ┌─────────────────┐
    │  Classical   │  │ World Model │  │   Combined      │
    │  Cost        │  │ Cost        │  │   Selection     │
    │  (jerk, d,   │  │ (occupancy, │  │                 │
    │   velocity)  │  │  hazard,    │  │  C = alpha*C_cl │
    │              │  │  smoothness)│  │    + beta*C_wm  │
    └─────────────┘  └─────────────┘  └─────────────────┘

The world model evaluates each candidate trajectory by:

  1. Rolling out the trajectory through its predicted future scene state
  2. Computing neural cost terms based on predicted occupancy, agent interactions, and scene evolution
  3. Returning bounded cost values that are combined with classical costs

2.2 Occupancy Collision Cost

The world model predicts future occupancy grids — spatial probability maps showing where obstacles, vehicles, and pedestrians will be at each future time step.

Formulation:

C_occ = sum_{t=0}^{T} sum_{cells in ego_footprint(t)} P_occupied(cell, t) * gamma^t

Where:

  • P_occupied(cell, t) is the predicted occupancy probability at grid cell for time step t
  • ego_footprint(t) is the set of grid cells covered by the ego vehicle at time t along the candidate trajectory
  • gamma is a temporal discount factor (e.g., 0.95) — near-term collisions matter more

Implementation detail. The ego vehicle footprint is rasterized onto the occupancy grid at each trajectory waypoint. The predicted occupancy can come from:

  • 3D occupancy networks (OccWorld-style) predicting voxel occupancy evolution
  • BEV occupancy flow models predicting 2D top-down occupancy + motion vectors
  • Agent-based prediction converted to occupancy (Gaussian splats from predicted trajectories)

For airside operations, the occupancy model should be trained or fine-tuned on airside-specific actors: aircraft (large, slow-moving), ground support equipment (GSE), fuel trucks, baggage tugs, and pedestrians (ground crew, marshals).

Cost normalization. Raw occupancy collision cost is bounded: C_occ_normalized = min(C_occ / C_occ_max, 1.0). This prevents a single high-confidence prediction from dominating the cost.
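Putting the formula, the footprint rasterization, and the normalization together (array shapes and the c_max bound are illustrative):

```python
import numpy as np

def occupancy_collision_cost(pred_occ, footprints, gamma=0.95, c_max=5.0):
    """Discounted occupancy mass swept by the ego footprint, normalized.

    pred_occ:   [T, H, W] predicted occupancy probabilities per timestep.
    footprints: [T, H, W] boolean ego-footprint masks along one candidate.
    """
    T = pred_occ.shape[0]
    discounts = gamma ** np.arange(T)            # gamma^t, near-term weighted
    raw = sum(discounts[t] * pred_occ[t][footprints[t]].sum() for t in range(T))
    return min(raw / c_max, 1.0)                 # bounded normalization
```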

2.3 Hazard Proximity Cost

Beyond binary collision detection, the world model provides soft proximity penalties based on predicted distances to hazards.

Formulation:

C_hazard = sum_{t=0}^{T} sum_{agents} max(0, d_safe - d_predicted(agent, t))^2 / d_safe^2

This penalizes trajectories that bring the ego vehicle within a safety margin of predicted agent positions, with quadratic penalty increasing as distance decreases.

Time-to-collision (TTC) component:

C_ttc = sum_{t=0}^{T} max(0, TTC_threshold - TTC(t)) / TTC_threshold

The world model's predicted agent trajectories enable more accurate TTC computation than rule-based constant-velocity assumptions, particularly for:

  • Vehicles executing turns or lane changes
  • GSE with erratic stopping patterns
  • Pedestrians crossing apron areas

Learned hazard proximity extends beyond geometric distance. The world model can learn that certain configurations are dangerous even at seemingly safe distances — e.g., a fuel truck approaching a running engine, or a baggage cart near an aircraft door zone.
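The two penalty terms translate directly into vectorized form (array shapes and threshold defaults are illustrative):

```python
import numpy as np

def hazard_proximity_cost(pred_dists, d_safe=5.0):
    """Quadratic penalty inside the safety margin.

    pred_dists: [T, A] predicted ego-to-agent distances per timestep/agent.
    """
    violation = np.maximum(0.0, d_safe - pred_dists)
    return float(np.sum(violation**2) / d_safe**2)

def ttc_cost(ttc, ttc_threshold=4.0):
    """Penalty for predicted time-to-collision below the threshold.

    ttc: [T] per-timestep TTC from the world model's agent predictions.
    """
    return float(np.sum(np.maximum(0.0, ttc_threshold - ttc)) / ttc_threshold)
```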

2.4 Predicted Smoothness Cost

The world model can evaluate trajectory smoothness in the context of predicted scene evolution, not just kinematic jerk.

Reactive smoothness. A trajectory may be kinematically smooth but require abrupt corrections if the scene evolves unfavorably. The world model scores trajectories by rolling them out against predicted futures and measuring:

C_smooth = sum_{t=1}^{T} ||a_predicted(t) - a_planned(t)||^2

Where a_predicted(t) is the acceleration the ego would need to remain safe given the predicted scene at time t, and a_planned(t) is the planned acceleration. A trajectory that remains feasible without modification scores low; one that would require emergency braking scores high.

Passenger comfort proxy. For airport shuttle/transport vehicles, the world model can learn comfort-relevant features beyond jerk:

  • Anticipated stop-and-go patterns near gates
  • Queuing behavior behind other GSE
  • Smooth deceleration profiles approaching aircraft stands

2.5 Combining Classical and World Model Costs

The final trajectory cost is a weighted combination:

C_total = alpha * C_classical + beta * C_world_model

Where:

C_classical = K_LAT * C_lat + K_LON * C_lon + C_collision_check
C_world_model = w_occ * C_occ + w_hazard * C_hazard + w_smooth * C_smooth

Weight scheduling. During initial deployment, alpha >> beta (classical dominates). As confidence in the world model grows through validation, beta increases. This is a continuous dial, not a binary switch.

Candidate selection with dual scoring:

  1. Compute C_classical for all N candidates
  2. Rank by classical cost; take top-M (e.g., M = N/2) as pre-filtered set
  3. Evaluate world model costs only on top-M candidates (saves GPU inference)
  4. Compute C_total for top-M; select the minimum

This two-stage approach reduces GPU inference load while ensuring that only classically-reasonable trajectories are evaluated by the world model.
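The four steps can be sketched as one selection function; the wm_cost_fn callback is a stand-in for the batched GPU evaluation described later:

```python
import numpy as np

def select_trajectory(classical_costs, wm_cost_fn, alpha=1.0, beta=0.5, top_m=None):
    """Two-stage dual scoring: classical pre-rank, world model on top-M only.

    classical_costs: [N] classical costs for all candidates.
    wm_cost_fn:      maps an index array -> world model costs for those candidates.
    """
    n = len(classical_costs)
    m = top_m if top_m is not None else n // 2
    top_idx = np.argsort(classical_costs)[:m]      # steps 1-2: classical top-M
    wm_costs = wm_cost_fn(top_idx)                 # step 3: WM scoring, M only
    total = alpha * classical_costs[top_idx] + beta * wm_costs   # step 4
    return int(top_idx[np.argmin(total)])
```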


3. Practical Implementation

3.1 Batched GPU Inference for N Candidates

The key engineering challenge is evaluating the world model on N candidate trajectories within the planning cycle budget (typically 50–100 ms at 10–20 Hz).

Batching architecture:

python
# Pseudocode for batched world model evaluation
def evaluate_candidates_batch(world_model, scene_state, candidates):
    """
    scene_state: BEV features, agent states [B=1, C, H, W]
    candidates:  trajectory waypoints [N, T, 2]  (x, y per timestep)
    """
    N = candidates.shape[0]

    # Expand scene state to batch dimension (a view, not a copy)
    scene_batch = scene_state.expand(N, -1, -1, -1)  # [N, C, H, W]

    # Encode trajectory candidates as action embeddings
    action_embeddings = trajectory_encoder(candidates)  # [N, T, D]

    # Single forward pass through the world model:
    # predicts the future scene conditioned on each candidate trajectory
    predicted_futures = world_model(scene_batch, action_embeddings)  # [N, T, C, H, W]

    # Compute costs from predicted futures
    occ_costs = occupancy_collision_cost(predicted_futures, candidates)  # [N]
    hazard_costs = hazard_proximity_cost(predicted_futures, candidates)  # [N]
    smooth_costs = smoothness_cost(predicted_futures, candidates)        # [N]

    return occ_costs, hazard_costs, smooth_costs

Performance targets. A BEV world-model trajectory-evaluation study (arXiv 2504.01941) reports that scoring 256 trajectories through a transformer-based world model takes 18.7 ms end-to-end on an NVIDIA L20 GPU, well within the real-time budget of a 20 Hz planner.

Key optimizations:

  • Shared scene encoding. The expensive BEV feature extraction is done once, then reused across all N trajectory evaluations. Only the trajectory-conditioned decoding varies per candidate.
  • TensorRT compilation. Convert PyTorch models to TensorRT for 2–5x inference speedup. Use FP16 precision for further gains with minimal accuracy loss.
  • CUDA graphs. Capture the inference computation graph once, then replay it each cycle. Eliminates kernel launch overhead for repeated identical graph shapes.
  • Dynamic batching. If N varies (due to pre-filtering), use padding to maintain fixed batch size for optimal GPU utilization.

3.2 Pre-Filtering Strategy

Not all candidates deserve GPU evaluation. A two-stage filter dramatically reduces computational load:

Stage 1: Classical filter (CPU, < 1 ms)

  • Discard kinematically infeasible trajectories (curvature > max, acceleration > max)
  • Discard trajectories with immediate collision (overlap with known static obstacles)
  • Discard trajectories leaving drivable area

Stage 2: Cheap heuristic filter (CPU, < 1 ms)

  • Rank remaining candidates by classical cost
  • Take top-K candidates (e.g., K = 50–100 from 400+ initial samples)
  • These are the candidates sent to GPU for world model evaluation

Result: From 400+ initial samples, 50–100 reach the GPU. At 18.7 ms for 256 candidates, 50–100 candidates should evaluate in under 10 ms.

3.3 C++/Python Interop Architecture

Production autonomous driving stacks are predominantly C++ (ISO 26262 compliance, real-time guarantees). The world model is trained in Python/PyTorch. Bridging them requires careful engineering.

Option A: TensorRT C++ API (Recommended for production)

┌──────────────────────────────┐
│  C++ Planner Node            │
│                              │
│  1. Generate candidates      │
│  2. Classical cost (C++)     │
│  3. Pre-filter               │
│  4. Fill input tensors       │  ──→  GPU shared memory
│  5. Call TensorRT engine     │  ←──  GPU shared memory
│  6. Read cost tensors        │
│  7. Select best trajectory   │
└──────────────────────────────┘

The world model is exported from PyTorch to ONNX, then compiled to a TensorRT engine. The C++ planner loads the engine and executes inference natively. No Python runtime involvement at inference time.

Option B: Shared memory bridge (Development/prototyping)

┌───────────────────┐     Shared Memory      ┌───────────────────┐
│  C++ Planner      │ ◄──────────────────────►│  Python WM Server │
│                   │   CUDA IPC handles       │                   │
│  candidates →     │   or POSIX shm           │  ← candidates     │
│  costs ←          │                          │  → costs           │
└───────────────────┘                          └───────────────────┘

Using CUDA IPC memory handles, GPU tensors can be shared between processes without copy. The C++ planner writes candidate trajectories to shared GPU memory; the Python process reads them, runs inference, and writes costs back. Latency overhead is ~0.1 ms for the IPC handshake.
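A CPU stand-in for the POSIX-shm variant, using Python's multiprocessing.shared_memory (names and shapes are illustrative; the CUDA IPC version follows the same write/attach pattern with device memory instead of host memory):

```python
import numpy as np
from multiprocessing import shared_memory

SHAPE, DTYPE = (4, 10, 2), np.float32     # [N, T, (x, y)] candidate waypoints

def planner_write(name, candidates):
    """'C++ planner' side: create a named block and fill it with waypoints."""
    shm = shared_memory.SharedMemory(create=True, name=name,
                                     size=int(np.prod(SHAPE)) * 4)
    np.ndarray(SHAPE, dtype=DTYPE, buffer=shm.buf)[:] = candidates
    return shm   # keep alive until the scorer has read

def scorer_read(name):
    """'Python WM server' side: attach by name, read directly from the block."""
    shm = shared_memory.SharedMemory(name=name)
    out = np.ndarray(SHAPE, dtype=DTYPE, buffer=shm.buf).copy()
    shm.close()
    return out
```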

Option C: PyBind11 in-process (Research/testing)

cpp
// C++ side (sketch): embed the interpreter and call the Python scorer
#include <pybind11/embed.h>
#include <pybind11/numpy.h>
namespace py = pybind11;

py::scoped_interpreter guard{};
py::object wm_module = py::module_::import("world_model_scorer");
py::object score_fn = wm_module.attr("score_trajectories");
auto costs = score_fn(candidates_numpy).cast<py::array_t<float>>();

Embeds the Python interpreter in the C++ process. Simple but introduces GIL contention and Python runtime overhead. Suitable for research prototyping only.

3.4 Memory Layout and Zero-Copy Transfers

Trajectory tensor format. Candidates should be stored in a contiguous GPU buffer:

candidates: float32[N, T, 5]  // (x, y, heading, velocity, curvature) per waypoint

Scene state format. BEV features from the perception pipeline:

bev_features: float16[C, H, W]  // e.g., [256, 200, 200] for 100m x 100m at 0.5m resolution

Zero-copy pipeline:

  1. C++ planner allocates candidates tensor in CUDA pinned memory
  2. Fill trajectory waypoints (CPU computation, but memory is GPU-accessible)
  3. cudaMemcpyAsync to device memory (overlaps with other computation)
  4. TensorRT inference on device
  5. Cost results remain in device memory; cudaMemcpyAsync back for final selection

Use DLPack or __cuda_array_interface__ for framework-agnostic, zero-copy tensor exchange between C++, Python, PyTorch, and TensorRT.


4. Existing Work: Learned Planning in Practice

4.1 Think2Drive (ECCV 2024)

Think2Drive is the first model-based reinforcement learning method for autonomous driving. It uses a latent world model based on DreamerV3 architecture.

Architecture:

  • World model: Learns environment transitions in a compact latent space — state transitions, rewards, and termination conditions
  • Planner: Trained entirely within the world model's latent space ("thinking" in imagination)
  • Input: 3D occupancy as scene representation

Key results: Achieved expert-level driving in CARLA v2 within 3 days on a single A6000 GPU. First method to achieve 100% route completion on CARLA v2 scenarios.

Relevance to Frenet augmentation: Think2Drive demonstrates that world models can serve as effective neural simulators for planner training. For augmenting a Frenet planner, a similar world model could be trained to predict scene evolution, but used at inference time for trajectory scoring rather than for policy learning. The latent space efficiency (low-dimensional state, parallel tensor computation) is directly applicable to our batched evaluation architecture.

4.2 WorldRFT (AAAI 2026)

WorldRFT combines latent world models with reinforcement fine-tuning (RFT) for end-to-end driving.

Architecture:

  • Spatial-Aware World Encoder (SWE): Processes multi-view images through ResNet backbone, integrates VGGT (visual geometry grounded transformer) frozen tokens via cross-attention for 3D spatial awareness
  • Hierarchical Planning Refinement (HPR): Three parallel subtasks — target region localization (probabilistic Laplace distributions), spatial path planning (2m intervals), temporal trajectory prediction (0.5s intervals)
  • Local-Aware Iterative Refinement: K iterations of deformable-convolution-based feature sampling along projected trajectory points

Reinforcement fine-tuning: Uses Group Relative Policy Optimization (GRPO) with collision-aware rewards (-1 for collision, 0 otherwise). Trajectory Gaussianization recasts deterministic regression as probabilistic modeling with auxiliary variance networks, enabling sampling-based exploration.

Safety results: 83% collision rate reduction on nuScenes (0.30% to 0.05%). The RFT approach specifically improves safety-critical performance beyond what supervised learning alone achieves.

Relevance: The collision-aware reward formulation and GRPO optimization could be adapted for fine-tuning the world model cost function in our architecture. The iterative refinement mechanism (K iterations with deformable attention) is a good pattern for refining trajectory scores based on local scene features.

4.3 DiffusionDrive (CVPR 2025 Highlight)

DiffusionDrive uses a truncated diffusion model for end-to-end trajectory generation.

Architecture:

  • Anchored Gaussian distribution: Instead of denoising from pure noise, starts from K-means clustered trajectory anchors (20 anchors from training data)
  • Truncated schedule: Only 50/1000 diffusion steps during training; 2 denoising steps at inference
  • Cascade transformer decoder: Stacked transformer layers with deformable spatial cross-attention to BEV/perspective features

Performance: 88.1 PDMS on NAVSIM at 45 FPS on an NVIDIA 4090, using 10x fewer denoising steps than a vanilla diffusion policy and running 6x faster at inference.

Relevance to Frenet augmentation: DiffusionDrive's anchor-based approach mirrors the Frenet planner's sampling philosophy — both generate diverse candidates and score them. The key insight is that "human drivers adhere to established driving patterns," justifying starting from anchored proposals rather than random noise. A hybrid could use Frenet-generated candidates as anchors for a diffusion-based refinement step.

4.4 NVIDIA GTRS (NAVSIM v2 Challenge Winner, 2025)

Generalized Trajectory Scoring is NVIDIA's winning approach to the NAVSIM v2 Challenge.

Architecture:

  • Diffusion-based trajectory generator: Produces 100 diverse fine-grained proposals conditioned on BEV features
  • Vocabulary Generalization Scorer: Transformer decoder trained on super-dense vocabulary (16,384 trajectories), tested on 8,192. Uses trajectory dropout during training for robustness
  • Sensor-Augmented Scorer: Horizontal rotation perturbations for out-of-domain robustness with EMA teacher self-distillation

Scoring pipeline: Coarse trajectory sets covering broad situations combined with fine-grained trajectories for safety-critical scenarios. A transformer decoder distilled from perception-dependent metrics (safety, comfort, traffic compliance) progressively filters candidates.

Relevance: GTRS directly validates the "generate-then-score" paradigm. The vocabulary generalization technique — training the scorer on more trajectories than it sees at inference — is directly applicable to our Frenet candidates. The scorer learned to generalize across trajectory distributions rather than memorizing specific patterns.

4.5 Comma.ai's World Model Approach (Production, 2025–2026)

Comma.ai shipped the first production driving model trained entirely in a learned world model simulator.

Architecture:

  • Compressor Model: Based on Stable Diffusion's image VAE, compresses world states to lower-dimensional latent representations
  • Dynamics Model: Video Diffusion Transformer that models latent space dynamics — predicts future states given history and actions
  • Plan Head: Added to the dynamics model, predicts the trajectory to take. Trained on human driving data as ground truth

Key innovation — future anchoring: Providing only past states led to offline training failures (model couldn't recover from errors). Solution: anchor the model on future observations, allowing it to "know where the car is going to be" and recover from prediction mistakes.

Production deployment: openpilot 0.11 (2025) is the first real-world robotics agent fully trained in a learned simulation, shipped to consumer vehicles. This replaces hand-coded MPC planners entirely.

Relevance: Comma's approach represents the end state of our migration path. Their journey — from hand-coded MPC to MPC augmented with learned models to fully learned planning — is exactly the trajectory this augmented Frenet architecture enables. The future-anchoring technique addresses a fundamental challenge in world model training that applies to our setting.

4.6 Tesla FSD: MCTS + Neural Scoring (Historical, pre-2024)

Before transitioning to end-to-end learning, Tesla's FSD used a hybrid architecture directly relevant to Frenet augmentation:

  • Tree search: Monte Carlo Tree Search (MCTS) generates candidate action sequences
  • Neural network guidance: A trained network provides heuristics to guide the search, assessing which paths are most promising
  • Trajectory scoring: Combined classical physics-based checks (collision, comfort) with neural network evaluators predicting intervention likelihood and human-likeness

This is precisely the "classical generator + neural scorer" pattern proposed for Frenet augmentation. Tesla's experience validates the approach but also shows its limitations — the hand-off between classical generation and neural scoring creates a bottleneck that eventually motivated their move to end-to-end learning.

4.7 AdaWM: Adaptive World Model Planning (ICLR 2025)

AdaWM addresses a critical practical problem: performance degradation when adapting pretrained world models to new tasks.

Core insight: When pretrained dynamics models and policies are applied to new domains, distribution shift causes two types of mismatch — dynamics model mismatch (model fails to capture new environment) and policy mismatch (pretrained policy becomes suboptimal).

Solution: Quantify which mismatch dominates using Total Variation distance, then selectively update either the policy or the model using LoRA-based low-rank adaptation. Decision criterion: "Update dynamics model if model error >= C1 * policy error - C2" where C1, C2 are theoretically derived constants.

Results: Significantly outperforms baselines on challenging CARLA tasks (roundabouts, dense traffic left turns) with only 1 hour of fine-tuning on a single V100 GPU after 12 hours of pretraining.

Relevance: AdaWM's adaptive fine-tuning approach is directly applicable to deploying world model costs in new airside environments. When transitioning between airports or operating conditions, the LoRA-based adaptation enables rapid domain transfer without full retraining.

4.8 Woven by Toyota: ML Planner Deployment (Production)

Woven Planet (Toyota) deployed a machine-learned planner as the default in their autonomous vehicles in San Francisco and Palo Alto.

Deployment methodology:

  • Shadow mode validation: The ML planner ran alongside the classical planner, with outputs compared but not actuated
  • Data-driven evaluation: Built a system that scales with planner performance, learning from safety driver disengagements and notes
  • Gradual transition: ML planner became default only after extensive validation against classical baseline

This is the most relevant production precedent for our migration path.


5. Safety Architecture

5.1 Handling Wrong World Model Scores

World models will produce incorrect predictions. The safety architecture must assume this and bound the damage.

Failure modes:

  1. Overconfident false safety: World model scores a dangerous trajectory as low-cost (missed obstacle, wrong prediction)
  2. Overconfident false danger: World model scores a safe trajectory as high-cost (hallucinated obstacle, prediction error)
  3. Out-of-distribution collapse: World model produces nonsensical scores on unseen scenarios

Mode 1 (false safety) is safety-critical. Mode 2 causes conservatism. Mode 3 is detectable.

5.2 Bounded Cost Functions

All world model cost functions must be bounded to prevent them from overriding classical safety checks:

C_wm_bounded = clamp(C_wm_raw, 0, C_max)

Hard safety invariants that the world model cannot override:

  • Minimum following distance (physics-based, not learned)
  • Maximum speed limits
  • Drivable area boundaries
  • Emergency stopping distance at current velocity

These are implemented as hard constraints in the classical planner, evaluated after world model scoring. Any trajectory violating hard constraints is discarded regardless of its world model score.
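A sketch of the invariant check (the predicate signatures are hypothetical; a real implementation would query the map and vehicle model):

```python
def violates_hard_invariants(traj, v_max, is_drivable, stopping_distance):
    """Hard constraints the world model can never override.

    traj: iterable of (x, y, v, gap) per timestep; gap is the distance to the
    nearest obstacle ahead. is_drivable and stopping_distance are callbacks.
    """
    for x, y, v, gap in traj:
        if v > v_max:                        # speed limit (not learned)
            return True
        if not is_drivable(x, y):            # drivable-area boundary
            return True
        if gap < stopping_distance(v):       # emergency stopping distance
            return True
    return False
```

Any candidate for which this returns True is discarded after world model scoring, regardless of its learned cost.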

5.3 Fallback to Traditional Scoring

Watchdog mechanisms for world model reliability:

python
import torch

class WorldModelScorer:
    def __init__(self):
        self.inference_timeout_ms = 30     # max time for WM inference
        self.min_score_std = 0.95          # below this, predictions have collapsed
        self.ood_detector = OODDetector()  # out-of-distribution input detector

    def score_with_fallback(self, candidates, scene_state):
        # Check whether the scene is in-distribution for the world model
        if self.ood_detector.is_ood(scene_state):
            return None  # signal: use classical costs only

        try:
            costs = self.run_inference_with_timeout(candidates, scene_state)
        except TimeoutError:
            return None  # inference too slow, use classical

        # Check for NaN/Inf before computing statistics on the scores
        if not torch.isfinite(costs).all():
            return None  # numerical instability, use classical

        # Check for collapsed predictions (all scores nearly identical)
        if costs.std() < self.min_score_std:
            return None  # model not discriminating, use classical

        return costs

OOD detection approaches:

  • Embedding distance: Compare current scene embedding to training distribution. If Mahalanobis distance exceeds threshold, flag as OOD
  • Ensemble disagreement: Run multiple world model heads; if their cost predictions disagree significantly, reduce world model weight
  • Reconstruction error: If a VAE-based world model cannot reconstruct the current scene well, its predictions are unreliable
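The embedding-distance approach is the simplest of the three; a sketch fitting a Mahalanobis detector on training-scene embeddings (class name and threshold default are illustrative):

```python
import numpy as np

class MahalanobisOOD:
    """Flag scenes whose embedding is far from the training distribution."""

    def __init__(self, train_embeddings, threshold=5.0):
        self.mu = train_embeddings.mean(axis=0)
        cov = np.cov(train_embeddings, rowvar=False)
        # Small ridge term keeps the inverse well-conditioned
        self.cov_inv = np.linalg.inv(cov + 1e-6 * np.eye(cov.shape[0]))
        self.threshold = threshold

    def is_ood(self, embedding):
        delta = embedding - self.mu
        return float(np.sqrt(delta @ self.cov_inv @ delta)) > self.threshold
```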

5.4 Confidence-Weighted Blending

Rather than binary fallback, use continuous confidence weighting:

C_total = C_classical + confidence * beta * C_world_model

Where confidence is derived from:

  • World model prediction entropy (lower entropy = higher confidence)
  • OOD score (in-distribution = higher confidence)
  • Ensemble agreement (high agreement = higher confidence)
  • Historical accuracy (running average of prediction quality)

This approach, inspired by HyPlan's confidence-calibrated blending, ensures graceful degradation rather than abrupt mode switches.
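A sketch of the blend; the multiplicative combination of the factors is a simple heuristic stand-in, and the real calibration would come from validation data:

```python
import numpy as np

def blended_cost(c_classical, c_wm, entropy, ood_score, agreement, beta=1.0):
    """Confidence-weighted blending: each factor is assumed scaled to [0, 1].

    entropy:   normalized prediction entropy (0 = certain).
    ood_score: 0 in-distribution, 1 far out-of-distribution.
    agreement: ensemble agreement (1 = all heads agree).
    """
    confidence = (1.0 - entropy) * (1.0 - ood_score) * agreement
    confidence = float(np.clip(confidence, 0.0, 1.0))
    return c_classical + confidence * beta * c_wm
```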

5.5 Safety Envelope Verification

Post-scoring verification ensures the selected trajectory satisfies safety constraints:

1. Select trajectory with lowest C_total
2. Verify: is trajectory within drivable area at all timesteps?
3. Verify: does trajectory maintain minimum clearance from static obstacles?
4. Verify: does trajectory satisfy kinematic constraints?
5. Verify: does trajectory have non-negative TTC at all timesteps?
6. If any verification fails: fall back to next-best trajectory
7. If no trajectory passes: execute emergency stop profile

The emergency stop profile is a pre-computed, kinematically-feasible braking trajectory that is always available as the ultimate fallback. This is the classical safety backbone that makes the hybrid architecture certifiable.
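The verification cascade reduces to walking the ranked candidates until one passes every check (the predicates are placeholders for the map and kinematics queries listed above):

```python
def select_with_verification(ranked_trajs, checks, emergency_stop):
    """First candidate passing all safety checks wins; otherwise fall back
    to the pre-computed emergency stop profile.

    ranked_trajs: candidates sorted by C_total ascending.
    checks:       predicates for drivable area, clearance, kinematics, TTC.
    """
    for traj in ranked_trajs:
        if all(check(traj) for check in checks):
            return traj
    return emergency_stop   # always-available classical fallback
```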


6. Migration Path: Augmented Frenet to Fully Learned Planning

6.1 Phase 1: Shadow Mode (Months 1–3)

Objective: Validate world model cost function without affecting vehicle behavior.

┌───────────────────────────────────────────────┐
│                 Planning Node                 │
│                                               │
│  Frenet Planner ──→ Classical Selection ──────┼──→ Control
│       │                                       │
│       └──→ World Model Scoring (logged only)  │
│             - Compare WM selection vs actual  │
│             - Log disagreements               │
│             - Measure WM inference latency    │
└───────────────────────────────────────────────┘

Metrics to track:

  • Agreement rate: How often does the WM's top-ranked trajectory match the classical selection?
  • Counterfactual safety: When WM disagrees, would the WM's choice have been safer?
  • Latency distribution: P50, P95, P99 of WM inference time
  • OOD frequency: How often does the OOD detector trigger?

6.2 Phase 2: Augmented Mode (Months 3–6)

Objective: World model costs influence trajectory selection with conservative weighting.

C_total = C_classical + 0.1 * confidence * C_world_model

Start with beta = 0.1 and increase based on validation metrics. The world model acts as a "tiebreaker" — it only changes the selected trajectory when classical costs are similar between candidates.
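One way to make the tiebreaker behavior explicit, rather than relying on a small beta alone, is to gate the world model term so it applies only to candidates whose classical cost is within a margin of the best. A sketch (the function name and margin value are illustrative):

```python
import numpy as np

def tiebreak_costs(c_classical, c_wm, beta=0.1, confidence=1.0, margin=0.05):
    """Add the WM term only among near-optimal classical candidates."""
    c_classical = np.asarray(c_classical, dtype=float)
    c_wm = np.asarray(c_wm, dtype=float)
    # Candidates within an absolute margin of the classical optimum
    # are considered "tied"; only they receive the learned term.
    near_optimal = c_classical <= c_classical.min() + margin
    total = c_classical.copy()
    total[near_optimal] += beta * confidence * c_wm[near_optimal]
    return total
```

Clearly inferior candidates can then never be promoted by the world model, regardless of how the weight is tuned.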

Validation criteria for weight increase:

  • Zero safety-critical disagreements where WM overrode a safer classical choice
  • WM inference latency P99 < planning cycle budget
  • OOD rate < 5% of planning cycles
  • WM-influenced selections show equal or better smoothness in tracking

6.3 Phase 3: Co-Equal Mode (Months 6–12)

Objective: World model and classical costs contribute equally.

C_total = C_classical + 1.0 * confidence * C_world_model

The world model now significantly influences trajectory selection. This requires:

  • Extensive closed-loop simulation validation (millions of scenarios)
  • Real-world A/B testing with safety driver oversight
  • Formal safety analysis of the hybrid cost function

6.4 Phase 4: WM-Primary Mode (Months 12–18)

Objective: World model is the primary scorer; classical costs serve as regularization.

C_total = 0.3 * C_classical + 1.0 * C_world_model

Classical costs prevent the world model from selecting kinematically absurd trajectories but no longer dominate selection.

6.5 Phase 5: Fully Learned Planning (18+ months)

Objective: Replace Frenet sampling with learned trajectory generation.

Options:

  • DiffusionDrive-style: Diffusion model generates candidates from anchored distributions, scored by the world model
  • End-to-end: Single model from perception to trajectory, world model used for training (comma.ai approach)
  • Hybrid generation: Frenet candidates mixed with learned candidates, all scored by unified cost function

Classical planner remains as safety fallback even in the fully-learned phase. It runs in parallel, and if the learned planner's output violates safety constraints, the classical trajectory is substituted.

6.6 Validation Framework

Offline validation (continuous):

  • nuPlan-style replay on recorded scenarios
  • Closed-loop simulation in CARLA/SUMO with airside scenarios
  • Adversarial scenario generation targeting WM failure modes
  • Regression testing: any model update must not degrade on existing scenario suite

Online validation (deployment):

  • A/B testing with matched vehicle pairs (one augmented, one classical)
  • Safety driver intervention rate as primary metric
  • Passenger comfort surveys (for transport vehicles)
  • Scenario-specific tracking: roundabout performance, tight-space navigation, aircraft proximity operations

Metrics hierarchy:

  1. Safety: Collision rate, near-miss rate, intervention rate
  2. Compliance: Rule violations, drivable area departures, speed limit adherence
  3. Efficiency: Route completion time, average speed, unnecessary stops
  4. Comfort: Jerk distribution, lateral acceleration, stop smoothness

7. ROS2 Integration Architecture

7.1 Node Architecture

┌─────────────────────────────────────────────────────────────────┐
│                        ROS2 System Architecture                  │
│                                                                  │
│  ┌──────────┐    ┌──────────────┐    ┌───────────────────────┐  │
│  │Perception│───►│ BEV Encoder  │───►│   World Model Node    │  │
│  │  Node    │    │    Node      │    │  (GPU inference)      │  │
│  └──────────┘    └──────────────┘    │                       │  │
│                         │            │  - Maintains scene    │  │
│                         │            │    state buffer       │  │
│                         │            │  - Runs occupancy     │  │
│                         ▼            │    prediction         │  │
│                  ┌──────────────┐    │  - Provides scoring   │  │
│                  │   Planner    │◄──►│    service            │  │
│                  │    Node      │    └───────────────────────┘  │
│                  │              │                                │
│                  │  - Frenet    │    ┌───────────────────────┐  │
│                  │    sampling  │    │   Safety Monitor      │  │
│                  │  - Classical │───►│      Node             │  │
│                  │    + WM cost │    │                       │  │
│                  │  - Selection │    │  - Constraint check   │  │
│                  └──────┬───────┘    │  - Fallback trigger   │  │
│                         │            │  - Emergency stop     │  │
│                         ▼            └───────────────────────┘  │
│                  ┌──────────────┐                                │
│                  │  Controller  │                                │
│                  │    Node      │                                │
│                  │  (Stanley/   │                                │
│                  │   MPC)       │                                │
│                  └──────────────┘                                │
└─────────────────────────────────────────────────────────────────┘

7.2 Communication Patterns

Option A: Service-based scoring (synchronous)

python
# Planner node calls the world model as a service
import numpy as np
import rclpy
from rclpy.node import Node
from world_model_msgs.srv import ScoreTrajectories

class PlannerNode(Node):
    def __init__(self):
        super().__init__('planner_node')
        self.wm_client = self.create_client(
            ScoreTrajectories, '/world_model/score')
        self.planning_timer = self.create_timer(0.05, self.planning_callback)  # 20 Hz

    def planning_callback(self):
        candidates = self.generate_frenet_candidates()
        classical_costs = np.asarray(self.compute_classical_costs(candidates))

        # Pre-filter: keep the 50 candidates with the lowest classical cost,
        # and keep their costs aligned with the filtered set
        keep = np.argsort(classical_costs)[:50]
        top_candidates = [candidates[i] for i in keep]
        top_classical = classical_costs[keep]

        # Call world model service with a hard timeout
        request = ScoreTrajectories.Request()
        request.trajectories = top_candidates
        request.scene_stamp = self.latest_scene_stamp

        # Note: spinning inside a timer callback deadlocks a single-threaded
        # executor; this pattern requires a MultiThreadedExecutor with the
        # timer in its own callback group.
        future = self.wm_client.call_async(request)
        rclpy.spin_until_future_complete(self, future, timeout_sec=0.03)

        if future.done() and future.result() is not None:
            wm_costs = future.result().costs
            total_costs = self.combine_costs(top_classical, wm_costs)
        else:
            # Fallback: use classical costs only
            total_costs = top_classical

        best_trajectory = top_candidates[int(np.argmin(total_costs))]
        self.publish_trajectory(best_trajectory)

Option B: Topic-based scoring (asynchronous, recommended)

python
import numpy as np
from rclpy.node import Node
from world_model_msgs.msg import TrajectoryBundle, CostArray  # custom msgs

class PlannerNode(Node):
    def __init__(self):
        super().__init__('planner_node')
        # Publish candidates for WM scoring
        self.candidates_pub = self.create_publisher(
            TrajectoryBundle, '/planner/candidates', 10)

        # Subscribe to WM scores (they arrive during the next cycle)
        self.wm_scores_sub = self.create_subscription(
            CostArray, '/world_model/scores', self.wm_scores_callback, 10)

        self.latest_wm_scores = None  # Scores from previous cycle
        self.planning_timer = self.create_timer(0.05, self.planning_callback)

    def wm_scores_callback(self, msg):
        self.latest_wm_scores = msg

    def planning_callback(self):
        candidates = self.generate_frenet_candidates()
        classical_costs = self.compute_classical_costs(candidates)

        # Use WM scores from the PREVIOUS cycle (1-cycle latency, never blocks)
        if self.latest_wm_scores is not None:
            total_costs = self.combine_with_stale_wm(
                classical_costs, self.latest_wm_scores, staleness_penalty=0.1)
        else:
            total_costs = classical_costs

        best = candidates[int(np.argmin(total_costs))]
        self.publish_trajectory(best)

        # Publish current top candidates for next cycle's WM scoring
        top_candidates = self.pre_filter(candidates, classical_costs, top_k=50)
        self.candidates_pub.publish(top_candidates)

The asynchronous pattern never blocks the planner on world model inference. WM scores arrive one cycle stale, but at 20 Hz (50 ms of staleness) this is acceptable, as the scene changes minimally between cycles.
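A possible body for the combine_with_stale_wm helper referenced in Option B (a sketch under the assumption that the stale score message carries each scored candidate's terminal (s, d) state, so stale scores can be re-associated with the current, different candidate set by nearest terminal state):

```python
import numpy as np

def combine_with_stale_wm(classical_costs, cand_terminals,
                          scored_terminals, scored_costs,
                          beta=0.5, staleness_penalty=0.1):
    """Blend one-cycle-stale WM scores into current classical costs.
    cand_terminals: (N, 2) terminal (s, d) of current candidates.
    scored_terminals/scored_costs: (K, 2) / (K,) from the previous cycle."""
    totals = np.asarray(classical_costs, dtype=float).copy()
    scored_terminals = np.asarray(scored_terminals, dtype=float)
    scored_costs = np.asarray(scored_costs, dtype=float)
    weight = beta * (1.0 - staleness_penalty)  # discount for staleness
    for i, term in enumerate(np.asarray(cand_terminals, dtype=float)):
        # Match this candidate to the nearest previously scored candidate.
        d2 = np.sum((scored_terminals - term) ** 2, axis=1)
        totals[i] += weight * scored_costs[np.argmin(d2)]
    return totals
```

The nearest-terminal matching is the simplest re-association scheme; stable candidate IDs across cycles would avoid the matching step entirely.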

7.3 Latency Management

Latency budget breakdown (20 Hz planning, 50 ms cycle):

| Stage | Budget | Notes |
| --- | --- | --- |
| BEV encoding | 10–15 ms | Shared with perception |
| Frenet sampling + classical cost | 2–5 ms | C++, 400+ candidates |
| Pre-filtering | < 1 ms | Sort + threshold |
| World model inference | 10–20 ms | GPU, 50–100 candidates |
| Cost combination + selection | < 1 ms | CPU |
| Safety verification | < 1 ms | CPU |
| Total | 25–43 ms | Within 50 ms budget |

Latency mitigation strategies:

  1. Pipelined execution. While the current cycle's trajectory is being tracked, the next cycle's WM inference is already running. The planner publishes candidates at t=0, receives scores at t=50ms, uses them for t=50ms selection.

  2. QoS configuration. Use BEST_EFFORT reliability for WM score topics (dropped messages are acceptable; classical costs provide fallback). Use RELIABLE for trajectory output to controller.

  3. Dedicated executor threading. Assign WM inference callbacks to a separate MultiThreadedExecutor with MutuallyExclusiveCallbackGroup to prevent blocking planner callbacks.

  4. CUDA stream prioritization. Use high-priority CUDA streams for the WM inference node, ensuring it preempts lower-priority GPU tasks (visualization, logging).

7.4 Handling Slow or Crashed World Model Node

Lifecycle node management. The world model node implements ROS2 Managed (Lifecycle) Node states:

Unconfigured → Inactive → Active
Active → Deactivating → Inactive           (graceful deactivation)
Active → ErrorProcessing → Unconfigured    (error recovery)

Heartbeat-based watchdog:

python
# In the Safety Monitor Node
import time

class WMWatchdog:
    def __init__(self, node):
        self.node = node  # owning rclpy node, used for logging
        self.last_heartbeat = time.time()
        self.heartbeat_timeout = 0.2  # 200 ms (4 missed planning cycles)
        self.wm_healthy = True

    def heartbeat_callback(self, msg):
        self.last_heartbeat = time.time()
        self.wm_healthy = True

    def check_health(self):
        if time.time() - self.last_heartbeat > self.heartbeat_timeout:
            self.wm_healthy = False
            self.node.get_logger().warn("World Model node unresponsive")
            # Trigger lifecycle transition to Inactive;
            # the planner then automatically uses classical-only costs
Using the ros-safety/software_watchdogs package, the WM node publishes periodic heartbeats. A SimpleWatchdog grants a lease on each heartbeat; if a heartbeat misses the lease period, the watchdog transitions the node to the Inactive state.

Recovery strategy:

  1. Timeout (< 200 ms): Planner uses classical costs only for missed cycles. No user-visible impact.
  2. Sustained failure (> 1 s): Safety monitor triggers lifecycle deactivation of WM node. Logs diagnostic data.
  3. Crash recovery: systemd/launch file restarts the WM node. On restart, node enters Unconfigured state, loads TensorRT engine, warms up with dummy inference, then transitions to Active.
  4. Repeated crashes: Safety monitor disables WM node for the remainder of the mission. Vehicle operates on classical planner only.
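The four-step escalation above can be captured in a small policy object that the safety monitor queries each cycle. A sketch (class and method names are illustrative; the thresholds mirror the text):

```python
class WMRecoveryPolicy:
    """Map seconds of heartbeat silence to a planner-side action."""
    def __init__(self, timeout=0.2, sustained=1.0, max_restarts=3):
        self.timeout = timeout        # step 1 threshold (200 ms)
        self.sustained = sustained    # step 2 threshold (1 s)
        self.max_restarts = max_restarts
        self.restarts = 0
        self.disabled = False

    def action(self, silence):
        if self.disabled:
            return "classical_only"   # step 4: WM off for the mission
        if silence < self.timeout:
            return "use_wm"           # healthy
        if silence < self.sustained:
            return "classical_only"   # step 1: ride out short gaps
        if self.restarts < self.max_restarts:
            self.restarts += 1
            return "restart_wm"       # steps 2-3: deactivate and restart
        self.disabled = True          # step 4: repeated crashes
        return "classical_only"
```

Keeping the escalation in one queryable object makes the fallback behavior testable in isolation from ROS2.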

GPU error handling:

  • CUDA OOM: Reduce batch size or disable WM until GPU memory is freed
  • TensorRT engine corruption: Reload engine from disk
  • GPU hang: Watchdog timer triggers cudaDeviceReset() and engine reload

7.5 Composition and Zero-Copy IPC

For production deployments, using ROS2 node composition with intra-process communication eliminates serialization overhead:

xml
<!-- launch file composing planner and WM in single process -->
<node_container pkg="rclcpp_components" exec="component_container_mt">
  <composable_node pkg="frenet_planner" plugin="FrenetPlannerNode"/>
  <composable_node pkg="world_model_scorer" plugin="WorldModelScorerNode"/>
</node_container>

With composition, message passing between the planner and WM scorer uses shared pointers: zero copy, zero serialization. Published ROS2 benchmarks report roughly an order-of-magnitude reduction in CPU usage and more stable, lower latency compared to inter-process communication.

For the GPU inference specifically, the composed node shares the CUDA context, enabling direct tensor exchange without CUDA IPC overhead.


8. Airside-Specific Considerations

8.1 Operational Domain Differences

Airport airside operations differ from public roads in ways that affect the Frenet planner and world model:

| Factor | Public Road | Airport Airside |
| --- | --- | --- |
| Speed range | 0–130 km/h | 0–25 km/h |
| Actor types | Cars, trucks, pedestrians | Aircraft, GSE, fuel trucks, personnel |
| Actor dynamics | Predictable lane-following | Erratic, priority-based, marshaller-directed |
| Road structure | Lanes, intersections | Taxiways, aprons, service roads, stand areas |
| Traffic rules | Highway code, signals | ATC instructions, marshalling, right-of-way zones |
| Failure consequence | Vehicle damage, injury | Aircraft damage ($millions), fuel hazard, injury |

8.2 World Model Training Data Requirements

The world model must be trained or fine-tuned on airside-specific data:

  • Aircraft pushback trajectories: Highly non-standard motion patterns
  • GSE interaction patterns: Baggage tugs forming trains, fuel trucks with extended hoses, belt loaders with varying reach
  • Pedestrian behavior: Ground crew crossing aprons, marshalling signals, safety zone awareness
  • Temporal patterns: Gate turnaround sequences, departure/arrival rushes

8.3 Frenet Reference Path Considerations

Standard Frenet planning assumes a single lane-like reference path. Airside environments require:

  • Multiple reference paths: Service roads, taxi lanes, stand approach paths
  • Dynamic reference paths: Routes around parked aircraft change as stands are occupied/vacated
  • Reference path switching: Transition between service road and stand approach requires smooth reference path blending

9. Implementation Roadmap

9.1 Quick Start: Minimum Viable Augmentation

Week 1–2: Integrate a pre-trained BEV occupancy model (e.g., from OccWorld or similar) as a ROS2 node. Output: future occupancy grids.

Week 3–4: Implement the occupancy collision cost function. For each Frenet candidate, rasterize ego footprint onto predicted occupancy grids and sum collision probabilities.
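As a sketch of that cost term, assuming an axis-aligned square footprint for simplicity (a production version would rasterize the oriented rectangle at each timestep's heading):

```python
import numpy as np

def occupancy_collision_cost(traj_xy, occupancy, origin, resolution, half_extent):
    """Occupancy collision cost for one Frenet candidate.
    traj_xy: (T, 2) ego positions per predicted timestep, in map frame.
    occupancy: (T, H, W) predicted occupancy probabilities per timestep.
    half_extent: half the ego footprint side length, in grid cells."""
    T, H, W = occupancy.shape
    cost = 0.0
    for t in range(T):
        # Rasterize the (simplified, axis-aligned) footprint at timestep t.
        cx = int((traj_xy[t, 0] - origin[0]) / resolution)
        cy = int((traj_xy[t, 1] - origin[1]) / resolution)
        x0, x1 = max(cx - half_extent, 0), min(cx + half_extent + 1, W)
        y0, y1 = max(cy - half_extent, 0), min(cy + half_extent + 1, H)
        # Sum the occupancy probability mass under the footprint.
        cost += occupancy[t, y0:y1, x0:x1].sum()
    return cost
```

Per-candidate loops like this vectorize naturally over the whole candidate batch once footprints are precomputed, which is what makes 50–100 candidates tractable on GPU.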

Week 5–6: Add shadow mode logging. Run augmented planner in parallel with classical planner. Log disagreements.

Week 7–8: Analyze shadow mode data. Tune beta weight. Identify failure cases. Build initial OOD detector.

9.2 Reference Implementations

| Component | Recommended Starting Point |
| --- | --- |
| Frenet planner | TUM Frenetix (C++/Python, modular cost functions) |
| BEV encoding | OccWorld or SimpleBEV |
| Occupancy prediction | OccWorld flow-based prediction |
| World model | DreamerV3 (Think2Drive-style) for latent world model |
| Trajectory scoring | BEV world model trajectory evaluation (arXiv:2504.01941) |
| ROS2 integration | Autoware planning module architecture |
| Safety monitoring | ros-safety/software_watchdogs |

9.3 Computational Requirements

| Component | GPU Memory | Inference Time | Hardware |
| --- | --- | --- | --- |
| BEV encoder | 2–4 GB | 10–15 ms | NVIDIA Orin / L4 |
| World model (50 candidates) | 2–4 GB | 10–15 ms | Shared GPU |
| World model (256 candidates) | 4–8 GB | 15–20 ms | Shared GPU |
| TensorRT optimized total | 4–6 GB | 8–12 ms | NVIDIA Orin |

For edge deployment on vehicle hardware (NVIDIA Orin), the entire augmented pipeline fits within the 32 GB memory budget and the 50 ms latency budget with room to spare.


10. Key Takeaways

  1. The Frenet planner is the safety backbone. It generates kinematically-feasible, collision-checked candidates using proven polynomial trajectory generation. The world model augments but never replaces this foundation during the transition period.

  2. Batch evaluation makes it practical. Evaluating 50–100 pre-filtered candidates through a world model takes 10–20 ms on modern GPUs. This fits within a 20 Hz planning cycle with margin.

  3. Bounded costs and fallback are non-negotiable. World model scores are clamped, confidence-weighted, and subject to hard safety constraints. If the world model fails, the classical planner operates independently.

  4. The industry is converging on this pattern. Tesla (historical MCTS + neural scoring), NVIDIA GTRS (generate + score), Woven/Toyota (ML planner with classical fallback), and comma.ai (progressive replacement of classical planners) all validate the "classical generation + learned scoring" approach as a practical migration path.

  5. Production deployment requires shadow mode first. Every system that successfully deployed learned planning (Woven, comma.ai) used extensive shadow mode validation with A/B testing before the learned component influenced vehicle behavior.

  6. ROS2 architecture supports this natively. Lifecycle nodes, QoS policies, composition, and watchdog packages provide the infrastructure for safe integration of a potentially-failing GPU inference node into a safety-critical planning pipeline.


References

Foundational Papers

  • Werling, M., Ziegler, J., Kammel, S., Thrun, S. "Optimal Trajectory Generation for Dynamic Street Scenarios in a Frenet Frame." ICRA 2010.
  • Werling, M., Kammel, S., Ziegler, J., Groll, L. "Optimal trajectories for time-critical street scenarios using discretized terminal manifolds." IJRR 2012.

World Model Planning

  • Li, Q. et al. "Think2Drive: Efficient Reinforcement Learning by Thinking in Latent World Model for Autonomous Driving." ECCV 2024.
  • Xue, Z. et al. "WorldRFT: Latent World Model Planning with Reinforcement Fine-Tuning for Autonomous Driving." AAAI 2026.
  • Liao, B. et al. "DiffusionDrive: Truncated Diffusion Model for End-to-End Autonomous Driving." CVPR 2025 Highlight.
  • Zheng, Z. et al. "AdaWM: Adaptive World Model based Planning for Autonomous Driving." ICLR 2025.
  • Li, Z. et al. "Generalized Trajectory Scoring for End-to-end Multimodal Planning." NAVSIM v2 Challenge Winner, 2025.
  • Wang, Z. et al. "End-to-End Driving with Online Trajectory Evaluation via BEV World Model." 2025.

Production Systems

  • comma.ai. "Learning to Drive from a World Model." 2025. https://blog.comma.ai/mlsim
  • Woven by Toyota. "Deploying a Machine-Learned Planner for Autonomous Vehicles in San Francisco." 2023.
  • NVIDIA. "Building Autonomous Vehicles That Reason with NVIDIA Alpamayo." 2026.

Hybrid Planning

  • Navarro, I. et al. "Hybrid Imitation-Learning Motion Planner for Urban Driving." 2024.
  • Bey, H. et al. "HyPlan: Hybrid Learning-Assisted Planning Under Uncertainty for Safe Autonomous Driving." 2025.

Path Tracking

  • Hoffmann, G. et al. "Autonomous automobile trajectory tracking for off-road driving." (Stanley controller)
  • Coulter, R.C. "Implementation of the Pure Pursuit Path Tracking Algorithm." CMU-RI-TR-92-01.

Safety and Validation

  • Caesar, H. et al. "nuPlan: A closed-loop ML-based planning benchmark for autonomous vehicles." 2021.
  • NIO / NVIDIA. "Designing an Optimal AI Inference Pipeline for Autonomous Driving." NVIDIA Developer Blog.
