
Time Sync Health Monitoring and Degraded Mode

Last updated: 2026-05-09

Why It Matters

Timing faults are system faults. A vehicle can keep publishing LiDAR, camera, radar, IMU, and control messages while their timestamps are no longer mutually valid. Without a typed health model, the autonomy stack may continue fusing fresh-looking data into an inconsistent world model.

Time sync diagnostics should turn raw clock data into operational decisions: mission allowed, continue with restrictions, reject a sensor, stop map updates, or safe stop.

Deployment Contract

| Contract item | Practical rule |
| --- | --- |
| Health state | Publish a vehicle time-sync health state, not only raw ptp4l logs. |
| Clock graph | Monitor every edge: GM to switch, switch to PHC, PHC to system clock, GM to sensor, PPS to timing source. |
| Thresholds | Define offset, jitter, path-delay, last-sync-age, holdover, and packet-loss thresholds per autonomy mode. |
| Provenance | Every fused topic records timestamp source and sync state. |
| Degraded policy | Fusion, localization, mapping, planning, and recorder behavior change when timing confidence degrades. |
| Fault memory | Store DTC/freeze-frame evidence for timing faults, including GM identity and affected sensors. |
| Recovery | Require stable timing for a configured dwell time before returning to full mission capability. |
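The threshold rule in the contract can be sketched as a per-mode limits table plus a check that reports which limits a timing sample breaks. This is a minimal sketch, not a reference implementation: the mode names, the numeric limits, and the helper `violated_limits` are all hypothetical placeholders, and real values would come from the timing budget in the safety case.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class TimingLimits:
    """Upper bounds for one autonomy mode (all values are placeholders)."""
    offset_ns: float
    jitter_ns: float
    path_delay_ns: float
    last_sync_age_s: float
    holdover_s: float
    packet_loss_ratio: float


# Hypothetical modes and numbers, for illustration only.
LIMITS_BY_MODE = {
    "full_autonomy": TimingLimits(1_000, 500, 10_000, 0.5, 30.0, 0.01),
    "depot_movement": TimingLimits(100_000, 50_000, 50_000, 2.0, 300.0, 0.05),
}


def violated_limits(sample: dict, limits: TimingLimits) -> list:
    """Return the names of limits that the sample exceeds.

    A signal that is missing from the sample counts as a violation, so a
    stale or silent monitor can never look healthy.
    """
    violated = []
    for f in fields(limits):
        value = sample.get(f.name)
        if value is None or abs(value) > getattr(limits, f.name):
            violated.append(f.name)
    return violated


sample = {
    "offset_ns": -1_500,
    "jitter_ns": 200,
    "path_delay_ns": 5_000,
    "last_sync_age_s": 0.1,
    "holdover_s": 0.0,
    "packet_loss_ratio": 0.0,
}
print(violated_limits(sample, LIMITS_BY_MODE["full_autonomy"]))
print(violated_limits(sample, LIMITS_BY_MODE["depot_movement"]))
```

The same sample breaks the offset limit in the stricter mode and passes in the looser one, which is the point of defining thresholds per autonomy mode rather than globally.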

Health-State Model

| State | Meaning | Mission policy |
| --- | --- | --- |
| OK | Required clocks locked, offsets inside budget, root traceable. | Full mission allowed. |
| DEGRADED | Timing is bounded but outside nominal limits or root is in bounded holdover. | Continue with ODD/speed limits and inflated timestamp uncertainty. |
| LIMITED | Some sensors have invalid timing but a fallback perception/localization set remains. | Complete safe maneuver or depot movement only. |
| FAILED_SAFE | Timebase is invalid for required fusion or control evidence. | Safe stop or mission inhibit. |
| UNKNOWN | Monitor stale, missing, or rebooting. | Treat as degraded or failed according to safety allocation. |
| SERVICE_REQUIRED | Fault cleared but evidence requires inspection, config fix, or recalibration. | Mission allowed only if release gate permits. |
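One way to make the health states typed rather than free-form strings is an enum with an explicit policy lookup. The sketch below assumes hypothetical policy labels and a hypothetical `mission_policy` helper. The two conditional rows of the table (UNKNOWN and SERVICE_REQUIRED) become explicit parameters, so the safety allocation and the release gate are decided by the caller, not hidden in the monitor.

```python
from enum import Enum


class TimeSyncHealth(Enum):
    OK = "OK"
    DEGRADED = "DEGRADED"
    LIMITED = "LIMITED"
    FAILED_SAFE = "FAILED_SAFE"
    UNKNOWN = "UNKNOWN"
    SERVICE_REQUIRED = "SERVICE_REQUIRED"


# Policy labels are illustrative names, not taken from any real stack.
_POLICY = {
    TimeSyncHealth.OK: "full_mission",
    TimeSyncHealth.DEGRADED: "restricted_odd_and_speed",
    TimeSyncHealth.LIMITED: "safe_maneuver_or_depot_only",
    TimeSyncHealth.FAILED_SAFE: "safe_stop_or_mission_inhibit",
}


def mission_policy(
    state: TimeSyncHealth,
    unknown_as: TimeSyncHealth = TimeSyncHealth.FAILED_SAFE,
    release_gate_permits: bool = False,
) -> str:
    """Map a health state to a mission policy label."""
    if state is TimeSyncHealth.UNKNOWN:
        # UNKNOWN is resolved to DEGRADED or FAILED_SAFE by safety allocation.
        if unknown_as not in (TimeSyncHealth.DEGRADED, TimeSyncHealth.FAILED_SAFE):
            raise ValueError("UNKNOWN may only map to DEGRADED or FAILED_SAFE")
        state = unknown_as
    if state is TimeSyncHealth.SERVICE_REQUIRED:
        # Mission is allowed only when the release gate explicitly permits it.
        if release_gate_permits:
            return "mission_allowed_by_release_gate"
        return "mission_inhibit"
    return _POLICY[state]


print(mission_policy(TimeSyncHealth.DEGRADED))
print(mission_policy(TimeSyncHealth.UNKNOWN))
```

Defaulting `unknown_as` to FAILED_SAFE is a conservative assumption made for this sketch; a deployment could choose DEGRADED if its safety allocation supports that.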

Signals to Monitor

| Layer | Signals |
| --- | --- |
| Grandmaster | Identity, domain, profile, clock class, time source, GNSS/PPS validity, UTC offset, holdover state. |
| PTP/gPTP network | Port state, selected master, offset from master, path delay, announce/sync timeout, packet loss, BMCA changes. |
| Linux PHC/system | PHC offset, system offset, servo state, frequency adjustment, PHC-to-interface mapping, process ownership. |
| Sensors | PTP lock, timestamp mode, last sync age, frame/packet counters, local clock drift, trigger counters. |
| Fusion topics | Acquisition time, receive time, timestamp source, conversion version, per-topic age and monotonicity. |
| Recorder | Raw timing metadata present, dropped metadata count, replay conversion status. |
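The fusion-topic signals lend themselves to a small per-message check. The following is a sketch under stated assumptions: `StampedMessage` and `check_message` are hypothetical names, and "age" is taken here to mean receive time minus acquisition time, which is only one possible definition.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StampedMessage:
    topic: str
    acquisition_time_ns: int  # when the sensor measured
    receive_time_ns: int      # when the host received the message
    timestamp_source: str     # e.g. "ptp", "sensor_free_run", "host_receive"


def check_message(
    msg: StampedMessage,
    previous_acquisition_ns: Optional[int],
    max_age_ns: int,
) -> list:
    """Return a list of timing problems found for one message."""
    problems = []
    # Monotonicity: acquisition time must strictly increase per topic.
    if (previous_acquisition_ns is not None
            and msg.acquisition_time_ns <= previous_acquisition_ns):
        problems.append("non_monotonic")
    # Age: assumed here to be receive time minus acquisition time.
    age_ns = msg.receive_time_ns - msg.acquisition_time_ns
    if age_ns < 0:
        # Acquired "in the future" relative to receipt: the clocks disagree.
        problems.append("acquired_after_receive")
    elif age_ns > max_age_ns:
        problems.append("too_old")
    return problems


good = StampedMessage("lidar_front", 1_000_000, 1_200_000, "ptp")
stepped = StampedMessage("lidar_front", 900_000, 1_300_000, "ptp")
print(check_message(good, None, max_age_ns=500_000))
print(check_message(stepped, good.acquisition_time_ns, max_age_ns=500_000))
```

A negative age is reported separately from an excessive one because it usually points to a clock step or a wrong timestamp source rather than transport latency.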

Useful commands and probes:

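```bash
# Run the PTP daemon on eth0 with the vehicle gPTP config; -m prints log messages to stdout.
ptp4l -m -i eth0 -f vehicle-gptp.cfg
# Discipline the system clock from the eth0 PHC; -w waits until ptp4l is synchronized.
phc2sys -m -s eth0 -c CLOCK_REALTIME -w
# Query the local ptp4l over its UNIX domain socket (-u) with boundary hops set to 0 (-b 0).
pmc -u -b 0 'GET CURRENT_DATA_SET'
pmc -u -b 0 'GET PARENT_DATA_SET'
pmc -u -b 0 'GET TIME_STATUS_NP'
# Show the timestamping capabilities and PHC index of the interface.
ethtool -T eth0
```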
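To feed these probes into a health monitor, their text output has to be turned into fields. A minimal sketch of that step follows. The sample text is hand-written to resemble a `pmc` TIME_STATUS_NP response and was not captured from a real system; the exact layout may differ between linuxptp versions, and `parse_pmc_fields` is a hypothetical helper.

```python
def parse_pmc_fields(text: str) -> dict:
    """Collect 'name value' lines from pmc-style output into a dict.

    Lines that do not split into exactly two whitespace-separated tokens
    (such as the 'sending:' line and the response header) are skipped.
    """
    fields = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            fields[parts[0]] = parts[1]
    return fields


# Illustrative text only; assumed to resemble `pmc -u -b 0 'GET TIME_STATUS_NP'`.
SAMPLE_OUTPUT = """\
sending: GET TIME_STATUS_NP
    90e2ba.fffe.20a5bc-0 seq 0 RESPONSE MANAGEMENT TIME_STATUS_NP
        master_offset              -3
        ingress_time               1462266482340964674
        gmPresent                  true
        gmIdentity                 90e2ba.fffe.20a5bc
"""

status = parse_pmc_fields(SAMPLE_OUTPUT)
offset_ns = int(status["master_offset"])
gm_present = status["gmPresent"] == "true"
print(offset_ns, gm_present, status["gmIdentity"])
```

Keeping the values as strings until a consumer converts them avoids guessing types for fields the monitor does not use.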

Degraded-Mode Actions

| Fault | Immediate action | Continued operation |
| --- | --- | --- |
| One camera loses PTP | Stop metric camera fusion from that camera; keep image for human context if clearly marked. | Use remaining cameras/LiDAR/radar if coverage and ODD allow. |
| One LiDAR free-runs | Stop using it for localization and map updates after holdover budget. | Keep as low-confidence obstacle evidence only if safety case permits. |
| Radar timestamp invalid | Drop Doppler updates that require metric alignment. | Use radar presence detections with conservative gating if validated. |
| GNSS GM holdover | Start time uncertainty growth and mission timer. | Continue only inside holdover budget and speed/ODD limits. |
| GM identity changes | Mark timebase transition; gate fusion buffers. | Resume after offset and sensor relock dwell time. |
| PHC hardware timestamp unavailable | Block timing-critical Ethernet fusion on that interface. | Use non-metric monitoring or depot diagnostics only. |
| Unknown timing state | Treat affected source as stale. | Do not let stale-but-recent messages update localization or maps. |
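The holdover rows in the table above imply a time-uncertainty estimate that grows while the root is unlocked and a budget that ends continued operation. A minimal sketch, assuming a simple linear model in which a constant worst-case frequency error of 1 ppb accumulates 1 ns of time error per second; real oscillators also drift with temperature and aging, and the function names and numbers are hypothetical.

```python
def holdover_uncertainty_ns(base_uncertainty_ns, drift_ppb, holdover_elapsed_s):
    """Linear worst-case time uncertainty during holdover.

    1 ppb of frequency error accumulates 1 ns of time error per second.
    """
    return base_uncertainty_ns + abs(drift_ppb) * holdover_elapsed_s


def holdover_decision(elapsed_s, base_uncertainty_ns, drift_ppb,
                      budget_ns, max_holdover_s):
    """Return (action, uncertainty_ns) for the current holdover duration."""
    uncertainty = holdover_uncertainty_ns(base_uncertainty_ns, drift_ppb, elapsed_s)
    if elapsed_s > max_holdover_s or uncertainty > budget_ns:
        # Outside the holdover budget: metric fusion must stop using this clock.
        return "stop_metric_use", uncertainty
    # Inside the budget: continue, but inflate timestamp uncertainty downstream.
    return "continue_with_restrictions", uncertainty


# Placeholder values: 100 ns at loss of lock, 50 ppb drift bound, 5 us budget.
print(holdover_decision(60, 100, 50, budget_ns=5_000, max_holdover_s=300))
print(holdover_decision(120, 100, 50, budget_ns=5_000, max_holdover_s=300))
```

Returning the uncertainty together with the action lets fusion inflate timestamp covariance while the vehicle is still inside the budget.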

DTC and Freeze-Frame Fields

| Field | Purpose |
| --- | --- |
| dtc_id | Stable identifier such as TIME_GM_LOST, TIME_SENSOR_PTP_UNLOCKED, or TIME_PHC_HWTS_DISABLED. |
| affected_clock | GM identity, PHC device, interface, sensor serial, or trigger controller. |
| first_seen_time / last_seen_time | Incident sequencing and intermittent fault triage. |
| max_offset_ns / rms_offset_ns | Severity and correlation with perception residuals. |
| path_delay_ns | Network asymmetry or switch/path change evidence. |
| clock_class / time_source | Root quality and holdover state. |
| sync_state_before_after | Recovery and time-step analysis. |
| software_config_id | PTP profile, domain, firmware, and timing calibration version. |
| mission_effect | No effect, degraded fusion, ODD restriction, safe stop, or mission inhibit. |
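The fields above map naturally onto a record type that can be serialized for fault memory. The sketch below uses the field names from the table; the types, the example values, and the `summarize_offsets` helper are assumptions made for illustration.

```python
import json
import math
from dataclasses import asdict, dataclass


@dataclass
class TimingFreezeFrame:
    dtc_id: str
    affected_clock: str
    first_seen_time: float
    last_seen_time: float
    max_offset_ns: float
    rms_offset_ns: float
    path_delay_ns: float
    clock_class: int
    time_source: str
    sync_state_before_after: list
    software_config_id: str
    mission_effect: str

    def to_json(self) -> str:
        # Sorted keys keep stored records stable across software versions.
        return json.dumps(asdict(self), sort_keys=True)


def summarize_offsets(offsets_ns):
    """Return (max absolute offset, RMS offset) over the fault window."""
    max_abs = max(abs(x) for x in offsets_ns)
    rms = math.sqrt(sum(x * x for x in offsets_ns) / len(offsets_ns))
    return max_abs, rms


max_off, rms_off = summarize_offsets([-5.0, 5.0])
frame = TimingFreezeFrame(
    dtc_id="TIME_GM_LOST",
    affected_clock="90e2ba.fffe.20a5bc",  # illustrative GM identity
    first_seen_time=1000.0,
    last_seen_time=1012.5,
    max_offset_ns=max_off,
    rms_offset_ns=rms_off,
    path_delay_ns=850.0,
    clock_class=7,                        # illustrative value
    time_source="GNSS",
    sync_state_before_after=["OK", "DEGRADED"],
    software_config_id="example-config-id",
    mission_effect="ODD restriction",
)
print(frame.to_json())
```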

Validation Hooks

  • Fault injection: kill ptp4l, stop phc2sys, disable hardware timestamping, unplug GNSS antenna, block PTP multicast, and force a sensor into free-run.
  • Threshold tests: inject synthetic offset/jitter into logs or a hardware test bench and confirm state transitions.
  • Recovery tests: restore time source and verify dwell time before returning to OK.
  • Replay tests: confirm degraded-mode decisions can be reproduced from logged timing metadata alone.
  • Cross-modal residual tests: correlate timing health events with LiDAR map residuals, camera reprojection errors, radar static-target velocity, and IMU preintegration residuals.
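The recovery test in the list above presupposes a dwell-time gate: timing must stay inside budget continuously for a configured period before the monitor reports recovery, and any relapse restarts the timer. A minimal sketch of such a gate, with a scripted sequence standing in for an injected fault; the class name and the 5-second dwell are hypothetical.

```python
class RecoveryGate:
    """Report recovery only after timing has been stable for dwell_s seconds."""

    def __init__(self, dwell_s: float):
        self.dwell_s = dwell_s
        self._stable_since = None  # time at which the current stable run began
        self.recovered = False

    def update(self, now_s: float, timing_in_budget: bool) -> bool:
        if not timing_in_budget:
            # Any excursion cancels recovery and restarts the dwell timer.
            self._stable_since = None
            self.recovered = False
        else:
            if self._stable_since is None:
                self._stable_since = now_s
            if now_s - self._stable_since >= self.dwell_s:
                self.recovered = True
        return self.recovered


gate = RecoveryGate(dwell_s=5.0)
# (time in seconds, timing inside budget?) with a relapse injected at t = 4.
script = [(0, True), (3, True), (4, False), (5, True), (9, True), (10, True)]
results = [gate.update(t, ok) for t, ok in script]
print(results)
```

In the scripted run the stable period restarts at t = 5 after the relapse, so recovery is reported only at t = 10, which is the behavior a recovery test would assert.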

Sources

Public research notes collected from public sources.