Replay Time Semantics and TF Message Filter Validation

Last updated: 2026-05-09

Purpose

This protocol validates that offline replay, regression tests, and incident reconstruction preserve the same timing semantics that the runtime stack depends on. It focuses on ROS 2 time, `/clock`, rosbag2/MCAP record and playback metadata, TF cache behavior, `tf2_ros::MessageFilter`, and `message_filters` synchronizers. A replay result is not valid release evidence unless message stamps, publish/receive ordering, transform availability, and simulated time behavior are understood and tested.

This protocol supports four related activities: the timestamp shift sweep; the sensor dropout, latency, and jitter stress tests; SLAM integrity checks under timing errors; and the map-localization release gates for timing health.

Time Domains

Domain	Examples	Validation concern
Sensor source time	LiDAR packet/frame stamp, camera exposure time, GNSS/IMU time	Physical measurement time used by fusion
ROS message stamp	`std_msgs/Header.stamp` in sensor and derived messages	Timestamp used by TF, filters, localization, and replay metrics
Record time	Time at which the recorder received or wrote a message	May differ from measurement time under latency or burst queues
Publish time	Time at which a message is republished during replay	Determines consumer callback timing
ROS simulated time	`/clock` time when `use_sim_time` is enabled	Drives timers, TF buffer, and time-aware nodes
Wall/system time	Host system clock when `use_sim_time` is disabled	Can mask replay defects if mixed with ROS time
MCAP log/publish time	MCAP message timing fields and profile conventions	Required for cross-tool replay and indexing
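
The mixed-domain concern in the table above can be illustrated with a small check that compares a node's notion of "now" against the `/clock` trace and the host clock. This is a minimal sketch, not a ROS API: `NowSample`, `classify_now_source`, and the 50 ms tolerance are hypothetical names and values chosen for illustration, and all times are assumed to be integer nanoseconds already extracted from instrumentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NowSample:
    """One observation of a node's notion of "now" (hypothetical record)."""
    node_now_ns: int   # value the node's clock returned inside a callback
    sim_clock_ns: int  # latest /clock value observed at that moment
    wall_ns: int       # host system time at that moment


def classify_now_source(s: NowSample, tol_ns: int = 50_000_000) -> str:
    """Guess which time domain the node is reading from.

    The tolerance is an assumption; pick it from the /clock publish period.
    """
    near_sim = abs(s.node_now_ns - s.sim_clock_ns) <= tol_ns
    near_wall = abs(s.node_now_ns - s.wall_ns) <= tol_ns
    if near_sim and near_wall:
        # Replay time happens to coincide with wall time; cannot tell.
        return "ambiguous"
    if near_sim:
        return "sim"
    if near_wall:
        return "wall"
    return "unknown"
```

A node that reports "wall" while its declared policy is `use_sim_time: true` is a candidate wall-time leak; "ambiguous" samples should not be counted as evidence either way.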

Required Setup

Item	Requirement
Replay manifest	Bag/MCAP ID, storage format, compression, metadata, topic list, QoS overrides, playback options
Candidate stack	Same build, map, calibration, parameters, and topic remappings used in timing validation
Time policy	Explicit `use_sim_time` value per node and expected `/clock` publisher
TF policy	TF buffer cache duration, `/tf_static` durability, frame tree, transform publication rates
Filter policy	Exact sync type, queue size, slop/window, target frames, and drop callbacks for each filter
Reference outputs	Baseline deterministic output hashes or accepted metric tolerance
Instrumentation	Callback timestamps, filter matches/drops, TF lookup failures, message age, queue depth, and `/clock` trace
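
One way to keep the setup items above checkable is to hold the time, TF, and filter policy in a typed manifest and reject incomplete ones before replay starts. The sketch below is illustrative only: `ReplayManifest`, `FilterPolicy`, `manifest_problems`, and their field names are assumptions, not a rosbag2 or MCAP schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class FilterPolicy:
    name: str
    sync_type: str                      # "exact" or "approximate"
    queue_size: int
    slop_s: Optional[float] = None      # required for approximate sync
    target_frame: Optional[str] = None  # for TF-gated filters


@dataclass
class ReplayManifest:
    bag_id: str
    storage_format: str                  # e.g. "mcap" or "sqlite3"
    playback_options: List[str]
    use_sim_time: Dict[str, bool]        # node name -> use_sim_time value
    clock_publisher: Optional[str]       # expected /clock source, if any
    tf_cache_duration_s: Optional[float]
    filters: List[FilterPolicy] = field(default_factory=list)


def manifest_problems(m: ReplayManifest) -> List[str]:
    """Return human-readable reasons the manifest is incomplete or mixed."""
    problems: List[str] = []
    if not m.use_sim_time:
        problems.append("no per-node use_sim_time declared")
    elif len(set(m.use_sim_time.values())) > 1:
        problems.append("mixed use_sim_time values across nodes")
    if any(m.use_sim_time.values()) and not m.clock_publisher:
        problems.append("simulated time requested but no /clock publisher declared")
    if m.tf_cache_duration_s is None:
        problems.append("TF buffer cache duration not declared")
    for f in m.filters:
        if f.sync_type == "approximate" and f.slop_s is None:
            problems.append(f"filter {f.name}: approximate sync without slop")
    return problems
```

An empty result means the manifest is complete enough to run; any entry is a reason to stop before collecting metrics.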

Validation Cases

ID	Case	Expected behavior
RT-01	Play with `--clock` and all nodes using simulated time	Timers, TF, and filters use replay time; no wall-time leakage
RT-02	Play without `/clock` when nodes require simulated time	Test fails clearly; no release evidence accepted
RT-03	Record with simulated time before valid `/clock` starts	Recorder avoids zero-time ambiguity or run is rejected
RT-04	Pause/resume and rate changes during replay	Nodes do not treat pause as sensor dropout unless policy says so
RT-05	Burst playback after pause	Queues and stale-data filters prevent false current outputs
RT-06	Reverse or backward `/clock` jump in replay	Time-aware components clear, reset, or fail closed deterministically
RT-07	Message order by receive/log time versus source stamp	Metrics identify ordering mode; synchronizers behave as expected
RT-08	Missing or delayed `/tf_static`	Consumers block or fail closed rather than infer a wrong static transform
RT-09	TF future/past extrapolation	Failures are diagnosable by frame pair, requested time, and cache bounds
RT-10	`message_filters` exact/approx sync window boundary	Match/drop behavior is deterministic and recorded
RT-11	`tf2_ros::MessageFilter` target-frame availability	Messages are released only when required transform is available
RT-12	Multi-bag or split-bag replay	Topic continuity, `/clock`, and TF caches are correct across file boundaries
RT-13	MCAP conversion from rosbag2 or other tools	Header stamps, schema, channel metadata, and timing fields remain consistent
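
For RT-10, an offline oracle helps decide which pairs a synchronizer should have produced at the window boundary. The following is a deliberately simplified nearest-stamp pairing, not the algorithm used by `message_filters`; the function name `pair_within_slop` and the inclusive boundary (`<=`) are assumptions that should be confirmed against the filter under test. Integer nanoseconds avoid floating-point ambiguity exactly at the boundary.

```python
from typing import List, Tuple


def pair_within_slop(
    a_ns: List[int], b_ns: List[int], slop_ns: int
) -> Tuple[List[Tuple[int, int]], List[int], List[int]]:
    """Greedily pair each stamp in a_ns with the nearest unused stamp in b_ns.

    Returns (pairs, dropped_a, dropped_b). A pair is accepted when the
    absolute stamp difference is <= slop_ns (inclusive, by assumption).
    Ties are resolved toward the earlier index in b_ns, so the result is
    deterministic for a given input order.
    """
    pairs: List[Tuple[int, int]] = []
    dropped_a: List[int] = []
    used = set()
    for a in a_ns:
        best = None  # (difference, index into b_ns)
        for j, b in enumerate(b_ns):
            if j in used:
                continue
            d = abs(a - b)
            if d <= slop_ns and (best is None or d < best[0]):
                best = (d, j)
        if best is None:
            dropped_a.append(a)
        else:
            used.add(best[1])
            pairs.append((a, b_ns[best[1]]))
    dropped_b = [b for j, b in enumerate(b_ns) if j not in used]
    return pairs, dropped_a, dropped_b
```

Feeding it stamps that sit exactly on, and one nanosecond past, the slop value gives a reference against which recorded match/drop logs can be compared.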

Procedure

1. Freeze the replay manifest, candidate build, map, calibration, parameters, and expected time policy.
2. Run a baseline replay with clean logs and record output hashes, metric summaries, and monitor traces.
3. Run each validation case and capture `/clock`, TF, filter, queue, and callback instrumentation.
4. For filters, log accepted pairs/sets, dropped messages, source stamps, target frame, and reason for drop.
5. For TF, log lookup time, source/target frame, latest/oldest available transform, and extrapolation direction.
6. Compare outputs across repeated runs to confirm deterministic behavior or document approved nondeterminism.
7. Tag replay artifacts with whether they are valid for release evidence, debugging only, or rejected.
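
The repeated-run comparison in the procedure can be sketched as a canonical hash over each run's output records. This is an illustration under assumptions: outputs are taken to be JSON-serializable dictionaries, and `output_hash` and `is_deterministic` are hypothetical helper names. Floating-point outputs that are only expected to agree within a tolerance need a metric comparison instead of a hash.

```python
import hashlib
import json
from typing import Iterable, List, Sequence, Tuple


def output_hash(records: Iterable[dict]) -> str:
    """SHA-256 over records, serialized canonically and in order."""
    h = hashlib.sha256()
    for rec in records:
        # sort_keys makes the hash independent of dict insertion order.
        line = json.dumps(rec, sort_keys=True, separators=(",", ":"))
        h.update(line.encode("utf-8"))
        h.update(b"\n")
    return h.hexdigest()


def is_deterministic(runs: Sequence[Iterable[dict]]) -> Tuple[bool, List[str]]:
    """True when every run hashes identically; also returns the hashes."""
    hashes = [output_hash(r) for r in runs]
    return len(set(hashes)) == 1, hashes
```

Record order is part of the hash on purpose: two runs that emit the same outputs in a different order are reported as different, which is usually what a replay-ordering investigation wants to see.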

Metrics

Metric	Definition	Use
Replay determinism	Output hash or metric delta across repeated identical replays	Required for regression and release claims
Clock consistency	Difference between ROS time, message stamps, record time, and wall time	Detects mixed time domains
Clock jump count	Forward/backward jumps, zero-time intervals, and discontinuities	Identifies invalid or special-case replay
TF lookup failure rate	Past/future extrapolation, no connection, no static transform, cache miss	Reveals frame/time contract problems
Message filter match rate	Matched messages divided by candidates for exact/approx sync	Measures usable fusion input
Filter drop reason rate	Queue overflow, out of slop, no transform, out-of-cache, stale/future	Makes failure modes actionable
Callback age	Processing time minus message source stamp	Detects stale outputs and burst replay artifacts
Publication burst factor	Replay publish interval versus recorded inter-arrival interval	Explains queue and latency artifacts
`/tf_static` readiness time	Time until all required static transforms are available	Prevents first-message failures from hiding bugs
Evidence validity flag	Release evidence, debugging only, or rejected	Prevents invalid replay from entering release packet
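
Several of the metrics above reduce to simple arithmetic on nanosecond timestamps. The sketch below shows one possible reading; the function names are hypothetical, the forward-jump threshold is a parameter the test owner must choose, and the burst factor is defined here as recorded interval divided by replay interval (values above 1 mean replay delivered messages faster than they were recorded), which is an assumption about the direction of the ratio.

```python
from typing import Dict, List, Optional


def clock_jump_counts(clock_ns: List[int], max_forward_step_ns: int) -> Dict[str, int]:
    """Count backward steps, oversized forward steps, and zero-valued samples."""
    counts = {"backward": 0, "forward": 0, "zero": 0}
    for prev, cur in zip(clock_ns, clock_ns[1:]):
        step = cur - prev
        if step < 0:
            counts["backward"] += 1
        elif step > max_forward_step_ns:
            counts["forward"] += 1
    counts["zero"] = sum(1 for t in clock_ns if t == 0)
    return counts


def callback_age_ns(processing_ns: int, stamp_ns: int) -> int:
    """Processing time minus message source stamp, both in the same domain."""
    return processing_ns - stamp_ns


def burst_factor(recorded_interval_ns: int, replay_interval_ns: int) -> float:
    """Recorded inter-arrival interval over replay publish interval."""
    if replay_interval_ns <= 0:
        return float("inf")  # messages delivered back-to-back
    return recorded_interval_ns / replay_interval_ns


def match_rate(matched: int, candidates: int) -> Optional[float]:
    """Matched messages divided by candidates; None when nothing was offered."""
    if candidates == 0:
        return None
    return matched / candidates
```

Callback age is only meaningful when both arguments come from the same time domain; mixing a wall-clock processing time with a replayed header stamp produces the clock-consistency defect this protocol is meant to catch.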

Pass and Block Gates

Gate	Pass condition	Block condition
RT0 time policy	Every replay test declares `use_sim_time`, `/clock`, rate, pause, and ordering mode	Unknown or mixed wall/ROS time policy
RT1 deterministic replay	Repeated replay produces identical outputs or bounded documented tolerance	Nondeterminism affects release metrics
RT2 no wall-time leakage	Time-aware nodes use ROS time under replay when required	Node uses wall time and changes behavior between replay/runtime
RT3 TF fail-closed	Missing/future/past transforms create drops or diagnostics, not wrong-frame output	Consumer extrapolates or substitutes invalid transform silently
RT4 filter observability	Match/drop behavior and reasons are logged for all safety-relevant filters	Fusion input loss cannot be explained
RT5 bag semantics	Header stamps, record/log time, publish time, and MCAP/rosbag metadata are preserved through conversion	Conversion changes time semantics without manifest update
RT6 static transforms	Required static transforms are durable and available before dependent outputs are trusted	Startup messages processed with missing static transform
RT7 invalid runs excluded	Replays with zero time, clock jumps outside test scope, or missing metadata are marked invalid	Invalid replay included in release evidence
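
A gate table like the one above can be evaluated mechanically once each run carries the relevant findings. The sketch below covers a subset of the gates (RT5 and RT6 are omitted for brevity) and uses hypothetical names throughout. The mapping from blocked gates to the three validity flags is an assumption made for illustration: an RT7 violation yields "rejected", any other blocked gate yields "debugging only".

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ReplayRun:
    time_policy_declared: bool     # RT0
    deterministic: bool            # RT1 (identical or within approved tolerance)
    wall_time_leak: bool           # RT2
    silent_tf_substitution: bool   # RT3
    filter_drops_explained: bool   # RT4
    zero_time_seen: bool           # RT7
    out_of_scope_clock_jumps: int  # RT7
    metadata_complete: bool        # RT7


def blocking_gates(run: ReplayRun) -> List[str]:
    """Return the IDs of gates whose block condition is met."""
    blocked: List[str] = []
    if not run.time_policy_declared:
        blocked.append("RT0")
    if not run.deterministic:
        blocked.append("RT1")
    if run.wall_time_leak:
        blocked.append("RT2")
    if run.silent_tf_substitution:
        blocked.append("RT3")
    if not run.filter_drops_explained:
        blocked.append("RT4")
    if run.zero_time_seen or run.out_of_scope_clock_jumps > 0 or not run.metadata_complete:
        blocked.append("RT7")
    return blocked


def validity_flag(run: ReplayRun) -> str:
    """Map blocked gates to an evidence validity flag (assumed policy)."""
    blocked = blocking_gates(run)
    if not blocked:
        return "release evidence"
    if "RT7" in blocked:
        return "rejected"
    return "debugging only"
```

Keeping the list of blocked gate IDs alongside the flag makes the evidence validity register auditable rather than a bare verdict.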

Operational Response

Finding	Response
Replay-only timing defect	Fix replay harness before accepting metrics; do not tune stack to invalid replay artifacts
Runtime-equivalent TF/filter defect	Create stack defect and add to timing fault regression suite
Missing timing metadata in bag/MCAP	Quarantine artifact from release evidence; repair only if original raw timing source is available
`/clock` or wall-time mismatch	Enforce launch-time checks and CI replay smoke tests
High filter drop rate under valid replay	Investigate timestamp sync, QoS, queue size, and sensor health thresholds
MCAP conversion inconsistency	Pin converter version and add conversion round-trip check to data pipeline
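
The conversion round-trip check mentioned above can be sketched as a multiset comparison of per-message timing tuples extracted before and after conversion. The sketch assumes the tuples have already been read out of each file by whatever reader the pipeline uses; `TimingRecord` and `round_trip_diff` are hypothetical names, and no MCAP or rosbag2 reader API is shown.

```python
from collections import Counter
from typing import Dict, Iterable, List, NamedTuple


class TimingRecord(NamedTuple):
    topic: str
    header_stamp_ns: int   # stamp from the message header
    log_time_ns: int       # record/log time stored with the message
    publish_time_ns: int   # publish time stored with the message


def round_trip_diff(
    original: Iterable[TimingRecord], converted: Iterable[TimingRecord]
) -> Dict[str, List[TimingRecord]]:
    """Report records lost or introduced by conversion.

    Counter subtraction keeps duplicates, so a message that appears twice
    in the original and once after conversion is reported as missing once.
    """
    before, after = Counter(original), Counter(converted)
    return {
        "missing": sorted((before - after).elements()),
        "unexpected": sorted((after - before).elements()),
    }
```

A timing field rewritten by the converter shows up as one "missing" and one "unexpected" record for the same topic, which is usually enough to identify which field changed.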

Evidence Artifacts

Artifact	Contents
Replay manifest	Bag/MCAP IDs, playback options, time policy, QoS, map/calibration/build IDs
Time trace	`/clock`, wall time, message stamps, record/log time, publish intervals
TF report	Frame tree, static transform readiness, lookup failures, cache bounds
Filter report	Sync matches, drops, queue age, slop/window, target frame, drop reasons
Determinism report	Repeated-run hashes, metric deltas, approved tolerances
Evidence validity register	Which replay artifacts are valid for release, debugging only, or rejected
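
For the TF report, each lookup failure is easier to act on when it is recorded with the frame pair, requested time, and cache bounds, and classified by extrapolation direction. The following is a minimal sketch that works on values already captured by instrumentation; `TfLookup`, `classify_lookup`, and `report_line` are hypothetical names and do not call tf2.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TfLookup:
    target_frame: str
    source_frame: str
    requested_ns: int
    oldest_ns: Optional[int]  # None when the cache holds no data for the pair
    latest_ns: Optional[int]


def classify_lookup(lookup: TfLookup) -> str:
    """Classify a lookup against the cache bounds observed at request time."""
    if lookup.oldest_ns is None or lookup.latest_ns is None:
        return "no_data"
    if lookup.requested_ns < lookup.oldest_ns:
        return "past_extrapolation"
    if lookup.requested_ns > lookup.latest_ns:
        return "future_extrapolation"
    return "ok"


def report_line(lookup: TfLookup) -> str:
    """One line per lookup, suitable for a TF report."""
    return (
        f"{lookup.target_frame}<-{lookup.source_frame} "
        f"t={lookup.requested_ns} "
        f"cache=[{lookup.oldest_ns},{lookup.latest_ns}] "
        f"{classify_lookup(lookup)}"
    )
```

Past extrapolation during burst playback typically points at the TF cache duration, while future extrapolation points at transform publication lagging the sensor stream; recording the bounds lets a reviewer tell the two apart without rerunning the replay.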

Sources

ROS 2 design, Clock and Time: https://design.ros2.org/articles/clock_and_time.html
ROS 2 Kilted documentation, tf2 MessageFilter: https://docs.ros.org/en/kilted/Tutorials/Intermediate/Tf2/Using-Stamped-Datatypes-With-Tf2-Ros-MessageFilter.html
ROS 2 Kilted documentation, message_filters: https://docs.ros.org/en/kilted/p/message_filters/
ROS 2 Kilted Approximate Time Synchronizer tutorial: https://docs.ros.org/en/kilted/p/message_filters/doc/Tutorials/Approximate-Synchronizer-Python.html
rosbag2 README and command-line options: https://github.com/ros2/rosbag2
rosbag2 storage MCAP plugin: https://github.com/ros-tooling/rosbag2_storage_mcap
MCAP specification: https://mcap.dev/spec
Foxglove MCAP documentation: https://docs.foxglove.dev/docs/visualization/mcap/
Autoware topic state monitor: https://autowarefoundation.github.io/autoware_universe/main/system/autoware_topic_state_monitor/
