Message Filters Fusion Risk Patterns

Last updated: 2026-05-09

Why It Matters

message_filters is convenient for camera/LiDAR/IMU/GNSS fusion, but it can turn timing assumptions into hidden queues. A synchronizer can create plausible but wrong sensor tuples, hold old data until TF becomes available, or drop the only scan that explains a localization event. A SLAM deployment should treat every filter as an explicit runtime contract covering input timestamps, slop, queue depth, drop policy, and replay behavior.

Synchronizer Contract

| Filter behavior | Contract | Risk if unspecified |
| --- | --- | --- |
| Exact synchronization | Use only when sensors are hardware-synchronized or generated from the same clock tick. | Starvation because stamps differ by microseconds. |
| Approximate synchronization | slop is smaller than the physical fusion tolerance and larger than measured timestamp jitter. | Fusion of stale or causally impossible samples. |
| Queue size | Queue depth is derived from input rate × maximum tolerated wait time. | Old tuples emitted after compute stalls or TF recovery. |
| Header requirement | All fusion inputs carry a valid Header.stamp; headerless inputs are adapted before the synchronizer. | Arrival-time synchronization hides transport and executor jitter. |
| Known offsets | Per-sensor fixed offsets are calibrated and configured through explicit offset handling. | Calibration delay is hidden in a large slop value. |
| Drop visibility | Every filter exports dropped, processed, queued, and age/skew metrics. | The graph appears healthy while fusion is starved. |
| TF dependency | TF wait queues are budgeted separately from message synchronization queues. | The filter emits a matched tuple that is immediately dropped by TF. |
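To make the approximate-synchronization contract concrete, the sketch below pairs two timestamped streams under a slop and a bounded queue. It is an illustration only: the real message_filters ApproximateTime policy uses an adaptive pivot algorithm and supports up to nine inputs, while this first-fit two-stream version (class name `ApproxSync` is hypothetical) only shows how slop and queue depth interact with drops.

```python
from collections import deque

class ApproxSync:
    """Minimal approximate-time pairing over two stamped streams.

    Illustration of the contract only; NOT the adaptive pivot algorithm
    that message_filters actually implements.
    """

    def __init__(self, slop, queue_size):
        self.slop = slop
        # Bounded queues: a full queue silently evicts the oldest stamp,
        # mirroring the "queue full" drop mode.
        self.queues = (deque(maxlen=queue_size), deque(maxlen=queue_size))
        self.emitted = []   # matched (stamp_a, stamp_b) tuples
        self.dropped = 0    # stamps discarded as unmatched

    def add(self, stream_idx, stamp):
        """Feed one header stamp from stream 0 or 1, then try to match."""
        self.queues[stream_idx].append(stamp)
        self._try_emit()

    def _try_emit(self):
        qa, qb = self.queues
        while qa and qb:
            a, b = qa[0], qb[0]
            if abs(a - b) <= self.slop:
                # Stamps agree within slop: emit a fused tuple.
                self.emitted.append((a, b))
                qa.popleft()
                qb.popleft()
            elif a < b:
                # Oldest stamp on stream 0 can never match: drop it.
                qa.popleft()
                self.dropped += 1
            else:
                qb.popleft()
                self.dropped += 1
```

With slop = 10 ms, stamps 4 ms apart pair up, while a 50 ms gap forces the older stamp to be dropped rather than fused.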

Risk Patterns

| Pattern | How it appears | Runtime control |
| --- | --- | --- |
| Slop as a band-aid | Increasing slop makes drops disappear, but localization gets softer or inconsistent. | Establish the maximum physical skew per sensor pair and fail configuration above it. |
| Arrival-time sync | sync_arrival_time or callback-time stamps make live tests pass until network load changes. | Synchronize on acquisition/validity stamps; use arrival time only for non-physical telemetry. |
| Headerless data | allow_headerless=true admits messages stamped with current ROS time. | Wrap data into stamped messages at the source or exclude it from physical fusion. |
| Queue hiding overload | A CPU stall builds a queue, then fusion processes old tuples after the vehicle has moved. | Apply a lifespan/freshness gate before the fusion callback; drop tuples older than the budget. |
| Asymmetric sensor rates | A 100 Hz IMU floods a queue while 10 Hz LiDAR controls tuple release. | Downsample or preintegrate high-rate streams before multi-topic synchronization. |
| Replay acceleration | At 5x replay, message filters drop or pair differently than on the vehicle. | Run acceptance replay at 1x and stress replay separately with expected drop envelopes. |
| TF after sync | The tuple matched successfully, but the transform at the tuple stamp is missing. | Treat message sync and TF readiness as one end-to-end budget with separate metrics. |
| Mixed time domains | One sensor uses wall time, another uses /clock. | Validate stamp domains at launch time and sample headers for sanity. |
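The "queue hiding overload" control above is a freshness gate that runs before the fusion callback touches a matched tuple. A minimal sketch, assuming stamps and the budget are in seconds on the same clock (the helper name `freshness_gate` is hypothetical):

```python
def freshness_gate(tuple_stamps, now, max_age_sec):
    """Decide whether a matched tuple is still fresh enough to fuse.

    tuple_stamps: header stamps of every message in the emitted tuple (sec).
    now: current ROS time at callback entry (sec, same clock as stamps).
    max_age_sec: fusion freshness budget.

    Returns (fresh, age_sec): fresh is False when the OLDEST member of
    the tuple exceeds the budget, e.g. after a CPU stall drained a queue.
    """
    oldest = min(tuple_stamps)
    age = now - oldest
    return age <= max_age_sec, age
```

The gate keys off the oldest stamp on purpose: after a compute stall, the first tuples out of the queue match each other perfectly yet describe where the vehicle was, not where it is.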

Parameter Guidance

| Parameter | Sizing rule |
| --- | --- |
| slop | max(sensor_timestamp_jitter + calibrated_offset_uncertainty + transport_jitter_seen_before_sync), capped by the fusion algorithm's physical tolerance. |
| queue_size | At least ceil(max_wait_sec * highest_input_rate_hz) + burst_margin, but bounded so stale data cannot exceed the freshness budget. |
| queue_offset | Use only for measured fixed offsets; record the calibration source and sign convention. |
| allow_headerless | false for SLAM/fusion. If unavoidable, isolate as a degraded mode and tag the outputs. |
| sync_arrival_time | false for physical sensors. Use only for diagnostics where arrival order is the fact being measured. |
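The two sizing rules above are simple arithmetic, so they can live in configuration validation rather than in someone's head. A sketch under the stated rules (function names `size_slop` and `size_queue` are hypothetical; the inputs come from your own jitter and calibration measurements):

```python
import math

def size_slop(stamp_jitter_sec, offset_uncertainty_sec,
              transport_jitter_sec, physical_tolerance_sec):
    """Sum the measured timing uncertainties, then enforce the cap:
    if the required slop exceeds what the fusion algorithm can
    physically tolerate, the configuration is invalid."""
    slop = stamp_jitter_sec + offset_uncertainty_sec + transport_jitter_sec
    if slop > physical_tolerance_sec:
        raise ValueError("required slop exceeds fusion's physical tolerance")
    return slop

def size_queue(max_wait_sec, highest_rate_hz, burst_margin,
               freshness_budget_sec):
    """ceil(rate * wait) + margin, bounded so the deepest possible
    backlog can never hold data older than the freshness budget."""
    depth = math.ceil(max_wait_sec * highest_rate_hz) + burst_margin
    staleness_cap = math.ceil(freshness_budget_sec * highest_rate_hz)
    return min(depth, staleness_cap)
```

Failing fast in `size_slop` implements the "fail config above it" control from the risk table: an over-budget slop is a calibration problem, not a parameter to tune.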

Fusion Telemetry

| Metric | Definition | Alert |
| --- | --- | --- |
| sync_tuple_skew_ms | Max header stamp minus min header stamp inside an emitted tuple. | Above configured slop or physical tolerance. |
| sync_tuple_age_ms | Callback-start ROS time minus the newest and oldest tuple stamps. | Above the fusion freshness budget. |
| sync_drop_count{input,reason} | Drops by queue full, out-of-order, stale, no TF, bad stamp. | Any sustained growth outside the tested envelope. |
| sync_queue_depth{input} | Currently queued messages per input. | Approaches the configured queue size. |
| sync_emit_rate_hz | Tuples emitted per second. | Below the expected fusion output rate. |
| input_stamp_regression_count | Header stamp decreases on a topic. | Any non-replay occurrence is a fault. |
| input_clock_domain_fault | Stamp is implausibly far from node ROS time. | Blocks fusion activation. |
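The skew, age, regression, and clock-domain metrics above are cheap to compute inline. A minimal sketch, assuming stamps in seconds on the node's ROS clock (names `tuple_metrics` and `StampMonitor` are hypothetical, not message_filters API):

```python
def tuple_metrics(header_stamps, callback_ros_time):
    """Per-tuple skew and age, in ms, per the telemetry definitions.

    header_stamps: stamps of every message in one emitted tuple (sec).
    callback_ros_time: ROS time at callback start (sec).
    """
    skew_ms = (max(header_stamps) - min(header_stamps)) * 1000.0
    age_newest_ms = (callback_ros_time - max(header_stamps)) * 1000.0
    age_oldest_ms = (callback_ros_time - min(header_stamps)) * 1000.0
    return skew_ms, age_newest_ms, age_oldest_ms

class StampMonitor:
    """Per-topic counters for stamp regressions and clock-domain faults."""

    def __init__(self, max_offset_from_node_sec):
        self.max_offset = max_offset_from_node_sec
        self.last_stamp = None
        self.regressions = 0     # input_stamp_regression_count
        self.domain_faults = 0   # input_clock_domain_fault

    def observe(self, stamp, node_time):
        if self.last_stamp is not None and stamp < self.last_stamp:
            self.regressions += 1
        # A stamp far from node time suggests wall time vs /clock mixing.
        if abs(stamp - node_time) > self.max_offset:
            self.domain_faults += 1
        self.last_stamp = stamp
```

Wiring these counters into /diagnostics or the fleet pipeline is what makes the "graph appears healthy while fusion is starved" failure mode visible.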

Replay Behavior

Message filters are sensitive to bag ordering, clock rate, and /tf warmup. Replay validation should run these cases:

| Case | Expected result |
| --- | --- |
| 1x full-rate replay | Tuple rate, skew distribution, and drops match the vehicle envelope. |
| Replay with backward seek | Queues clear before output resumes. |
| Replay with missing topic | Node enters a degraded or error state, not partial silent fusion. |
| Replay with delayed /tf | Drops are attributed to TF readiness, not generic queue full. |
| Accelerated replay | Drop behavior is measured and marked non-acceptance unless it matches 1x. |
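The 1x acceptance case reduces to comparing per-bin counts of the skew (or drop) histogram against the vehicle run. A minimal sketch, assuming both runs are bucketed into the same bins beforehand (the helper name `within_envelope` and the bin labels are hypothetical):

```python
def within_envelope(vehicle_hist, replay_hist, rel_tol):
    """Check that replay per-bin counts stay within a relative
    tolerance of the vehicle envelope.

    vehicle_hist / replay_hist: dict of bin label -> count.
    rel_tol: allowed relative deviation per bin, e.g. 0.1 for 10%.
    """
    for bin_key, vehicle_count in vehicle_hist.items():
        replay_count = replay_hist.get(bin_key, 0)
        if vehicle_count == 0:
            # A bin empty on the vehicle must stay empty in replay.
            if replay_count != 0:
                return False
            continue
        if abs(replay_count - vehicle_count) / vehicle_count > rel_tol:
            return False
    return True
```

Example: `within_envelope({"0-2ms": 100, "2-5ms": 10}, {"0-2ms": 95, "2-5ms": 11}, 0.1)` passes, while an 80-count bin against a vehicle baseline of 100 fails at the same 10% tolerance.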

Acceptance Checks

  • Every physical fusion synchronizer has a documented slop, queue_size, freshness cutoff, and drop policy.
  • allow_headerless=false and sync_arrival_time=false on production SLAM/fusion inputs.
  • Synthetic skew injection above the configured slop produces drops or degraded state, not fused output.
  • CPU-stall fault injection does not emit tuples older than the freshness budget after recovery.
  • Message filter metrics are visible in /diagnostics, /statistics, or the fleet metrics pipeline.
  • Replay at 1x reproduces tuple count and skew histograms within the accepted tolerance.

Sources

Compiled from publicly available research notes.