
Mahalanobis Distance, Chi-Square Gates, NIS, and NEES


Visual: innovation covariance ellipse with Mahalanobis distance, chi-square gate, NIS/NEES consistency, accepted/rejected measurements, and threshold tuning.

Mahalanobis distance is the distance that matters when residual dimensions have different units, scales, and correlations. Chi-square gating turns that distance into a statistically interpretable pass/fail test. In AV systems, this is the foundation for rejecting unlikely sensor associations, monitoring estimator consistency, and detecting covariance tuning errors.

Why it matters for AV, perception, SLAM, and mapping

Perception and localization are full of association decisions:

  • Does this radar return belong to that tracked vehicle?
  • Does this landmark observation match that map landmark?
  • Does this loop closure agree with the current pose graph?
  • Does this GNSS fix look plausible given the inertial prediction?
  • Is the estimator covariance consistent with the errors seen in simulation?

Raw Euclidean distance is not enough. A 0.5 m lateral error may be normal for a far camera detection and catastrophic for RTK localization. A 2 degree heading innovation may be small when stopped but large after a well-constrained turn. Mahalanobis distance normalizes residuals by the uncertainty that should explain them.

Stone Soup's data association tutorials use Mahalanobis measures and distance hypothesizers to gate cluttered detections. Kalman-filter practice uses the same quantity as normalized innovation squared (NIS). In simulation, normalized estimation error squared (NEES) checks whether the state error is statistically consistent with the reported covariance.

First-principles math

Mahalanobis distance

Let a residual be

```text
r = z - z_hat
```

with covariance

```text
S = Cov(r)
```

The squared Mahalanobis distance is

```text
d2 = r^T S^-1 r
```

The Mahalanobis distance is

```text
d = sqrt(d2)
```

If S = sigma^2 I, then

```text
d2 = ||r||^2 / sigma^2
```

So Mahalanobis distance is ordinary Euclidean distance measured in standard deviation units. With correlated covariance, it also rotates and scales the space so that residuals along uncertain directions count less than residuals along precise directions.
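
To make the scaling concrete, a minimal NumPy sketch with assumed example values for r and S:

```python
import numpy as np

# Residual between measurement and prediction (assumed example values).
r = np.array([0.5, 0.5])

# Correlated innovation covariance: the first axis is far more uncertain
# than the second.
S = np.array([[1.0, 0.3],
              [0.3, 0.1]])

# Squared Mahalanobis distance d2 = r^T S^-1 r, via a solve, not an inverse.
d2 = r @ np.linalg.solve(S, r)
d = np.sqrt(d2)

# For comparison: Euclidean distance treats both axes equally.
d_euclid = np.linalg.norm(r)
print(d, d_euclid)
```

The same 0.5-unit residual that looks small in Euclidean terms costs heavily along the tightly constrained axis, which is exactly the behavior described above.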

Why d2 is chi-square distributed

Assume the residual is Gaussian:

```text
r ~ N(0, S)
```

Let R be a square-root information matrix:

```text
R^T R = S^-1
```

Whiten the residual:

```text
e = R r
```

Then

```text
e ~ N(0, I)
d2 = e^T e = sum_j e_j^2
```

The sum of squares of m independent standard normal variables follows a chi-square distribution with m degrees of freedom:

```text
d2 ~ chi2(df = m)
```
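
The whitening argument can be checked numerically. A sketch with an assumed 2x2 covariance, using the Cholesky factor L of S (so R = L^-1 satisfies R^T R = S^-1):

```python
import numpy as np

rng = np.random.default_rng(0)
m = 2
S = np.array([[1.0, 0.3],
              [0.3, 0.5]])

# Draw many Gaussian residuals r ~ N(0, S) and whiten them.
L = np.linalg.cholesky(S)          # S = L L^T
r = rng.multivariate_normal(np.zeros(m), S, size=200_000)
e = np.linalg.solve(L, r.T).T      # e = L^-1 r, so Cov(e) = I
d2 = np.sum(e**2, axis=1)

# chi2(m) has mean m and variance 2m; the sample moments should match.
print(d2.mean(), d2.var())
```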

This gives a gate threshold for probability mass p:

```text
threshold = chi2_ppf(p, df = m)
accept if d2 <= threshold
```

Typical upper-tail gates:

| Residual dimension | 95 percent | 99 percent | 99.7 percent |
|---|---|---|---|
| 1 | 3.84 | 6.63 | 8.81 |
| 2 | 5.99 | 9.21 | 11.83 |
| 3 | 7.81 | 11.34 | 14.16 |
| 4 | 9.49 | 13.28 | 16.25 |
| 6 | 12.59 | 16.81 | 20.79 |

Do not compare a squared Mahalanobis distance to a "number of sigmas" directly. For a 2D residual, a 95 percent gate is d2 = 5.99, which corresponds to d = sqrt(5.99) ≈ 2.45, not 2.
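
The thresholds above can be reproduced with SciPy's `chi2.ppf` (assuming SciPy is available):

```python
from scipy.stats import chi2

# Upper-tail gates on d2 for the residual dimensions in the table above.
for m in (1, 2, 3, 4, 6):
    print(f"df={m}: 95% -> {chi2.ppf(0.95, df=m):.2f}, "
          f"99% -> {chi2.ppf(0.99, df=m):.2f}")

# A 2D 95 percent gate of d2 = 5.99 is about 2.45 sigma, not 2.
gate_2d = chi2.ppf(0.95, df=2)
print(gate_2d ** 0.5)
```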

Innovation covariance and NIS

For a linear Kalman measurement model

```text
z_k = H x_k + v_k,    v_k ~ N(0, R)
```

with predicted state covariance P_k|k-1, the predicted measurement covariance is

```text
S_k = H P_k|k-1 H^T + R
```

The innovation is

```text
nu_k = z_k - H x_hat_k|k-1
```

The normalized innovation squared is

```text
NIS_k = nu_k^T S_k^-1 nu_k
```

Under the filter assumptions, NIS_k is chi-square distributed with degrees of freedom equal to the measurement dimension. NIS is practical because it does not require ground-truth state; it only requires the measurement, prediction, and innovation covariance.
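
As a sketch, one NIS evaluation for an assumed 2D position measurement of a 4D constant-velocity state (all numbers are illustrative):

```python
import numpy as np

# Measurement model: observe position only from [px, py, vx, vy].
H = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0]])
P_pred = np.diag([0.5, 0.5, 1.0, 1.0])    # predicted state covariance
R = np.diag([0.25, 0.25])                 # measurement noise covariance
x_pred = np.array([10.0, 5.0, 1.0, 0.0])  # predicted state
z = np.array([10.6, 4.8])                 # received measurement

# Innovation and innovation covariance.
nu = z - H @ x_pred
S = H @ P_pred @ H.T + R

# Normalized innovation squared; gate against chi2 with df = len(z).
nis = nu @ np.linalg.solve(S, nu)
print(nis)
```

Here the NIS is well under the 2D 95 percent gate of 5.99, so the measurement would be accepted.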

NEES

In simulation or with high-quality ground truth, define state error

```text
err_k = x_k_true - x_hat_k
```

For manifold states, err_k must be computed in the correct local tangent space, not by subtracting quaternions or poses componentwise.

The normalized estimation error squared is

```text
NEES_k = err_k^T P_k^-1 err_k
```

If the estimator is consistent and P_k is the true error covariance, then

```text
NEES_k ~ chi2(df = n_x)
```

where n_x is the state error dimension. NEES tests the full state covariance; NIS tests measurement-space consistency.
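
A minimal NEES sketch for a vector-space state, where plain subtraction is a valid error (assumed truth, estimate, and covariance):

```python
import numpy as np

# Ground truth and estimate from one simulation step (example values).
x_true = np.array([10.0, 5.0, 1.0, 0.0])
x_hat  = np.array([10.2, 4.9, 1.1, 0.05])
P      = np.diag([0.04, 0.04, 0.01, 0.01])  # reported error covariance

err = x_true - x_hat
nees = err @ np.linalg.solve(P, err)

# For a consistent filter, nees behaves like one chi2 draw with df = 4.
print(nees)
```

For quaternion or pose states the `err` line must be replaced by a tangent-space difference, as noted above.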

Batch tests over time

For N independent samples with the same residual dimension m,

```text
sum_i NIS_i ~ chi2(df = N * m)
```

The average NIS is

```text
ANIS = (1 / N) sum_i NIS_i
```

Use two-sided bounds:

```text
lower = chi2_ppf(alpha / 2, df = N * m) / N
upper = chi2_ppf(1 - alpha / 2, df = N * m) / N
```

Interpretation:

  • Too high: residuals are larger than covariance predicts. Noise is underestimated, model is wrong, or outliers are entering.
  • Too low: covariance is too large, data have been selected by a gate, or residuals are over-smoothed/correlated.
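
A sketch of the two-sided bounds in SciPy, with N, m, and alpha as example choices:

```python
from scipy.stats import chi2

# Two-sided 95% consistency bounds on average NIS over N samples of a
# 2D measurement.
N, m, alpha = 100, 2, 0.05
lower = chi2.ppf(alpha / 2, df=N * m) / N
upper = chi2.ppf(1 - alpha / 2, df=N * m) / N
print(f"consistent if {lower:.2f} <= ANIS <= {upper:.2f}")
```

For a consistent filter the ANIS should sit near m = 2, and the interval tightens as N grows.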

Implementation notes

  • Compute d2 by solving a linear system, not by an explicit matrix inverse:

    ```text
    solve S y = r
    d2 = r^T y
    ```
  • Use Cholesky when S is symmetric positive definite. If Cholesky fails, the innovation covariance is not numerically valid.
  • Gate in the same residual space used by the update. If a camera factor uses bearing-only residuals, gate bearing residuals, not pixel boxes plus bearing.
  • Use the correct degrees of freedom after constraints and dropped dimensions. A range-only update has df = 1; a 2D position update has df = 2.
  • Log accepted and rejected NIS separately. A gate truncates the distribution, so NIS computed only after gating will look artificially consistent.
  • For multi-sensor fusion, monitor NIS per sensor type and per operating domain: rain, night, high speed, urban canyon, airport apron, warehouse, and tunnel.
  • For object tracking, use separate gates for proposal generation and final assignment when clutter is high. A loose proposal gate can preserve recall; later likelihood scoring can choose among candidates.
  • For graph SLAM, apply chi-square gates to candidate associations before graph insertion, but still use robust losses inside the graph because gates are not perfect.

Failure modes and diagnostics

| Symptom | Likely cause | Diagnostic |
|---|---|---|
| Many valid detections rejected | S too small or wrong df | Plot gate pass rate by range and sensor mode. |
| Too many clutter associations accepted | Gate too loose or covariance inflated | Compare accepted NIS distribution with chi-square bounds. |
| NIS high after maneuvers | Motion model missing acceleration, slip, or delay | Segment NIS by curvature, acceleration, and latency. |
| NIS high only for one sensor | Bad measurement covariance or calibration | Check raw residuals in that sensor frame. |
| NEES high but NIS normal | State covariance inconsistent but measurements fit | Inspect unobservable states, process noise, and correlations. |
| NEES low everywhere | Covariance inflated or truth mismatch | Compare RMSE to reported standard deviation. |
| NIS low after gating | Selection bias from rejecting large innovations | Evaluate pre-gate innovations in replay logs. |
| Cholesky failure in gate | Non-SPD covariance or numerical symmetry loss | Symmetrize, check eigenvalues, and trace covariance propagation. |
