GICP and VGICP for 3D SLAM and LiDAR Localization
Related library pages: Production LiDAR-to-Map Localization and Modern LiDAR SLAM and Odometry Algorithms.
Executive Summary
Generalized ICP (GICP) is the probabilistic registration method that unifies point-to-point and point-to-plane ICP. Instead of treating correspondences as exact points or using only the target normal, GICP assigns local covariance models to both source and target. The residual is weighted by the combined uncertainty of the two local surface distributions. On planar surfaces, this behaves like plane-to-plane registration; in less structured geometry, it smoothly falls back toward point-to-point behavior.
Voxelized GICP (VGICP) keeps the GICP objective but replaces expensive nearest-neighbor and per-point target covariance work with voxelized Gaussian target distributions. Koide et al. reported 30 Hz CPU and 120 Hz GPU processing for 15,000-point clouds in the ICRA 2021 VGICP paper. The open-source fast_gicp implementation provides multi-threaded GICP, multi-threaded VGICP, CUDA VGICP, and CUDA NDT-style registration. small_gicp is a newer compact implementation with cleaner interfaces and strong CPU scalability.
For AV localization, GICP/VGICP is usually the best classical scan-to-map registration family when a dense LiDAR map is available. It is more accurate and physically meaningful than point-to-point ICP, less dependent on hand-chosen target normals than point-to-plane ICP, and faster than classic GICP when voxelized and parallelized. Its main risks are local minima, covariance degeneracy, map mismatch from dynamic objects, and overconfidence in weak geometry.
Math / Objective
For a source point (p_i) and corresponding target point or voxel (q_i), define:
d_i(T) = q_i - T p_iEach source point has covariance (C_i^P), each target point or voxel has covariance (C_i^Q). GICP minimizes the Mahalanobis residual:
min_T sum_i d_i(T)^T [ C_i^Q + R C_i^P R^T ]^-1 d_i(T)where (T = (R,t)). More generally:
M_i(T) = C_i^Q + R C_i^P R^T
r_i(T) = d_i(T)^T M_i(T)^-1 d_i(T)Interpretation:
- If both covariances are isotropic, GICP reduces toward point-to-point ICP.
- If the covariances represent locally planar surfaces with high variance tangent to the surface and low variance normal to it, GICP behaves like plane-to-plane ICP.
- If only target covariance is anisotropic, the behavior approaches point-to-plane ICP.
Covariance construction:
- Estimate local covariance from neighbors around each point.
- Eigendecompose the covariance.
- Regularize eigenvalues so the normal direction is small but nonzero.
- Store the covariance in each point's local frame or global frame.
A typical planar regularization:
C = R_eig diag(epsilon, 1, 1) R_eig^Twhere epsilon is small in the surface normal direction. The exact ordering depends on the eigenvector convention.
VGICP changes target representation. Instead of finding a target point with a per-point covariance, it voxelizes the target/map and aggregates distributions in each voxel:
voxel mean: mu_v = average(q_j in voxel v)
voxel covariance: C_v = aggregate local point covariances and/or positions
residual: d_i = mu_v - T p_i
objective: d_i^T [ C_v + R C_i^P R^T ]^-1 d_iThis removes repeated target neighbor searches and makes residual computation parallel across source points.
Algorithm Pipeline
Classic GICP
Preprocess source and target
- Deskew live scan.
- Remove dynamic or low-quality points.
- Downsample to a resolution that preserves local geometry.
Estimate local covariances
- For each source and target point, collect k-nearest neighbors or voxel neighbors.
- Compute local covariance.
- Regularize eigenvalues to avoid singular matrices.
- Cache target covariances for static maps.
Initialize pose
- Use state estimator prediction, GNSS/RTK, wheel odometry, NDT coarse result, or previous scan-to-map output.
Correspondence assignment
- Transform source points.
- Find nearest target point or compatible neighborhood.
- Reject by max distance, normal/covariance compatibility, and robust residual.
Nonlinear optimization
- Minimize the GICP cost with Gauss-Newton, Levenberg-Marquardt, or BFGS-style optimization.
- Recompute correspondences at each outer iteration.
- Export pose, cost, correspondence count, Hessian, and convergence reason.
VGICP
Build or load target voxel map
- For static scan-to-map localization, precompute voxel means and covariances per map tile.
- Keep multiple resolutions if using coarse-to-fine matching.
- Store voxel occupancy, point count, covariance, normal/planarity, semantic stability, and map version.
Voxelize source scan
- Downsample source points.
- Estimate or approximate source covariances.
- Optional: one representative per voxel for speed, or multiple points for accuracy.
Voxel association
- Transform each source point into the target grid.
- Associate to the target voxel at the transformed coordinate and optionally neighboring voxels.
- Use nearest voxel, multi-voxel averaging, or trilinear interpolation depending on implementation.
Parallel residual evaluation
- Compute Mahalanobis residuals independently for each point.
- Reduce the cost, gradient, and Hessian in parallel.
- Run CPU multi-threading or CUDA kernels.
Quality gate
- Require enough valid voxel associations.
- Require reasonable cost, spatial coverage, and Hessian conditioning.
- Inflate uncertainty or reject if map covariance is weak or dynamic mismatch is high.
Initialization / Convergence
GICP and VGICP remain local optimizers. Their objective is better shaped than plain ICP in many scenes, but wrong correspondences still create wrong minima.
Good initialization practices:
- Use the fused estimator pose as the default initial guess.
- Use multi-resolution schedules: large voxels/coarse thresholds first, smaller voxels/fine thresholds last.
- Use NDT or place recognition for cold start and global relocalization.
- Limit per-cycle correction relative to prediction unless relocalization mode is active.
- For scan-to-map, crop target tiles around the predicted pose to avoid far-away false associations.
Convergence behavior:
| Property | Classic GICP | VGICP |
|---|---|---|
| Objective quality | High | High, with voxel approximation |
| Correspondence cost | High due nearest-neighbor search | Low with voxel lookup |
| Parallelism | Moderate | Excellent |
| Sensitivity to voxel size | N/A or downsampling only | High |
| Static map reuse | Good | Excellent |
| Embedded deployment | CPU feasible but heavier | CPU/GPU production-friendly |
Tuning knobs:
- Source downsampling resolution.
- Target voxel resolution.
- Number of neighboring voxels searched.
- Covariance neighborhood size.
- Covariance regularization epsilon.
- Max correspondence distance.
- Robust kernel.
- Maximum iterations and convergence thresholds.
Common convergence failure modes:
- Initial pose outside the local basin.
- Voxel resolution too coarse, causing lost structure.
- Voxel resolution too fine, causing sparse invalid covariances.
- Dynamic objects dominating correspondences.
- Covariance regularization too aggressive or too weak.
- Map tile crop missing useful structure.
Degeneracy
GICP/VGICP model local surfaces better than point-to-point ICP, but they do not remove fundamental observability limits.
Degenerate cases:
- Flat open apron: ground covariances constrain z/roll/pitch, weakly constrain x/y/yaw.
- Long terminal wall: wall-normal translation is strong, along-wall translation is weak.
- Symmetric gate structures: multiple local minima with similar covariance patterns.
- Sparse distant structure: covariance estimates are noisy and over-smoothed.
- Dynamic aircraft: rich geometry exists, but it may be absent from the reference map.
Diagnostics:
H = final Gauss-Newton Hessian / information matrix
lambda_min / lambda_max -> condition ratio
eigenvectors(lambda small) -> weak pose directions
valid voxel association ratio -> map overlap
normal/covariance entropy -> structural diversity
residual by semantic class -> dynamic-object contamination
residual by LiDAR ID -> extrinsic/time-offset problemsHandling strategy:
- Reject or downweight ground-only solutions for horizontal localization.
- Inflate factor covariance in weak eigen-directions.
- Use robust losses and semantic masks for aircraft, vehicles, pedestrians, and temporary equipment.
- Require map overlap across multiple azimuth sectors and height bands.
- Fuse with wheel odometry, IMU, and RTK/GNSS; do not rely on LiDAR in unobservable directions.
- Record degeneracy state as part of the localization health output.
Important: GICP residuals can look statistically well modeled even when the scene itself is unobservable. The covariance model describes local surface uncertainty, not global pose uniqueness.
Runtime
Classic GICP is heavier than ICP because it needs local covariance estimation and Mahalanobis residuals. VGICP is designed to remove that cost from the hot path.
Runtime comparison:
| Method | Main cost | Typical production implication |
|---|---|---|
| Point-to-point ICP | nearest-neighbor search | Fast but less accurate on surfaces |
| Point-to-plane ICP | nearest-neighbor + normals | Fast if normals are precomputed |
| Classic GICP | nearest-neighbor + covariances + nonlinear solve | Accurate but can be slow |
| VGICP | voxel lookup + parallel Mahalanobis residuals | Best speed/accuracy tradeoff for dense maps |
The VGICP paper reports processing 15,000-point clouds at 30 Hz on CPU and 120 Hz on GPU. The fast_gicp repository reports approximate rates for its implementations: FastGICP around 40 FPS, FastVGICP around 70 FPS, CUDA FastVGICP around 120 FPS, and CUDA NDT around 500 FPS under the authors' benchmark conditions. Treat those as implementation benchmarks, not guaranteed vehicle-runtime numbers.
Runtime factors for AV deployment:
- Map tile size and memory layout.
- Voxel resolution and number of active voxels.
- Number of source points retained after downsampling.
- Whether source covariances are computed online or approximated.
- GPU memory transfer overhead.
- Multi-LiDAR merge versus per-LiDAR matching.
- Number of hypotheses for relocalization.
On an Orin-class platform, VGICP is appropriate for a 10 Hz LiDAR localization loop if buffers are preallocated, map tiles are resident, and point counts are bounded.
AV Role
GICP/VGICP is well suited to production AV localization:
- Primary scan-to-map matcher: dense LiDAR scan against prebuilt voxel/covariance map.
- LiDAR odometry: scan-to-submap registration when map localization is unavailable.
- Map construction: align repeated survey passes with surface-aware residuals.
- Graph factors: provide a pose measurement and information matrix for GTSAM/iSAM2.
- Multi-LiDAR fusion: either merge scans into one source cloud or run per-LiDAR VGICP and fuse factors.
- Health monitoring: expose covariance, overlap, and degeneracy metrics.
VGICP is especially attractive for airside autonomy because the map can be preprocessed offline into a tiled voxel-Gaussian representation. The runtime system then does bounded lookup and optimization instead of expensive target geometry estimation.
Indoor / Outdoor Relevance
| Environment | Relevance | Notes |
|---|---|---|
| Warehouses | Very high | Stable planar racks/walls; dynamic forklifts need filtering |
| Parking garages | Very high | Good surfaces; repeated bays require good initialization |
| Urban roads | Very high | Facades, curbs, poles, and road surfaces fit covariance modeling |
| Highways | Medium | Barriers/signs help; open lanes remain weak |
| Forests | Medium | Tree trunks help; foliage covariances are noisy/dynamic |
| Mines/tunnels | High but degenerate | Good local shape, weak along-axis direction |
| Airports | Very high near structures, medium overall | Excellent around terminals, weak on empty aprons |
| Indoor handheld | High with deskew | Continuous-time handling may matter more than objective choice |
Airside Deployment Notes
Recommended airside VGICP map representation:
Voxel:
mean_xyz
covariance_3x3
inverse_covariance_3x3
point_count
planarity / linearity / scattering
semantic class
stability score from multi-session mapping
last map version / tile IDDeployment considerations:
- Ground management: Keep ground for z/roll/pitch, but cap its contribution so it does not dominate horizontal updates.
- Stability scoring: Static terminal facades and fixed poles should carry higher weight than areas often occupied by aircraft or GSE.
- Dynamic masks: Segment aircraft, buses, baggage carts, belt loaders, people, cones, and temporary barriers when possible.
- Map versioning: VGICP should report which map tile version produced each factor so localization failures can be tied to stale maps.
- Multi-LiDAR calibration: Residual-by-sensor diagnostics are mandatory. One bad extrinsic can bias the combined covariance field.
- GPU determinism: Preallocate CUDA buffers; bound source points; avoid runtime allocation and unbounded neighbor searches.
- Fallbacks: NDT is a useful coarse/fallback method; point-to-plane or KISS-ICP-style odometry can support degraded mode.
Airside zones:
| Zone | Expected VGICP behavior |
|---|---|
| Terminal frontage | Strong 6-DoF constraints from facades, jet bridges, poles, signs |
| Occupied stand | Good live geometry but map mismatch risk from movable aircraft/GSE |
| Empty apron | Weak x/y/yaw; rely on RTK/wheel/IMU and inflate LiDAR covariance |
| Service roads | Usually good if buildings/barriers/signage are visible |
| Taxiway/runway edge | Medium; edge lights/signs help but long open stretches remain weak |
Benchmarks
Benchmark both accuracy and production behavior.
Method benchmarks:
- Compare point-to-point ICP, point-to-plane ICP, GICP, VGICP, and NDT on the same scan pairs.
- Sweep voxel size, covariance neighborhood, robust kernel, and source point count.
- Measure convergence basin by applying controlled perturbations in x/y/z/roll/pitch/yaw.
- Test map mismatch by removing/adding aircraft and GSE clusters.
- Test degeneracy by selecting ground-only and facade-only map crops.
Public datasets:
- KITTI and KITTI-360 for automotive outdoor odometry/localization.
- MulRan for urban repeated structures.
- Newer College for indoor-outdoor variation.
- NCLT for long-term changes.
- Hilti SLAM for aggressive motion and indoor complexity.
Airside private benchmark requirements:
| Metric | Requirement |
|---|---|
| Lateral position error | Track separately from 3D RMSE |
| Yaw error | Critical for docking and lane following |
| Covariance calibration | Error should match reported uncertainty |
| Degeneracy recall | Detector should trigger before large pose drift |
| Runtime p99 | Must fit localization cycle budget |
| Map-change robustness | Dynamic aircraft/GSE should not silently corrupt pose |
| Recovery | Relocalization after GNSS/scan-match dropout |
Useful acceptance pattern:
structured zones: VGICP should dominate estimator corrections
open apron: VGICP should report weak horizontal observability
dynamic stands: VGICP should reject or downweight moving/non-map objectsOpen-Source Implementations
| Implementation | Methods | Notes |
|---|---|---|
PCL GeneralizedIterativeClosestPoint | GICP | Mature C++ baseline |
Open3D RegistrationGeneralizedICP | GICP | Good research/prototyping API |
koide3/fast_gicp | FastGICP, FastVGICP, CUDA VGICP, CUDA NDT | Widely used high-performance implementation |
koide3/small_gicp | GICP, VGICP, ICP, point-to-plane | Newer compact C++/Python library with strong threading |
| GLIM / hdl_graph_slam ecosystem | GICP/VGICP factors | Relevant for graph-based mapping and localization |
| Autoware integrations | Usually NDT primary, GICP variants in ecosystem | Useful production reference patterns |
Production selection notes:
- Use
fast_gicporsmall_gicpfor practical evaluation before writing custom kernels. - Prefer a library that exposes convergence status, Hessian/information, residuals, and correspondence counts.
- Check license compatibility before embedding in product code.
- Keep a deterministic wrapper around all third-party registration calls.
Practical Recommendation
Use VGICP as the primary classical scan-to-map matcher for a dense 3D LiDAR map when GPU or strong multi-core CPU budget is available.
Recommended stack:
- Precompute tiled voxel-Gaussian maps from stable multi-session survey data.
- Run multi-resolution VGICP initialized by the fused estimator pose.
- Use robust kernels and dynamic-object masks.
- Export full quality diagnostics and anisotropic covariance to the factor graph.
- Fall back to NDT for coarse recovery and point-to-plane/KISS-ICP-style odometry for degraded operation.
- In open apron zones, let wheel/IMU/RTK dominate horizontal pose unless VGICP observes vertical structure with good spatial coverage.
Do not deploy GICP/VGICP as a black box that returns only a pose and scalar score. The method is strong enough for production only when covariance, degeneracy, overlap, and map-change diagnostics are first-class outputs.
Sources
- Segal, A., Haehnel, D., and Thrun, S. (2009). "Generalized-ICP." Robotics: Science and Systems. DOI:
10.15607/RSS.2009.V.021. https://www.roboticsproceedings.org/rss05/p21.html - Segal, A., Haehnel, D., and Thrun, S. "Generalized-ICP" PDF. https://www.robots.ox.ac.uk/~avsegal/resources/papers/Generalized_ICP.pdf
- Koide, K., Yokozuka, M., Oishi, S., and Banno, A. (2021). "Voxelized GICP for Fast and Accurate 3D Point Cloud Registration." ICRA. DOI:
10.1109/ICRA48506.2021.9560835. https://staff.aist.go.jp/k.koide/assets/pdf/icra2021_02.pdf fast_gicprepository. https://github.com/koide3/fast_gicpsmall_gicprepository and docs. https://github.com/koide3/small_gicp- PCL
GeneralizedIterativeClosestPointdocumentation. https://pointclouds.org/documentation/classpcl_1_1_generalized_iterative_closest_point.html - Open3D Generalized ICP API. https://open3d.org/docs/latest/cpp_api/_generalized_i_c_p_8h.html
- Koide, K. et al. (2021). "Globally Consistent 3D LiDAR Mapping with GPU-accelerated GICP Matching Cost Factors." arXiv. https://arxiv.org/abs/2109.07073