GS-CPR: Efficient Camera Pose Refinement via 3D Gaussian Splatting
- URL: http://arxiv.org/abs/2408.11085v3
- Date: Wed, 05 Feb 2025 10:25:44 GMT
- Title: GS-CPR: Efficient Camera Pose Refinement via 3D Gaussian Splatting
- Authors: Changkun Liu, Shuai Chen, Yash Bhalgat, Siyan Hu, Ming Cheng, Zirui Wang, Victor Adrian Prisacariu, Tristan Braud,
- Abstract summary: We propose a novel test-time camera pose refinement (CPR) framework, GS-CPR.
This framework enhances the localization accuracy of state-of-the-art absolute pose regression and scene coordinate regression methods.
The 3DGS model renders high-quality synthetic images and depth maps to facilitate the establishment of 2D-3D correspondences.
- Score: 25.780452115246245
- License:
- Abstract: We leverage 3D Gaussian Splatting (3DGS) as a scene representation and propose a novel test-time camera pose refinement (CPR) framework, GS-CPR. This framework enhances the localization accuracy of state-of-the-art absolute pose regression and scene coordinate regression methods. The 3DGS model renders high-quality synthetic images and depth maps to facilitate the establishment of 2D-3D correspondences. GS-CPR obviates the need for training feature extractors or descriptors by operating directly on RGB images, utilizing the 3D foundation model, MASt3R, for precise 2D matching. To improve the robustness of our model in challenging outdoor environments, we incorporate an exposure-adaptive module within the 3DGS framework. Consequently, GS-CPR enables efficient one-shot pose refinement given a single RGB query and a coarse initial pose estimation. Our proposed approach surpasses leading NeRF-based optimization methods in both accuracy and runtime across indoor and outdoor visual localization benchmarks, achieving new state-of-the-art accuracy on two indoor datasets. The project page is available at https://gsloc.active.vision.
Related papers
- GS2Pose: Two-stage 6D Object Pose Estimation Guided by Gaussian Splatting [4.465134753953128]
This paper proposes a new method for accurate and robust 6D pose estimation of novel objects, named GS2Pose.
By introducing 3D Gaussian splatting, GS2Pose can utilize the reconstruction results without requiring a high-quality CAD model.
The code for GS2Pose will soon be released on GitHub.
arXiv Detail & Related papers (2024-11-06T10:07:46Z) - No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images [100.80376573969045]
NoPoSplat is a feed-forward model capable of reconstructing 3D scenes parameterized by 3D Gaussians from multi-view images.
Our model achieves real-time 3D Gaussian reconstruction during inference.
This work makes significant advances in pose-free generalizable 3D reconstruction and demonstrates its applicability to real-world scenarios.
arXiv Detail & Related papers (2024-10-31T17:58:22Z) - PF3plat: Pose-Free Feed-Forward 3D Gaussian Splatting [54.7468067660037]
PF3plat sets a new state-of-the-art across all benchmarks, supported by comprehensive ablation studies validating our design choices.
Our framework capitalizes on fast speed, scalability, and high-quality 3D reconstruction and view synthesis capabilities of 3DGS.
arXiv Detail & Related papers (2024-10-29T15:28:15Z) - GSplatLoc: Grounding Keypoint Descriptors into 3D Gaussian Splatting for Improved Visual Localization [1.4466437171584356]
3D Gaussian Splatting (3DGS) allows for the compact encoding of both 3D geometry and scene appearance with its spatial features.
We propose distilling dense keypoint descriptors into 3DGS to improve the model's spatial understanding.
Our approach surpasses state-of-the-art Neural Render Pose (NRP) methods, including NeRFMatch and PNeRFLoc.
arXiv Detail & Related papers (2024-09-24T23:18:32Z) - LP-3DGS: Learning to Prune 3D Gaussian Splatting [71.97762528812187]
We propose learning-to-prune 3DGS, where a trainable binary mask is applied to the importance score that can find optimal pruning ratio automatically.
Experiments have shown that LP-3DGS consistently produces a good balance that is both efficient and high quality.
arXiv Detail & Related papers (2024-05-29T05:58:34Z) - GGRt: Towards Pose-free Generalizable 3D Gaussian Splatting in Real-time [112.32349668385635]
GGRt is a novel approach to generalizable novel view synthesis that alleviates the need for real camera poses.
As the first pose-free generalizable 3D-GS framework, GGRt achieves inference at $ge$ 5 FPS and real-time rendering at $ge$ 100 FPS.
arXiv Detail & Related papers (2024-03-15T09:47:35Z) - Improving Robustness for Joint Optimization of Camera Poses and
Decomposed Low-Rank Tensorial Radiance Fields [26.4340697184666]
We propose an algorithm that allows joint refinement of camera pose and scene geometry represented by decomposed low-rank tensor.
We also propose techniques of smoothed 2D supervision, randomly scaled kernel parameters, and edge-guided loss mask.
arXiv Detail & Related papers (2024-02-20T18:59:02Z) - GS-SLAM: Dense Visual SLAM with 3D Gaussian Splatting [51.96353586773191]
We introduce textbfGS-SLAM that first utilizes 3D Gaussian representation in the Simultaneous Localization and Mapping system.
Our method utilizes a real-time differentiable splatting rendering pipeline that offers significant speedup to map optimization and RGB-D rendering.
Our method achieves competitive performance compared with existing state-of-the-art real-time methods on the Replica, TUM-RGBD datasets.
arXiv Detail & Related papers (2023-11-20T12:08:23Z) - Neural Refinement for Absolute Pose Regression with Feature Synthesis [33.2608395824548]
Absolute Pose Regression (APR) methods use deep neural networks to directly regress camera poses from RGB images.
In this work, we propose a test-time refinement pipeline that leverages implicit geometric constraints.
We also introduce a novel Neural Feature Synthesizer (NeFeS) model, which encodes 3D geometric features during training and directly renders dense novel view features at test time to refine APR methods.
arXiv Detail & Related papers (2023-03-17T16:10:50Z) - Lightweight Multi-View 3D Pose Estimation through Camera-Disentangled
Representation [57.11299763566534]
We present a solution to recover 3D pose from multi-view images captured with spatially calibrated cameras.
We exploit 3D geometry to fuse input images into a unified latent representation of pose, which is disentangled from camera view-points.
Our architecture then conditions the learned representation on camera projection operators to produce accurate per-view 2d detections.
arXiv Detail & Related papers (2020-04-05T12:52:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.