GSVisLoc: Generalizable Visual Localization for Gaussian Splatting Scene Representations
- URL: http://arxiv.org/abs/2508.18242v1
- Date: Mon, 25 Aug 2025 17:31:57 GMT
- Title: GSVisLoc: Generalizable Visual Localization for Gaussian Splatting Scene Representations
- Authors: Fadi Khatib, Dror Moran, Guy Trostianetsky, Yoni Kasten, Meirav Galun, Ronen Basri
- Abstract summary: We introduce GSVisLoc, a visual localization method designed for 3D Gaussian Splatting (3DGS) scene representations. We accomplish this by robustly matching scene features to image features. Our algorithm proceeds in three steps: coarse matching, then fine matching, and finally pose refinement for an accurate final estimate.
- Score: 20.526639308216755
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce GSVisLoc, a visual localization method designed for 3D Gaussian Splatting (3DGS) scene representations. Given a 3DGS model of a scene and a query image, our goal is to estimate the camera's position and orientation. We accomplish this by robustly matching scene features to image features. Scene features are produced by downsampling and encoding the 3D Gaussians, while image features are obtained by encoding image patches. Our algorithm proceeds in three steps: coarse matching, followed by fine matching, and finally pose refinement for an accurate final estimate. Importantly, our method leverages the explicit 3DGS scene representation for visual localization without requiring modifications, retraining, or additional reference images. We evaluate GSVisLoc on both indoor and outdoor scenes, demonstrating competitive localization performance on standard benchmarks while outperforming existing 3DGS-based baselines. Moreover, our approach generalizes effectively to novel scenes without additional training.
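The coarse-to-fine matching idea described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's actual pipeline: the feature encoders, dimensions, and the similarity threshold are placeholder assumptions, and the fine-matching and pose-refinement stages are reduced to a confidence filter that outputs 2D-3D correspondences (which a real system would feed to PnP and refinement).

```python
# Hedged sketch of coarse feature matching between downsampled Gaussian
# features and image-patch features, in the spirit of GSVisLoc. All inputs
# are placeholders; the paper's learned encoders are not reproduced here.
import numpy as np

def cosine_match(scene_feats, image_feats):
    """Match each image patch to its nearest scene feature by cosine similarity."""
    s = scene_feats / np.linalg.norm(scene_feats, axis=1, keepdims=True)
    i = image_feats / np.linalg.norm(image_feats, axis=1, keepdims=True)
    sim = i @ s.T                       # (num_patches, num_gaussians)
    return sim.argmax(axis=1), sim.max(axis=1)

def localize(gaussian_centers, scene_feats, image_feats, patch_coords,
             threshold=0.5):
    """Coarse matching -> keep confident matches -> 2D-3D correspondences.

    A real pipeline would re-match at finer resolution and then solve
    PnP + pose refinement on the surviving correspondences.
    """
    idx, score = cosine_match(scene_feats, image_feats)
    keep = score > threshold
    matches_3d = gaussian_centers[idx[keep]]   # 3D points (Gaussian centers)
    matches_2d = patch_coords[keep]            # matching 2D patch locations
    return matches_3d, matches_2d
```

The returned 2D-3D pairs are exactly what a standard PnP solver consumes, which is why the coarse stage only needs to be approximately correct: outliers can be rejected downstream.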
Related papers
- DIP-GS: Deep Image Prior For Gaussian Splatting Sparse View Recovery [31.43307762723943]
3D Gaussian Splatting (3DGS) is a leading 3D scene reconstruction method, obtaining high-quality reconstruction with real-time rendering performance. While achieving superior performance in the presence of many views, 3DGS struggles with sparse view reconstruction, where the input views are sparse, do not fully cover the scene, and have low overlap. In this paper, we propose DIP-GS, a Deep Image Prior (DIP) 3DGS representation.
arXiv Detail & Related papers (2025-08-10T14:47:32Z) - Gaussian Splatting Feature Fields for Privacy-Preserving Visual Localization [29.793562435104707]
We propose a scene representation for visual localization that combines an explicit geometry model (3DGS) with an implicit feature field. We use a 3D structure-informed clustering procedure to regularize the representation learning and seamlessly convert the features to segmentations. The resulting privacy- and non-privacy-preserving localization pipelines, evaluated on multiple real-world datasets, show state-of-the-art performances.
arXiv Detail & Related papers (2025-07-31T13:58:15Z) - SGLoc: Semantic Localization System for Camera Pose Estimation from 3D Gaussian Splatting Representation [9.77843053500054]
We propose SGLoc, a novel localization system that directly regresses camera poses from 3D Gaussian Splatting (3DGS) representation by leveraging semantic information. Our method utilizes the semantic relationship between 2D image and 3D scene representation to estimate the 6DoF pose without prior pose information.
arXiv Detail & Related papers (2025-07-16T08:39:08Z) - GaussianVLM: Scene-centric 3D Vision-Language Models using Language-aligned Gaussian Splats for Embodied Reasoning and Beyond [56.677984098204696]
Multimodal language models are driving the development of 3D Vision-Language Models (VLMs). We propose a scene-centric 3D VLM for 3D Gaussian splat scenes that employs language- and task-aware scene representations. We present the first Gaussian splatting-based VLM, leveraging photorealistic 3D representations derived from standard RGB images.
arXiv Detail & Related papers (2025-07-01T15:52:59Z) - SceneSplat++: A Large Dataset and Comprehensive Benchmark for Language Gaussian Splatting [104.83629308412958]
3D Gaussian Splatting (3DGS) serves as a highly performant and efficient encoding of scene geometry, appearance, and semantics. We propose the first large-scale benchmark that systematically assesses three groups of methods directly in 3D space. Results demonstrate a clear advantage of the generalizable paradigm, particularly in relaxing the scene-specific limitation.
arXiv Detail & Related papers (2025-06-10T11:52:45Z) - OVGaussian: Generalizable 3D Gaussian Segmentation with Open Vocabularies [112.80292725951921]
OVGaussian is a generalizable Open-Vocabulary 3D semantic segmentation framework based on the 3D Gaussian representation. We first construct a large-scale 3D scene dataset based on 3DGS, dubbed SegGaussian, which provides detailed semantic and instance annotations for both Gaussian points and multi-view images. To promote semantic generalization across scenes, we introduce Generalizable Semantic Rasterization (GSR), which leverages a…
arXiv Detail & Related papers (2024-12-31T07:55:35Z) - LoGS: Visual Localization via Gaussian Splatting with Fewer Training Images [7.363332481155945]
This paper presents a vision-based localization pipeline utilizing the 3D Gaussian Splatting (GS) technique as scene representation.
During the mapping phase, structure-from-motion (SfM) is applied first, followed by the generation of a GS map.
High-precision pose is achieved through an analysis-by-synthesis manner on the map.
arXiv Detail & Related papers (2024-10-15T11:17:18Z) - GSplatLoc: Grounding Keypoint Descriptors into 3D Gaussian Splatting for Improved Visual Localization [1.4466437171584356]
We propose a two-stage procedure that integrates dense and robust keypoint descriptors from the lightweight XFeat feature extractor into 3DGS. In the second stage, the initial pose estimate is refined by minimizing the rendering-based photometric warp loss. Benchmarking on widely used indoor and outdoor datasets demonstrates improvements over recent neural rendering-based localization methods.
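The second-stage idea above (refining a pose by minimizing a rendering-based photometric loss) can be illustrated with a toy example. Everything here is a stand-in: the "renderer" is a 1D Gaussian blob whose position plays the role of the pose, and the optimizer is plain numerical gradient descent, not the paper's differentiable 3DGS rasterizer.

```python
# Toy photometric pose refinement: gradient descent on an L2 loss between a
# rendered image and the query image. The renderer is a placeholder, not 3DGS.
import numpy as np

def render(pose, width=64):
    """Placeholder renderer: a 1D Gaussian blob whose center tracks the pose."""
    x = np.arange(width)
    return np.exp(-0.5 * ((x - pose) / 4.0) ** 2)

def refine_pose(pose, target, lr=1.0, steps=100, eps=1e-3):
    """Minimize the photometric L2 loss with a central-difference gradient."""
    for _ in range(steps):
        loss_plus = np.sum((render(pose + eps) - target) ** 2)
        loss_minus = np.sum((render(pose - eps) - target) ** 2)
        grad = (loss_plus - loss_minus) / (2 * eps)
        pose -= lr * grad
    return pose
```

As in the real setting, this only works when the initial pose is close enough that the rendered and query images overlap; outside that basin the photometric gradient vanishes, which is why a coarse matching stage precedes refinement.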
arXiv Detail & Related papers (2024-09-24T23:18:32Z) - SplatLoc: 3D Gaussian Splatting-based Visual Localization for Augmented Reality [50.179377002092416]
We propose an efficient visual localization method capable of high-quality rendering with fewer parameters.
Our method achieves superior or comparable rendering and localization performance to state-of-the-art implicit-based visual localization approaches.
arXiv Detail & Related papers (2024-09-21T08:46:16Z) - LP-3DGS: Learning to Prune 3D Gaussian Splatting [71.97762528812187]
We propose learning-to-prune 3DGS, where a trainable binary mask is applied to the importance score that can find optimal pruning ratio automatically.
Experiments have shown that LP-3DGS consistently produces a good balance that is both efficient and high quality.
arXiv Detail & Related papers (2024-05-29T05:58:34Z) - SceneGraphFusion: Incremental 3D Scene Graph Prediction from RGB-D Sequences [76.28527350263012]
We propose a method to incrementally build up semantic scene graphs from a 3D environment given a sequence of RGB-D frames.
We aggregate PointNet features from primitive scene components by means of a graph neural network.
Our approach outperforms 3D scene graph prediction methods by a large margin and its accuracy is on par with other 3D semantic and panoptic segmentation methods while running at 35 Hz.
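The aggregation step described above (combining PointNet features from scene components via a graph neural network) can be sketched as one message-passing round. The features, edges, and mean-aggregation rule are illustrative assumptions; the paper's trained network and attention mechanism are not reproduced.

```python
# Minimal sketch of neighborhood feature aggregation, in the spirit of
# SceneGraphFusion: average neighbor features (standing in for PointNet
# outputs) and concatenate with each node's own feature. Placeholder only.
import numpy as np

def aggregate(node_feats, edges):
    """One message-passing step over a directed edge list (src, dst)."""
    n, d = node_feats.shape
    agg = np.zeros((n, d))
    deg = np.zeros(n)
    for src, dst in edges:
        agg[dst] += node_feats[src]
        deg[dst] += 1
    nonzero = deg > 0
    agg[nonzero] /= deg[nonzero, None]       # mean over incoming neighbors
    return np.concatenate([node_feats, agg], axis=1)   # (n, 2d)
```

Stacking several such rounds lets each node's feature incorporate increasingly distant scene context before the per-node classification step.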
arXiv Detail & Related papers (2021-03-27T13:00:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.