GaussExplorer: 3D Gaussian Splatting for Embodied Exploration and Reasoning
- URL: http://arxiv.org/abs/2601.13132v1
- Date: Mon, 19 Jan 2026 15:17:58 GMT
- Title: GaussExplorer: 3D Gaussian Splatting for Embodied Exploration and Reasoning
- Authors: Kim Yu-Ji, Dahye Lee, Kim Jun-Seong, GeonU Kim, Nam Hyeon-Woo, Yongjin Kwon, Yu-Chiang Frank Wang, Jaesung Choe, Tae-Hyun Oh
- Abstract summary: GaussExplorer is a framework for embodied exploration and reasoning built on 3D Gaussian Splatting (3DGS). We introduce Vision-Language Models (VLMs) on top of 3DGS to enable question-driven exploration and reasoning within 3D scenes.
- Score: 55.826192239140596
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present GaussExplorer, a framework for embodied exploration and reasoning built on 3D Gaussian Splatting (3DGS). While prior approaches to language-embedded 3DGS have made meaningful progress in aligning text queries with Gaussian embeddings, they are generally optimized for relatively simple queries and struggle to interpret more complex, compositional language queries. Alternative studies based on object-centric RGB-D structured memories provide spatial grounding but are constrained by pre-fixed viewpoints. To address these issues, GaussExplorer introduces Vision-Language Models (VLMs) on top of 3DGS to enable question-driven exploration and reasoning within 3D scenes. We first identify the pre-captured images that are most correlated with the query question, and subsequently adjust them into novel viewpoints to more accurately capture visual information for better reasoning by VLMs. Experiments show that our method outperforms existing methods on several benchmarks, demonstrating the effectiveness of integrating VLM-based reasoning with 3DGS for embodied tasks.
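The first step described in the abstract, picking the pre-captured images most correlated with the question, can be illustrated with a minimal sketch. The abstract does not say how correlation is measured, so this assumes (hypothetically) that the question and each image have already been embedded into a shared vector space by some vision-language encoder, and that cosine similarity is the score. The function names and the toy vectors are illustrative only and are not taken from the paper; the later viewpoint-adjustment step is not shown.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k_views(query_emb, image_embs, k=2):
    """Return indices of the k images most similar to the query, best first.

    Assumption: embeddings come from a shared text-image space;
    the scoring rule here is a stand-in, not the paper's method.
    """
    scored = [(cosine(query_emb, emb), i) for i, emb in enumerate(image_embs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:k]]

# Toy embeddings standing in for four pre-captured images (hypothetical values).
image_embs = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.0],
    [0.0, 0.0, 1.0],
]
query_emb = [0.9, 0.1, 0.0]

selected = top_k_views(query_emb, image_embs, k=2)
print(selected)
```

In this sketch the selected indices would then serve as starting camera poses for the viewpoint adjustment that the abstract mentions.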
Related papers
- GVSynergy-Det: Synergistic Gaussian-Voxel Representations for Multi-View 3D Object Detection [18.809986709717446]
Image-based 3D object detection aims to identify and localize objects in 3D space using only RGB images. Existing image-based approaches face two critical challenges: methods achieving high accuracy typically require dense 3D supervision. We present GVSynergy-Det, a novel framework that enhances 3D detection through synergistic Gaussian-Voxel representation learning.
arXiv Detail & Related papers (2025-12-29T03:34:39Z)
- A Survey on 3D Gaussian Splatting Applications: Segmentation, Editing, and Generation [66.62489208150681]
3D Gaussian Splatting (3DGS) has emerged as a powerful alternative to Neural Radiance Fields (NeRF) for 3D scene representation. This survey provides a comprehensive overview of recent progress in 3DGS applications.
arXiv Detail & Related papers (2025-08-13T17:44:39Z)
- ReferSplat: Referring Segmentation in 3D Gaussian Splatting [60.73702075842278]
The Referring 3D Gaussian Splatting (R3DGS) task aims to segment target objects in a 3D Gaussian scene based on natural language descriptions. To address these challenges, we propose ReferSplat, a framework that explicitly models 3D Gaussian points with natural language expressions.
arXiv Detail & Related papers (2025-08-11T17:59:30Z)
- SceneSplat++: A Large Dataset and Comprehensive Benchmark for Language Gaussian Splatting [104.83629308412958]
3D Gaussian Splatting (3DGS) serves as a highly performant and efficient encoding of scene geometry, appearance, and semantics. We propose the first large-scale benchmark that systematically assesses three groups of methods directly in 3D space. Results demonstrate a clear advantage of the generalizable paradigm, particularly in relaxing the scene-specific limitation.
arXiv Detail & Related papers (2025-06-10T11:52:45Z)
- Occam's LGS: An Efficient Approach for Language Gaussian Splatting [57.00354758206751]
We show that the complicated pipelines for language 3D Gaussian Splatting are simply unnecessary. We apply Occam's razor to the task at hand, leading to a highly efficient weighted multi-view feature aggregation technique.
arXiv Detail & Related papers (2024-12-02T18:50:37Z)
- GOI: Find 3D Gaussians of Interest with an Optimizable Open-vocabulary Semantic-space Hyperplane [53.388937705785025]
3D open-vocabulary scene understanding is crucial for advancing augmented reality and robotic applications.
We introduce GOI, a framework that integrates semantic features from 2D vision-language foundation models into 3D Gaussian Splatting (3DGS).
Our method treats the feature selection process as a hyperplane division within the feature space, retaining only features that are highly relevant to the query.
arXiv Detail & Related papers (2024-05-27T18:57:18Z)
- AbsGS: Recovering Fine Details for 3D Gaussian Splatting [10.458776364195796]
The 3D Gaussian Splatting (3D-GS) technique couples 3D Gaussian primitives with differentiable rasterization to achieve high-quality novel view synthesis results.
However, 3D-GS frequently suffers from an over-reconstruction issue in intricate scenes containing high-frequency details, leading to blurry rendered images.
We present a comprehensive analysis of the cause of the aforementioned artifacts, namely gradient collision.
Our strategy efficiently identifies large Gaussians in over-reconstructed regions, and recovers fine details by splitting.
arXiv Detail & Related papers (2024-04-16T11:44:12Z)
- Semantic Gaussians: Open-Vocabulary Scene Understanding with 3D Gaussian Splatting [27.974762304763694]
We introduce Semantic Gaussians, a novel open-vocabulary scene understanding approach based on 3D Gaussian Splatting.
Unlike existing methods, we design a versatile projection approach that maps various 2D semantic features into a novel semantic component of 3D Gaussians.
We build a 3D semantic network that directly predicts the semantic component from raw 3D Gaussians for fast inference.
arXiv Detail & Related papers (2024-03-22T21:28:19Z)
- SAGD: Boundary-Enhanced Segment Anything in 3D Gaussian via Gaussian Decomposition [66.56357905500512]
3D Gaussian Splatting has emerged as an alternative 3D representation for novel view synthesis. We propose SAGD, a conceptually simple yet effective boundary-enhanced segmentation pipeline for 3D-GS. Our approach achieves high-quality 3D segmentation without rough boundary issues, which can be easily applied to other scene editing tasks.
arXiv Detail & Related papers (2024-01-31T14:19:03Z)
- A Survey on 3D Gaussian Splatting [61.50539646390613]
3D Gaussian splatting (GS) has emerged as a transformative technique in radiance fields. We provide the first systematic overview of the recent developments and critical contributions in 3D GS. By enabling unprecedented rendering speed, 3D GS opens up a plethora of applications, ranging from virtual reality to interactive media and beyond.
arXiv Detail & Related papers (2024-01-08T13:42:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.