Go-SLAM: Grounded Object Segmentation and Localization with Gaussian Splatting SLAM
- URL: http://arxiv.org/abs/2409.16944v1
- Date: Wed, 25 Sep 2024 13:56:08 GMT
- Title: Go-SLAM: Grounded Object Segmentation and Localization with Gaussian Splatting SLAM
- Authors: Phu Pham, Dipam Patel, Damon Conover, Aniket Bera,
- Abstract summary: Go-SLAM is a novel framework that utilizes 3D Gaussian Splatting SLAM to reconstruct dynamic environments.
Our system facilitates open-vocabulary querying, allowing users to locate objects using natural language descriptions.
- Score: 12.934788858420752
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce Go-SLAM, a novel framework that utilizes 3D Gaussian Splatting SLAM to reconstruct dynamic environments while embedding object-level information within the scene representations. This framework employs advanced object segmentation techniques, assigning a unique identifier to each Gaussian splat that corresponds to the object it represents. Consequently, our system facilitates open-vocabulary querying, allowing users to locate objects using natural language descriptions. Furthermore, the framework features an optimal path generation module that calculates efficient navigation paths for robots toward queried objects, considering obstacles and environmental uncertainties. Comprehensive evaluations in various scene settings demonstrate the effectiveness of our approach in delivering high-fidelity scene reconstructions, precise object segmentation, flexible object querying, and efficient robot path planning. This work represents an additional step forward in bridging the gap between 3D scene reconstruction, semantic object understanding, and real-time environment interactions.
Related papers
- DENSER: 3D Gaussians Splatting for Scene Reconstruction of Dynamic Urban Environments [0.0]
We propose DENSER, a framework that significantly enhances the representation of dynamic objects.
The proposed approach significantly outperforms state-of-the-art methods by a wide margin.
arXiv Detail & Related papers (2024-09-16T07:11:58Z) - Object-Oriented Material Classification and 3D Clustering for Improved Semantic Perception and Mapping in Mobile Robots [6.395242048226456]
We propose a complement-aware deep learning approach for RGB-D-based material classification built on top of an object-oriented pipeline.
We show a significant improvement in material classification and 3D clustering accuracy compared to state-of-the-art approaches for 3D semantic scene mapping.
arXiv Detail & Related papers (2024-07-08T16:25:01Z) - Cognitive Planning for Object Goal Navigation using Generative AI Models [0.979851640406258]
We present a novel framework for solving the object goal navigation problem that generates efficient exploration strategies.
Our approach enables a robot to navigate unfamiliar environments by leveraging Large Language Models (LLMs) and Large Vision-Language Models (LVLMs)
arXiv Detail & Related papers (2024-03-30T10:54:59Z) - NeuSE: Neural SE(3)-Equivariant Embedding for Consistent Spatial
Understanding with Objects [53.111397800478294]
We present NeuSE, a novel Neural SE(3)-Equivariant Embedding for objects.
NeuSE serves as a compact point cloud surrogate for complete object models.
Our proposed SLAM paradigm, using NeuSE for object shape and pose characterization, can operate independently or in conjunction with typical SLAM systems.
arXiv Detail & Related papers (2023-03-13T17:30:43Z) - Object Scene Representation Transformer [56.40544849442227]
We introduce Object Scene Representation Transformer (OSRT), a 3D-centric model in which individual object representations naturally emerge through novel view synthesis.
OSRT scales to significantly more complex scenes with larger diversity of objects and backgrounds than existing methods.
It is multiple orders of magnitude faster at compositional rendering thanks to its light field parametrization and the novel Slot Mixer decoder.
arXiv Detail & Related papers (2022-06-14T15:40:47Z) - ImpDet: Exploring Implicit Fields for 3D Object Detection [74.63774221984725]
We introduce a new perspective that views bounding box regression as an implicit function.
This leads to our proposed framework, termed Implicit Detection or ImpDet.
Our ImpDet assigns specific values to points in different local 3D spaces, thereby high-quality boundaries can be generated.
arXiv Detail & Related papers (2022-03-31T17:52:12Z) - RICE: Refining Instance Masks in Cluttered Environments with Graph
Neural Networks [53.15260967235835]
We propose a novel framework that refines the output of such methods by utilizing a graph-based representation of instance masks.
We train deep networks capable of sampling smart perturbations to the segmentations, and a graph neural network, which can encode relations between objects, to evaluate the segmentations.
We demonstrate an application that uses uncertainty estimates generated by our method to guide a manipulator, leading to efficient understanding of cluttered scenes.
arXiv Detail & Related papers (2021-06-29T20:29:29Z) - Reconstructing Interactive 3D Scenes by Panoptic Mapping and CAD Model
Alignments [81.38641691636847]
We rethink the problem of scene reconstruction from an embodied agent's perspective.
We reconstruct an interactive scene using RGB-D data stream.
This reconstructed scene replaces the object meshes in the dense panoptic map with part-based articulated CAD models.
arXiv Detail & Related papers (2021-03-30T05:56:58Z) - Compositional Scalable Object SLAM [29.349829139625403]
We present a fast, scalable, and accurate Simultaneous Localization and Mapping (SLAM) system that represents indoor scenes as a graph of objects.
We show that a compositional scalable object mapping formulation is amenable to a robust SLAM solution for drift-free large scale indoor reconstruction.
arXiv Detail & Related papers (2020-11-05T04:46:25Z) - Object Goal Navigation using Goal-Oriented Semantic Exploration [98.14078233526476]
This work studies the problem of object goal navigation which involves navigating to an instance of the given object category in unseen environments.
We propose a modular system called, Goal-Oriented Semantic Exploration' which builds an episodic semantic map and uses it to explore the environment efficiently.
arXiv Detail & Related papers (2020-07-01T17:52:32Z) - Where Does It End? -- Reasoning About Hidden Surfaces by Object
Intersection Constraints [6.653734987585243]
Co-Section is an optimization-based approach to 3D dynamic scene reconstruction.
An object-level dynamic SLAM detects segments, tracks and maps dynamic objects in the scene.
arXiv Detail & Related papers (2020-04-09T16:18:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.