GaussianGraph: 3D Gaussian-based Scene Graph Generation for Open-world Scene Understanding
- URL: http://arxiv.org/abs/2503.04034v1
- Date: Thu, 06 Mar 2025 02:36:59 GMT
- Title: GaussianGraph: 3D Gaussian-based Scene Graph Generation for Open-world Scene Understanding
- Authors: Xihan Wang, Dianyi Yang, Yu Gao, Yufeng Yue, Yi Yang, Mengyin Fu
- Abstract summary: We propose a novel framework that enhances 3DGS-based scene understanding by integrating semantic clustering and scene graph generation. We introduce a "Control-Follow" clustering strategy, which dynamically adapts to scene scale and feature distribution, avoiding feature compression. We enrich scene representation by integrating object attributes and spatial relations extracted from 2D foundation models.
- Score: 20.578106363482018
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advancements in 3D Gaussian Splatting (3DGS) have significantly improved semantic scene understanding, enabling natural language queries to localize objects within a scene. However, existing methods primarily focus on embedding compressed CLIP features into 3D Gaussians; they suffer from low object segmentation accuracy and lack spatial reasoning capabilities. To address these limitations, we propose GaussianGraph, a novel framework that enhances 3DGS-based scene understanding by integrating adaptive semantic clustering and scene graph generation. We introduce a "Control-Follow" clustering strategy, which dynamically adapts to scene scale and feature distribution, avoiding feature compression and significantly improving segmentation accuracy. Additionally, we enrich scene representation by integrating object attributes and spatial relations extracted from 2D foundation models. To address inaccuracies in spatial relationships, we propose 3D correction modules that filter implausible relations through spatial consistency verification, ensuring reliable scene graph construction. Extensive experiments on three datasets demonstrate that GaussianGraph outperforms state-of-the-art methods in both semantic segmentation and object grounding tasks, providing a robust solution for complex scene understanding and interaction.
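The idea of filtering implausible relations through spatial consistency verification can be illustrated with a minimal sketch. This is not the authors' correction module: the `Box` type, the `is_plausible` and `filter_relations` helpers, the tolerance `tol`, the relation names, and the assumption that the z axis points up are all hypothetical choices made for illustration. The sketch keeps a candidate (subject, relation, object) triple only if the axis-aligned 3D bounding boxes of the two objects are geometrically compatible with the stated relation.

```python
from dataclasses import dataclass


@dataclass
class Box:
    # Axis-aligned 3D bounding box (hypothetical type); z is assumed to point up.
    min_xyz: tuple
    max_xyz: tuple


def overlaps_xy(a, b):
    # True when the two boxes overlap in the horizontal (x, y) plane.
    return all(
        a.min_xyz[i] <= b.max_xyz[i] and b.min_xyz[i] <= a.max_xyz[i]
        for i in (0, 1)
    )


def is_plausible(relation, subj, obj, tol=0.05):
    # Illustrative geometric checks, not the rules used in the paper.
    if relation == "on":
        # The subject's bottom should rest near the object's top.
        return overlaps_xy(subj, obj) and abs(subj.min_xyz[2] - obj.max_xyz[2]) <= tol
    if relation == "under":
        # The subject's top should not rise above the object's bottom.
        return overlaps_xy(subj, obj) and subj.max_xyz[2] <= obj.min_xyz[2] + tol
    return True  # relations without a check are kept unchanged


def filter_relations(triples, boxes, tol=0.05):
    # Keep only the triples whose relation passes the geometric check.
    return [(s, r, o) for (s, r, o) in triples if is_plausible(r, boxes[s], boxes[o], tol)]


boxes = {
    "table": Box((0.0, 0.0, 0.0), (2.0, 1.0, 0.8)),
    "cup": Box((0.5, 0.4, 0.8), (0.6, 0.5, 0.9)),
    "lamp": Box((5.0, 5.0, 0.0), (5.3, 5.3, 1.5)),
}
candidates = [("cup", "on", "table"), ("lamp", "on", "table"), ("table", "under", "cup")]
kept = filter_relations(candidates, boxes)
print(kept)
```

Here the lamp stands far from the table in the horizontal plane, so a 2D model's claim that it is "on" the table is discarded, while the two relations that agree with the 3D geometry survive.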
Related papers
- CAGS: Open-Vocabulary 3D Scene Understanding with Context-Aware Gaussian Splatting [18.581169318975046]
3D Gaussian Splatting (3DGS) offers a powerful representation for scene reconstruction, but cross-view granularity inconsistency is a problem.
We propose Context-Aware Gaussian Splatting (CAGS), a novel framework that addresses this challenge by incorporating spatial context into 3DGS.
CAGS significantly improves 3D instance segmentation and reduces fragmentation errors on datasets like LERF-OVS and ScanNet.
arXiv Detail & Related papers (2025-04-16T09:20:03Z)
- COB-GS: Clear Object Boundaries in 3DGS Segmentation Based on Boundary-Adaptive Gaussian Splitting [67.03992455145325]
3D segmentation based on 3D Gaussian Splatting (3DGS) struggles with accurately delineating object boundaries.
We introduce Clear Object Boundaries for 3DGS (COB-GS), which aims to improve segmentation accuracy.
For semantic guidance, we introduce a boundary-adaptive Gaussian splitting technique.
For the visual optimization, we rectify the degraded texture of the 3DGS scene.
arXiv Detail & Related papers (2025-03-25T08:31:43Z)
- OpenGS-SLAM: Open-Set Dense Semantic SLAM with 3D Gaussian Splatting for Object-Level Scene Understanding [20.578106363482018]
OpenGS-SLAM is an innovative framework that utilizes 3D Gaussian representation to perform dense semantic SLAM in open-set environments.
Our system integrates explicit semantic labels derived from 2D models into the 3D Gaussian framework, facilitating robust 3D object-level understanding.
Our method achieves 10 times faster semantic rendering and 2 times lower storage costs compared to existing methods.
arXiv Detail & Related papers (2025-03-03T15:23:21Z)
- TSGaussian: Semantic and Depth-Guided Target-Specific Gaussian Splatting from Sparse Views [18.050257821756148]
TSGaussian is a novel framework that combines semantic constraints with depth priors to avoid geometry degradation in novel view synthesis tasks.
Our approach prioritizes computational resources on designated targets while minimizing background allocation.
Extensive experiments demonstrate that TSGaussian outperforms state-of-the-art methods on three standard datasets.
arXiv Detail & Related papers (2024-12-13T11:26:38Z)
- GradiSeg: Gradient-Guided Gaussian Segmentation with Enhanced 3D Boundary Precision [11.99904956714193]
We propose a novel 3DGS-based framework named GradiSeg, incorporating Identity sification to construct a deeper semantic understanding of scenes.
Our approach introduces two key modules: Identity Gradient Guided Densification (IGD) and Local Adaptive K-Nearest Neighbors (LA-KNN).
Results show that GradiSeg effectively addresses boundary-related issues, significantly improving segmentation accuracy without compromising scene reconstruction quality.
arXiv Detail & Related papers (2024-11-30T08:07:37Z)
- Bootstraping Clustering of Gaussians for View-consistent 3D Scene Understanding [59.51535163599723]
FreeGS is an unsupervised semantic-embedded 3DGS framework that achieves view-consistent 3D scene understanding without the need for 2D labels.
We show that FreeGS performs comparably to state-of-the-art methods while avoiding the complex data preprocessing workload.
arXiv Detail & Related papers (2024-11-29T08:52:32Z)
- InstanceGaussian: Appearance-Semantic Joint Gaussian Representation for 3D Instance-Level Perception [17.530797215534456]
3D scene understanding has become an essential area of research with applications in autonomous driving, robotics, and augmented reality.
We propose InstanceGaussian, a method that jointly learns appearance and semantic features while adaptively aggregating instances.
Our approach achieves state-of-the-art performance in category-agnostic, open-vocabulary 3D point-level segmentation.
arXiv Detail & Related papers (2024-11-28T16:08:36Z)
- ShapeSplat: A Large-scale Dataset of Gaussian Splats and Their Self-Supervised Pretraining [104.34751911174196]
We build a large-scale dataset of 3DGS using ShapeNet and ModelNet datasets.
Our dataset ShapeSplat consists of 65K objects from 87 unique categories.
We introduce Gaussian-MAE, which highlights the unique benefits of representation learning from Gaussian parameters.
arXiv Detail & Related papers (2024-08-20T14:49:14Z)
- GOI: Find 3D Gaussians of Interest with an Optimizable Open-vocabulary Semantic-space Hyperplane [53.388937705785025]
3D open-vocabulary scene understanding is crucial for advancing augmented reality and robotic applications.
We introduce GOI, a framework that integrates semantic features from 2D vision-language foundation models into 3D Gaussian Splatting (3DGS).
Our method treats the feature selection process as a hyperplane division within the feature space, retaining only features that are highly relevant to the query.
arXiv Detail & Related papers (2024-05-27T18:57:18Z)
- GaussianFormer: Scene as Gaussians for Vision-Based 3D Semantic Occupancy Prediction [70.65250036489128]
3D semantic occupancy prediction aims to obtain 3D fine-grained geometry and semantics of the surrounding scene.
We propose an object-centric representation to describe 3D scenes with sparse 3D semantic Gaussians.
GaussianFormer achieves comparable performance with state-of-the-art methods with only 17.8% - 24.8% of their memory consumption.
arXiv Detail & Related papers (2024-05-27T17:59:51Z)
- Semantic Gaussians: Open-Vocabulary Scene Understanding with 3D Gaussian Splatting [27.974762304763694]
We introduce Semantic Gaussians, a novel open-vocabulary scene understanding approach based on 3D Gaussian Splatting.
Unlike existing methods, we design a versatile projection approach that maps various 2D semantic features into a novel semantic component of 3D Gaussians.
We build a 3D semantic network that directly predicts the semantic component from raw 3D Gaussians for fast inference.
arXiv Detail & Related papers (2024-03-22T21:28:19Z)
- SAGD: Boundary-Enhanced Segment Anything in 3D Gaussian via Gaussian Decomposition [66.56357905500512]
3D Gaussian Splatting has emerged as an alternative 3D representation for novel view synthesis.
We propose SAGD, a conceptually simple yet effective boundary-enhanced segmentation pipeline for 3D-GS.
Our approach achieves high-quality 3D segmentation without rough boundary issues, which can be easily applied to other scene editing tasks.
arXiv Detail & Related papers (2024-01-31T14:19:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.