GaussianCut: Interactive segmentation via graph cut for 3D Gaussian Splatting
- URL: http://arxiv.org/abs/2411.07555v1
- Date: Tue, 12 Nov 2024 05:09:42 GMT
- Title: GaussianCut: Interactive segmentation via graph cut for 3D Gaussian Splatting
- Authors: Umangi Jain, Ashkan Mirzaei, Igor Gilitschenski
- Abstract summary: We introduce GaussianCut, a new method for interactive multiview segmentation of scenes represented as 3D Gaussians.
Our approach allows for selecting the objects to be segmented by interacting with a single view.
It accepts intuitive user input, such as point clicks, coarse scribbles, or text.
- Abstract: We introduce GaussianCut, a new method for interactive multiview segmentation of scenes represented as 3D Gaussians. Our approach allows for selecting the objects to be segmented by interacting with a single view. It accepts intuitive user input, such as point clicks, coarse scribbles, or text. Using 3D Gaussian Splatting (3DGS) as the underlying scene representation simplifies the extraction of objects of interest which are considered to be a subset of the scene's Gaussians. Our key idea is to represent the scene as a graph and use the graph-cut algorithm to minimize an energy function to effectively partition the Gaussians into foreground and background. To achieve this, we construct a graph based on scene Gaussians and devise a segmentation-aligned energy function on the graph to combine user inputs with scene properties. To obtain an initial coarse segmentation, we leverage 2D image/video segmentation models and further refine these coarse estimates using our graph construction. Our empirical evaluations show the adaptability of GaussianCut across a diverse set of scenes. GaussianCut achieves competitive performance with state-of-the-art approaches for 3D segmentation without requiring any additional segmentation-aware training.
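Below is a minimal sketch of the graph-cut step described in the abstract, assuming the PyMaxflow library for min-cut and a k-nearest-neighbor graph over Gaussian centers. The neighborhood size, the distance-based edge weights, and the helper name segment_gaussians are illustrative assumptions; the paper's actual energy combines user inputs with scene properties not reproduced here.

```python
# Sketch only: partitions Gaussians into foreground/background with a
# standard min-cut energy. Requires `pip install PyMaxflow scipy numpy`.
import numpy as np
import maxflow
from scipy.spatial import cKDTree

def segment_gaussians(centers, p_fg, k=8, lam=1.0, sigma=0.1):
    """centers: (N, 3) Gaussian means.
    p_fg: (N,) coarse foreground probabilities, e.g. lifted from a 2D
          image/video segmentation model (assumed given here).
    Returns a boolean foreground mask of shape (N,)."""
    n = centers.shape[0]
    g = maxflow.Graph[float]()
    nodes = g.add_nodes(n)
    eps = 1e-6

    # Unary terms: source capacity is the cost of a background label,
    # sink capacity the cost of a foreground label.
    for i in range(n):
        g.add_tedge(nodes[i],
                    -np.log(1.0 - p_fg[i] + eps),  # paid if i ends up background
                    -np.log(p_fg[i] + eps))        # paid if i ends up foreground

    # Pairwise terms on a k-NN graph: nearby Gaussians should agree.
    # Mutual neighbors may add an edge twice; acceptable for a sketch.
    _, nbrs = cKDTree(centers).query(centers, k=k + 1)
    for i in range(n):
        for j in nbrs[i, 1:]:  # index 0 is the point itself
            d2 = np.sum((centers[i] - centers[j]) ** 2)
            w = lam * np.exp(-d2 / (2.0 * sigma ** 2))
            g.add_edge(nodes[i], nodes[j], w, w)

    g.maxflow()
    # Source segment (0) is foreground under the convention above.
    return np.array([g.get_segment(nodes[i]) == 0 for i in range(n)])
```

The pure-Python loops are for readability; a practical implementation would batch the edge construction, and the paper's graph encodes more than center distance alone.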
Related papers
- ShapeSplat: A Large-scale Dataset of Gaussian Splats and Their Self-Supervised Pretraining [104.34751911174196]
We build a large-scale dataset of 3DGS using ShapeNet and ModelNet datasets.
Our dataset ShapeSplat consists of 65K objects from 87 unique categories.
We introduce Gaussian-MAE, which highlights the unique benefits of representation learning from Gaussian parameters.
arXiv Detail & Related papers (2024-08-20T14:49:14Z)
- Dynamic Gaussian Marbles for Novel View Synthesis of Casual Monocular Videos [58.22272760132996]
We show that existing 4D Gaussian methods dramatically fail in this setup because the monocular setting is underconstrained.
We propose Dynamic Gaussian Marbles, which consist of three core modifications that target the difficulties of the monocular setting.
We evaluate on the Nvidia Dynamic Scenes dataset and the DyCheck iPhone dataset, and show that Gaussian Marbles significantly outperforms other Gaussian baselines in quality.
arXiv Detail & Related papers (2024-06-26T19:37:07Z)
- GaussianFormer: Scene as Gaussians for Vision-Based 3D Semantic Occupancy Prediction [70.65250036489128]
3D semantic occupancy prediction aims to obtain 3D fine-grained geometry and semantics of the surrounding scene.
We propose an object-centric representation to describe 3D scenes with sparse 3D semantic Gaussians.
GaussianFormer achieves comparable performance with state-of-the-art methods with only 17.8% - 24.8% of their memory consumption.
arXiv Detail & Related papers (2024-05-27T17:59:51Z)
- Contrastive Gaussian Clustering: Weakly Supervised 3D Scene Segmentation [14.967600484476385]
We introduce Contrastive Gaussian Clustering, a novel approach capable of providing segmentation masks from any viewpoint.
Our method can be trained on inconsistent 2D segmentation masks, and still learn to generate segmentation masks consistent across all views.
The resulting model is extremely accurate, improving the IoU accuracy of the predicted masks by +8% over the state of the art.
arXiv Detail & Related papers (2024-04-19T10:47:53Z)
- Learning Segmented 3D Gaussians via Efficient Feature Unprojection for Zero-shot Neural Scene Segmentation [16.57158278095853]
Zero-shot neural scene segmentation serves as an effective approach to scene understanding.
Existing models, especially the efficient 3D Gaussian-based methods, struggle to produce compact segmentation results.
Our work proposes the Feature Unprojection and Fusion module as the segmentation field.
We show that our model surpasses baselines on zero-shot semantic segmentation task, improving by 10% mIoU over the best baseline.
arXiv Detail & Related papers (2024-01-11T14:05:01Z)
- 2D-Guided 3D Gaussian Segmentation [15.139488857163064]
This paper introduces a 3D Gaussian segmentation method implemented with 2D segmentation as supervision.
This approach uses input 2D segmentation maps to guide the learning of the added 3D Gaussian semantic information.
Experiments show that our method can achieve comparable performance on mIoU and mAcc for multi-object segmentation.
arXiv Detail & Related papers (2023-12-26T13:28:21Z)
- Segment Any 3D Gaussians [85.93694310363325]
This paper presents SAGA, a highly efficient 3D promptable segmentation method based on 3D Gaussian Splatting (3D-GS).
Given 2D visual prompts as input, SAGA can segment the corresponding 3D target represented by 3D Gaussians within 4 ms.
We show that SAGA achieves real-time multi-granularity segmentation with quality comparable to state-of-the-art methods.
arXiv Detail & Related papers (2023-12-01T17:15:24Z)
- Gaussian Grouping: Segment and Edit Anything in 3D Scenes [65.49196142146292]
We propose Gaussian Grouping, which extends Gaussian Splatting to jointly reconstruct and segment anything in open-world 3D scenes.
Compared to the implicit NeRF representation, we show that the grouped 3D Gaussians can reconstruct, segment and edit anything in 3D with high visual quality, fine granularity and efficiency.
arXiv Detail & Related papers (2023-12-01T17:09:31Z)
- SceneGraphFusion: Incremental 3D Scene Graph Prediction from RGB-D Sequences [76.28527350263012]
We propose a method to incrementally build up semantic scene graphs from a 3D environment given a sequence of RGB-D frames.
We aggregate PointNet features from primitive scene components by means of a graph neural network.
Our approach outperforms 3D scene graph prediction methods by a large margin and its accuracy is on par with other 3D semantic and panoptic segmentation methods while running at 35 Hz.
arXiv Detail & Related papers (2021-03-27T13:00:36Z)