Contrastive Gaussian Clustering: Weakly Supervised 3D Scene Segmentation
- URL: http://arxiv.org/abs/2404.12784v1
- Date: Fri, 19 Apr 2024 10:47:53 GMT
- Title: Contrastive Gaussian Clustering: Weakly Supervised 3D Scene Segmentation
- Authors: Myrna C. Silva, Mahtab Dahaghin, Matteo Toso, Alessio Del Bue
- Abstract summary: We introduce Contrastive Gaussian Clustering, a novel approach capable of providing segmentation masks from any viewpoint.
Our method can be trained on inconsistent 2D segmentation masks, and still learn to generate segmentation masks consistent across all views.
The resulting model is extremely accurate, improving the IoU accuracy of the predicted masks by $+8\%$ over the state of the art.
- Score: 14.967600484476385
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We introduce Contrastive Gaussian Clustering, a novel approach capable of providing segmentation masks from any viewpoint and of enabling 3D segmentation of the scene. Recent works in novel-view synthesis have shown how to model the appearance of a scene via a cloud of 3D Gaussians, and how to generate accurate images from a given viewpoint by projecting the Gaussians onto it and then $\alpha$-blending their colors. Following this example, we train a model that also includes a segmentation feature vector for each Gaussian. These vectors can then be used for 3D scene segmentation, by clustering Gaussians according to their feature vectors, and to generate 2D segmentation masks, by projecting the Gaussians onto a plane and $\alpha$-blending their segmentation features. Using a combination of contrastive learning and spatial regularization, our method can be trained on inconsistent 2D segmentation masks and still learn to generate segmentation masks that are consistent across all views. Moreover, the resulting model is extremely accurate, improving the IoU accuracy of the predicted masks by $+8\%$ over the state of the art. Code and trained models will be released soon.
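The two mechanisms named in the abstract, $\alpha$-blending of per-Gaussian segmentation features and contrastive training on 2D masks, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `alpha_blend` and `contrastive_loss`, the `temperature` parameter, and the loss itself (a generic supervised contrastive loss) are assumptions made for illustration, and the spatial regularization term is omitted. The blending follows the standard front-to-back compositing rule used in Gaussian Splatting, $F = \sum_i f_i \alpha_i \prod_{j<i} (1 - \alpha_j)$.

```python
import numpy as np


def alpha_blend(features, alphas):
    """Composite per-Gaussian feature vectors for a single pixel.

    features: (N, D) feature vectors of the Gaussians hit by the pixel ray,
              sorted nearest first (hypothetical input layout).
    alphas:   (N,) opacity of each Gaussian at this pixel, in [0, 1].
    """
    features = np.asarray(features, dtype=float)
    alphas = np.asarray(alphas, dtype=float)
    # Transmittance reaching Gaussian i: prod_{j<i} (1 - alpha_j).
    transmittance = np.concatenate(([1.0], np.cumprod(1.0 - alphas)[:-1]))
    weights = alphas * transmittance
    return weights @ features


def contrastive_loss(pixel_features, mask_ids, temperature=0.1):
    """Generic supervised contrastive loss over rendered pixel features.

    Pixels that share a 2D mask id in the same view are treated as
    positives; all other pixels act as negatives. This is an assumed,
    textbook formulation, not the loss from the paper.
    """
    f = np.asarray(pixel_features, dtype=float)
    ids = np.asarray(mask_ids)
    f = f / np.linalg.norm(f, axis=1, keepdims=True)
    sim = f @ f.T / temperature

    n = len(ids)
    not_self = ~np.eye(n, dtype=bool)
    same_mask = (ids[:, None] == ids[None, :]) & not_self

    # Log-softmax over all *other* pixels (self-similarity is excluded).
    sim = np.where(not_self, sim, -np.inf)
    sim_max = sim.max(axis=1, keepdims=True)
    log_denom = sim_max + np.log(np.exp(sim - sim_max).sum(axis=1, keepdims=True))
    log_prob = sim - log_denom

    # Average log-probability of the positives, for pixels that have any.
    pos_counts = same_mask.sum(axis=1)
    valid = pos_counts > 0
    pos_log_prob = np.where(same_mask, log_prob, 0.0).sum(axis=1)
    return -(pos_log_prob[valid] / pos_counts[valid]).mean()


if __name__ == "__main__":
    # Two Gaussians along one ray, each with opacity 0.5.
    print(alpha_blend([[1.0, 0.0], [0.0, 1.0]], [0.5, 0.5]))

    feats = np.array([[1.0, 0.0], [1.0, 0.01], [0.0, 1.0], [0.01, 1.0]])
    print(contrastive_loss(feats, [0, 0, 1, 1]))  # features agree with masks
    print(contrastive_loss(feats, [0, 1, 0, 1]))  # features disagree with masks
```

In this sketch the loss is low when pixels from the same mask have similar blended features and high otherwise. Because mask ids are only compared within a single view, labels do not need to match across views, which is one plausible reading of how training on inconsistent 2D masks could work.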
Related papers
- Trace3D: Consistent Segmentation Lifting via Gaussian Instance Tracing [27.24794829116753]
We address the challenge of lifting 2D visual segmentation to 3D in Gaussian Splatting. Existing methods often suffer from inconsistent 2D masks across viewpoints and produce noisy segmentation boundaries. We introduce Gaussian Instance Tracing (GIT), which augments the standard Gaussian representation with an instance weight matrix across input views.
arXiv Detail & Related papers (2025-08-05T08:54:17Z) - Seg-Wild: Interactive Segmentation based on 3D Gaussian Splatting for Unconstrained Image Collections [16.91513979037418]
We propose Seg-Wild, an interactive segmentation method based on 3D Gaussian Splatting for unconstrained image collections. We integrate multi-dimensional feature embeddings for each 3D Gaussian and calculate the feature similarity between the feature embeddings and the segmentation target. We project the 3D Gaussians onto a 2D plane and calculate the ratio of 3D Gaussians that need to be cut using the SAM mask.
arXiv Detail & Related papers (2025-07-10T03:26:17Z) - GaussianFormer-2: Probabilistic Gaussian Superposition for Efficient 3D Occupancy Prediction [55.60972844777044]
3D semantic occupancy prediction is an important task for robust vision-centric autonomous driving. Most existing methods leverage dense grid-based scene representations, overlooking the spatial sparsity of the driving scenes. We propose a probabilistic Gaussian superposition model which interprets each Gaussian as a probability distribution of its neighborhood being occupied.
arXiv Detail & Related papers (2024-12-05T17:59:58Z) - NovelGS: Consistent Novel-view Denoising via Large Gaussian Reconstruction Model [57.92709692193132]
NovelGS is a diffusion model for Gaussian Splatting given sparse-view images.
We leverage the novel view denoising through a transformer-based network to generate 3D Gaussians.
arXiv Detail & Related papers (2024-11-25T07:57:17Z) - GaussianCut: Interactive segmentation via graph cut for 3D Gaussian Splatting [7.392798832833857]
We introduce GaussianCut, a new method for interactive multiview segmentation of scenes represented as 3D Gaussians.
Our approach allows for selecting the objects to be segmented by interacting with a single view.
It accepts intuitive user input, such as point clicks, coarse scribbles, or text.
arXiv Detail & Related papers (2024-11-12T05:09:42Z) - No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images [100.80376573969045]
NoPoSplat is a feed-forward model capable of reconstructing 3D scenes parameterized by 3D Gaussians from multi-view images.
Our model achieves real-time 3D Gaussian reconstruction during inference.
This work makes significant advances in pose-free generalizable 3D reconstruction and demonstrates its applicability to real-world scenarios.
arXiv Detail & Related papers (2024-10-31T17:58:22Z) - LUDVIG: Learning-free Uplifting of 2D Visual features to Gaussian Splatting scenes [39.687526103092445]
We show that a simple yet effective aggregation technique yields excellent results.
We extend this method to generic DINOv2 features, integrating 3D scene geometry through graph diffusion, and achieve competitive segmentation results.
arXiv Detail & Related papers (2024-10-18T13:44:29Z) - Gradient-Driven 3D Segmentation and Affordance Transfer in Gaussian Splatting Using 2D Masks [6.647959476396794]
3D Gaussian Splatting has emerged as a powerful 3D scene representation technique, capturing fine details with high efficiency.
In this paper, we introduce a novel voting-based method that extends 2D segmentation models to 3D Gaussian splats.
The robust yet straightforward mathematical formulation underlying this approach makes it a highly effective tool for numerous downstream applications.
arXiv Detail & Related papers (2024-09-18T03:45:44Z) - ShapeSplat: A Large-scale Dataset of Gaussian Splats and Their Self-Supervised Pretraining [104.34751911174196]
We build a large-scale dataset of 3DGS using ShapeNet and ModelNet datasets.
Our dataset ShapeSplat consists of 65K objects from 87 unique categories.
We introduce Gaussian-MAE, which highlights the unique benefits of representation learning from Gaussian parameters.
arXiv Detail & Related papers (2024-08-20T14:49:14Z) - GaussianFormer: Scene as Gaussians for Vision-Based 3D Semantic Occupancy Prediction [70.65250036489128]
3D semantic occupancy prediction aims to obtain 3D fine-grained geometry and semantics of the surrounding scene.
We propose an object-centric representation to describe 3D scenes with sparse 3D semantic Gaussians.
GaussianFormer achieves comparable performance with state-of-the-art methods with only 17.8% - 24.8% of their memory consumption.
arXiv Detail & Related papers (2024-05-27T17:59:51Z) - Learning Segmented 3D Gaussians via Efficient Feature Unprojection for Zero-shot Neural Scene Segmentation [16.57158278095853]
Zero-shot neural scene segmentation serves as an effective way for scene understanding.
Existing models, especially the efficient 3D Gaussian-based methods, struggle to produce compact segmentation results.
Our work proposes the Feature Unprojection and Fusion module as the segmentation field.
We show that our model surpasses baselines on zero-shot semantic segmentation task, improving by 10% mIoU over the best baseline.
arXiv Detail & Related papers (2024-01-11T14:05:01Z) - 2D-Guided 3D Gaussian Segmentation [15.139488857163064]
This paper introduces a 3D Gaussian segmentation method implemented with 2D segmentation as supervision.
This approach uses input 2D segmentation maps to guide the learning of the added 3D Gaussian semantic information.
Experiments show that our method can achieve comparable performances on mIOU and mAcc for multi-object segmentation.
arXiv Detail & Related papers (2023-12-26T13:28:21Z) - SplatMesh: Interactive 3D Segmentation and Editing Using Mesh-Based Gaussian Splatting [86.50200613220674]
A key challenge in 3D-based interactive editing is the absence of an efficient representation that balances diverse modifications with high-quality view synthesis under a given memory constraint.
We introduce SplatMesh, a novel fine-grained interactive 3D segmentation and editing algorithm that integrates 3D Gaussian Splatting with a precomputed mesh.
By segmenting and editing the simplified mesh, we can effectively edit the Gaussian splats as well, which is evaluated in extensive experiments on real and synthetic datasets.
arXiv Detail & Related papers (2023-12-26T02:50:42Z) - Segment Any 3D Gaussians [85.93694310363325]
This paper presents SAGA, a highly efficient 3D promptable segmentation method based on 3D Gaussian Splatting (3D-GS).
Given 2D visual prompts as input, SAGA can segment the corresponding 3D target represented by 3D Gaussians within 4 ms.
We show that SAGA achieves real-time multi-granularity segmentation with quality comparable to state-of-the-art methods.
arXiv Detail & Related papers (2023-12-01T17:15:24Z) - Gaussian Grouping: Segment and Edit Anything in 3D Scenes [65.49196142146292]
We propose Gaussian Grouping, which extends Gaussian Splatting to jointly reconstruct and segment anything in open-world 3D scenes.
Compared to the implicit NeRF representation, we show that the grouped 3D Gaussians can reconstruct, segment and edit anything in 3D with high visual quality, fine granularity and efficiency.
arXiv Detail & Related papers (2023-12-01T17:09:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.