CoSSegGaussians: Compact and Swift Scene Segmenting 3D Gaussians with
Dual Feature Fusion
- URL: http://arxiv.org/abs/2401.05925v3
- Date: Tue, 30 Jan 2024 12:46:04 GMT
- Title: CoSSegGaussians: Compact and Swift Scene Segmenting 3D Gaussians with
Dual Feature Fusion
- Authors: Bin Dou, Tianyu Zhang, Yongjia Ma, Zhaohui Wang, Zejian Yuan
- Abstract summary: We propose a method for compact 3D-consistent scene segmentation at fast rendering speed with only RGB images as input.
Our model outperforms baselines on both semantic and panoptic zero-shot segmentation tasks.
- Score: 17.778755539808547
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose Compact and Swift Segmenting 3D Gaussians (CoSSegGaussians), a
method for compact 3D-consistent scene segmentation at fast rendering speed
with only RGB images as input. Previous NeRF-based segmentation methods have
relied on time-consuming neural scene optimization. While recent 3D Gaussian
Splatting has notably improved speed, existing Gaussian-based segmentation
methods struggle to produce compact masks, especially in zero-shot
segmentation. This issue likely stems from their straightforward assignment
of learnable parameters to each Gaussian, which leaves them fragile to
cross-view inconsistencies in 2D machine-generated labels. Our method
addresses this problem by employing a Dual Feature Fusion Network as the
Gaussians' segmentation field. Specifically, we first optimize the 3D
Gaussians under RGB supervision. After locating the Gaussians, DINO features
extracted from the images are explicitly unprojected onto them and further
combined with spatial features from an efficient point-cloud processing
network. Feature aggregation then fuses the two in a global-to-local strategy
to produce compact segmentation features. Experimental results show that our
model outperforms baselines on both semantic and panoptic zero-shot
segmentation tasks, while consuming less than 10% of the inference time of
NeRF-based methods. Code and more results will be available at
https://David-Dou.github.io/CoSSegGaussians
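The core idea of the abstract, lifting 2D image features onto 3D Gaussians by projection and combining them with 3D spatial features, can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the nearest-pixel sampling, and the plain concatenation standing in for the global-to-local aggregation are all illustrative assumptions.

```python
import numpy as np

def unproject_features(centers, K, R, t, feat_map):
    """Gather a 2D feature for each 3D Gaussian center by projecting it
    into the image and sampling a (hypothetical) DINO feature map.
    centers: (N, 3) world points, K: (3, 3) intrinsics,
    [R | t]: world-to-camera pose, feat_map: (H, W, C)."""
    cam = centers @ R.T + t                  # world -> camera coordinates
    uv = cam @ K.T                           # pinhole projection
    uv = uv[:, :2] / uv[:, 2:3]              # divide by depth
    H, W, _ = feat_map.shape
    u = np.clip(np.round(uv[:, 0]).astype(int), 0, W - 1)
    v = np.clip(np.round(uv[:, 1]).astype(int), 0, H - 1)
    return feat_map[v, u]                    # (N, C) per-Gaussian features

def fuse(dino_feats, spatial_feats):
    """Toy stand-in for the paper's global-to-local feature aggregation:
    simply concatenate unprojected 2D features with 3D spatial features."""
    return np.concatenate([dino_feats, spatial_feats], axis=1)

# Tiny example: 4 Gaussians in front of an identity-pose camera,
# sampling an 8x8 feature map with C = 16 channels.
rng = np.random.default_rng(0)
centers = rng.uniform(-0.5, 0.5, (4, 3)) + np.array([0.0, 0.0, 2.0])
K = np.array([[8.0, 0.0, 4.0], [0.0, 8.0, 4.0], [0.0, 0.0, 1.0]])
feat_map = rng.normal(size=(8, 8, 16))
dino = unproject_features(centers, K, np.eye(3), np.zeros(3), feat_map)
fused = fuse(dino, rng.normal(size=(4, 32)))  # 32-dim spatial features
print(fused.shape)  # (4, 48)
```

In the actual method the fused features would then be decoded into per-Gaussian segmentation logits; here the sketch stops at the fused feature tensor.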
Related papers
- Click-Gaussian: Interactive Segmentation to Any 3D Gaussians [2.8461293457421957]
We propose Click-Gaussian, which learns distinguishable feature fields of two-level granularity.
Our method runs in 10 ms per click, 15 to 130 times faster than previous methods.
arXiv Detail & Related papers (2024-07-16T14:49:27Z)
- RTG-SLAM: Real-time 3D Reconstruction at Scale using Gaussian Splatting [51.51310922527121]
We present a real-time 3D reconstruction system with an RGBD camera for large-scale environments using Gaussian splatting.
We force each Gaussian to be either opaque or nearly transparent, with the opaque ones fitting the surface and dominant colors, and transparent ones fitting residual colors.
We show real-time reconstructions of a variety of large scenes and show superior performance in the realism of novel view synthesis and camera tracking accuracy.
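The opaque-or-transparent constraint described above can be encouraged with a simple regularizer. The following is a hypothetical sketch, not RTG-SLAM's actual loss: it uses a binary-entropy-style penalty that is smallest when each opacity sits near 0 or 1.

```python
import numpy as np

def opacity_binarization_loss(opacities, eps=1e-6):
    """Hypothetical regularizer pushing each Gaussian's opacity toward
    0 (nearly transparent) or 1 (opaque). The exact mechanism used by
    RTG-SLAM may differ; this is an illustrative binary-entropy penalty,
    minimal at the endpoints and maximal at 0.5."""
    a = np.clip(opacities, eps, 1.0 - eps)
    return float(np.mean(-a * np.log(a) - (1.0 - a) * np.log(1.0 - a)))

mixed = np.array([0.5, 0.5, 0.5])       # ambiguous, semi-transparent
binary = np.array([0.01, 0.99, 0.99])   # nearly transparent or opaque
print(opacity_binarization_loss(mixed) > opacity_binarization_loss(binary))  # True
```

Added to a reconstruction loss, such a term would drive the split into surface-fitting opaque Gaussians and residual-color transparent ones.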
arXiv Detail & Related papers (2024-04-30T16:54:59Z)
- Contrastive Gaussian Clustering: Weakly Supervised 3D Scene Segmentation [14.967600484476385]
We introduce Contrastive Gaussian Clustering, a novel approach capable of providing segmentation masks from any viewpoint.
Our method can be trained on inconsistent 2D segmentation masks, and still learn to generate segmentation masks consistent across all views.
The resulting model is extremely accurate, improving the IoU accuracy of the predicted masks by +8% over the state of the art.
arXiv Detail & Related papers (2024-04-19T10:47:53Z)
- HAC: Hash-grid Assisted Context for 3D Gaussian Splatting Compression [55.6351304553003]
3D Gaussian Splatting (3DGS) has emerged as a promising framework for novel view synthesis.
We propose a Hash-grid Assisted Context (HAC) framework for highly compact 3DGS representation.
Our work is the first to explore context-based compression for the 3DGS representation, achieving a remarkable size reduction of over 75× compared to vanilla 3DGS.
arXiv Detail & Related papers (2024-03-21T16:28:58Z)
- GES: Generalized Exponential Splatting for Efficient Radiance Field Rendering [112.16239342037714]
GES (Generalized Exponential Splatting) is a novel representation that employs Generalized Exponential Function (GEF) to model 3D scenes.
With the aid of a frequency-modulated loss, GES achieves competitive performance in novel-view synthesis benchmarks.
arXiv Detail & Related papers (2024-02-15T17:32:50Z)
- SAGD: Boundary-Enhanced Segment Anything in 3D Gaussian via Gaussian Decomposition [66.80822249039235]
3D Gaussian Splatting has emerged as an alternative 3D representation for novel view synthesis.
We propose SAGD, a conceptually simple yet effective boundary-enhanced segmentation pipeline for 3D-GS.
Our approach achieves high-quality 3D segmentation without rough boundary issues, which can be easily applied to other scene editing tasks.
arXiv Detail & Related papers (2024-01-31T14:19:03Z)
- 2D-Guided 3D Gaussian Segmentation [15.139488857163064]
This paper introduces a 3D Gaussian segmentation method implemented with 2D segmentation as supervision.
This approach uses input 2D segmentation maps to guide the learning of the added 3D Gaussian semantic information.
Experiments show that our method achieves comparable performance in mIoU and mAcc for multi-object segmentation.
arXiv Detail & Related papers (2023-12-26T13:28:21Z)
- Compact 3D Scene Representation via Self-Organizing Gaussian Grids [10.816451552362823]
3D Gaussian Splatting has recently emerged as a highly promising technique for modeling static 3D scenes.
We introduce a compact scene representation organizing the parameters of 3DGS into a 2D grid with local homogeneity.
Our method achieves a reduction factor of 17x to 42x in size for complex scenes with no increase in training time.
arXiv Detail & Related papers (2023-12-19T20:18:29Z)
- Gaussian Grouping: Segment and Edit Anything in 3D Scenes [65.49196142146292]
We propose Gaussian Grouping, which extends Gaussian Splatting to jointly reconstruct and segment anything in open-world 3D scenes.
Compared to the implicit NeRF representation, we show that the grouped 3D Gaussians can reconstruct, segment and edit anything in 3D with high visual quality, fine granularity and efficiency.
arXiv Detail & Related papers (2023-12-01T17:09:31Z)
- Compact 3D Gaussian Representation for Radiance Field [14.729871192785696]
We propose a learnable mask strategy to reduce the number of 3D Gaussian points without sacrificing performance.
We also propose a compact but effective representation of view-dependent color by employing a grid-based neural field.
Our work provides a comprehensive framework for 3D scene representation, achieving high performance, fast training, compactness, and real-time rendering.
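The learnable-mask pruning idea can be sketched in a few lines. Names and the thresholding rule here are illustrative assumptions, not the paper's API: each Gaussian carries a trainable mask logit, and Gaussians whose sigmoid falls below a threshold are removed after training.

```python
import numpy as np

def prune_with_mask(params, mask_logits, threshold=0.5):
    """Sketch of learnable-mask pruning (illustrative, not the paper's
    implementation): keep only Gaussians whose learned mask value,
    sigmoid(mask_logit), reaches the threshold."""
    keep = 1.0 / (1.0 + np.exp(-mask_logits)) >= threshold
    return params[keep], int(keep.sum())

rng = np.random.default_rng(1)
params = rng.normal(size=(6, 10))        # 6 Gaussians, 10 parameters each
logits = np.array([2.0, -3.0, 1.5, -0.5, 4.0, -2.0])
kept, n_kept = prune_with_mask(params, logits)
print(kept.shape, n_kept)  # (3, 10) 3
```

During training the mask would be optimized jointly with the rendering loss (typically with a straight-through or sparsity term) so that low-contribution Gaussians are pushed toward mask values near zero.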
arXiv Detail & Related papers (2023-11-22T20:31:16Z)
- GS-SLAM: Dense Visual SLAM with 3D Gaussian Splatting [51.96353586773191]
We introduce GS-SLAM, the first method to utilize a 3D Gaussian representation in a Simultaneous Localization and Mapping (SLAM) system.
Our method utilizes a real-time differentiable splatting rendering pipeline that offers significant speedup to map optimization and RGB-D rendering.
Our method achieves competitive performance compared with existing state-of-the-art real-time methods on the Replica and TUM-RGBD datasets.
arXiv Detail & Related papers (2023-11-20T12:08:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.