Scene-Generalizable Interactive Segmentation of Radiance Fields
- URL: http://arxiv.org/abs/2308.05104v1
- Date: Wed, 9 Aug 2023 17:55:50 GMT
- Title: Scene-Generalizable Interactive Segmentation of Radiance Fields
- Authors: Songlin Tang, Wenjie Pei, Xin Tao, Tanghui Jia, Guangming Lu, Yu-Wing
Tai
- Abstract summary: We make the first attempt at Scene-Generalizable Interactive Segmentation in Radiance Fields (SGISRF)
We propose a novel SGISRF method, which can perform 3D object segmentation for novel (unseen) scenes represented by radiance fields, guided by only a few interactive user clicks in a given set of multi-view 2D images.
Experiments on two challenging real-world benchmarks covering diverse scenes demonstrate 1) the effectiveness and scene-generalizability of the proposed method, and 2) favorable performance compared to classical methods requiring scene-specific optimization.
- Score: 64.37093918762
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing methods for interactive segmentation in radiance fields entail
scene-specific optimization and thus cannot generalize across different scenes,
which greatly limits their applicability. In this work we make the first
attempt at Scene-Generalizable Interactive Segmentation in Radiance Fields
(SGISRF) and propose a novel SGISRF method, which can perform 3D object
segmentation for novel (unseen) scenes represented by radiance fields, guided
by only a few interactive user clicks in a given set of multi-view 2D images.
In particular, the proposed SGISRF focuses on addressing three crucial
challenges with three specially designed techniques. First, we devise the
Cross-Dimension Guidance Propagation to encode the scarce 2D user clicks into
informative 3D guidance representations. Second, the Uncertainty-Eliminated 3D
Segmentation module is designed to achieve efficient yet effective 3D
segmentation. Third, a Concealment-Revealed Supervised Learning scheme is
proposed to reveal and correct the concealed 3D segmentation errors resulting
from supervision in 2D space with only 2D mask annotations. Extensive
experiments on two challenging real-world benchmarks covering diverse scenes
demonstrate 1) the effectiveness and scene-generalizability of the proposed method,
and 2) favorable performance compared to classical methods requiring scene-specific
optimization.
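
The abstract does not detail how the Cross-Dimension Guidance Propagation encodes sparse 2D clicks into 3D guidance. As a purely illustrative sketch (not the authors' implementation), one common way to lift 2D clicks into 3D is to back-project each click through its camera into a world-space ray and sample candidate guidance points along it; all names and parameters below (K, c2w, near, far, n_samples) are assumptions for illustration.

```python
# Illustrative sketch only: lift 2D user clicks into candidate 3D guidance points
# by back-projecting each click along its camera ray. This is an assumed mechanism,
# not the SGISRF paper's actual Cross-Dimension Guidance Propagation module.
import numpy as np

def clicks_to_3d_guidance(clicks_px, K, c2w, near=0.1, far=6.0, n_samples=32):
    """Lift 2D click pixels into 3D points sampled along their camera rays.

    clicks_px: (N, 2) array of (u, v) pixel coordinates of user clicks.
    K:         (3, 3) camera intrinsics.
    c2w:       (4, 4) camera-to-world extrinsics.
    Returns:   (N, n_samples, 3) candidate 3D guidance points in world space.
    """
    clicks_px = np.asarray(clicks_px, dtype=np.float64)
    ones = np.ones((clicks_px.shape[0], 1))
    pix_h = np.concatenate([clicks_px, ones], axis=1)         # homogeneous pixel coords (N, 3)
    dirs_cam = pix_h @ np.linalg.inv(K).T                      # ray directions in camera frame
    dirs_world = dirs_cam @ c2w[:3, :3].T                      # rotate directions into world frame
    dirs_world /= np.linalg.norm(dirs_world, axis=1, keepdims=True)
    origin = c2w[:3, 3]                                        # camera center in world frame
    t = np.linspace(near, far, n_samples)                      # depths sampled along each ray
    return origin + dirs_world[:, None, :] * t[None, :, None]  # (N, n_samples, 3)
```

In a generalizable setting, points produced this way (or features sampled at them) could then be aggregated across the given multi-view clicks to form the 3D guidance representation; the actual encoding used by SGISRF may differ.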
Related papers
- Mastering Regional 3DGS: Locating, Initializing, and Editing with Diverse 2D Priors [67.22744959435708]
3D semantic parsing often underperforms compared to its 2D counterpart, making targeted manipulations within 3D spaces more difficult and limiting the fidelity of edits. We address this problem by leveraging 2D diffusion editing to accurately identify modification regions in each view, followed by inverse rendering for 3D localization. Experiments demonstrate that our method achieves state-of-the-art performance while delivering up to a $4\times$ speedup.
arXiv Detail & Related papers (2025-07-07T19:15:43Z) - Segment Any 3D-Part in a Scene from a Sentence [50.46950922754459]
This paper aims to achieve the segmentation of any 3D part in a scene based on natural language descriptions. We introduce the 3D-PU dataset, the first large-scale 3D dataset with dense part annotations. On the methodological side, we propose OpenPart3D, a 3D-input-only framework to tackle the challenges of part-level segmentation.
arXiv Detail & Related papers (2025-06-24T05:51:22Z) - iSegMan: Interactive Segment-and-Manipulate 3D Gaussians [5.746109453405226]
iSegMan is an interactive segmentation and manipulation framework that only requires simple 2D user interactions in any view. Epipolar-guided Interaction Propagation (EIP) exploits the epipolar constraint for efficient and robust interaction matching. Visibility-based Gaussian Voting (VGV) obtains 2D segmentations from SAM and models the region extraction as a voting game.
arXiv Detail & Related papers (2025-05-17T09:41:10Z) - econSG: Efficient and Multi-view Consistent Open-Vocabulary 3D Semantic Gaussians [56.85804719947]
We propose econSG for open-vocabulary semantic segmentation with 3DGS.
Our econSG shows state-of-the-art performance on four benchmark datasets compared to the existing methods.
arXiv Detail & Related papers (2025-04-08T13:12:31Z) - Enforcing View-Consistency in Class-Agnostic 3D Segmentation Fields [46.711276257688326]
Radiance Fields have become a powerful tool for modeling 3D scenes from multiple images.
Some methods work well using 2D semantic masks, but they generalize poorly to class-agnostic segmentations.
More recent methods circumvent this issue by using contrastive learning to optimize a high-dimensional 3D feature field instead.
arXiv Detail & Related papers (2024-08-19T12:07:24Z) - Zero-Shot Dual-Path Integration Framework for Open-Vocabulary 3D Instance Segmentation [19.2297264550686]
Open-vocabulary 3D instance segmentation transcends traditional closed-vocabulary methods.
We introduce Zero-Shot Dual-Path Integration Framework that equally values the contributions of both 3D and 2D modalities.
Our framework, utilizing pre-trained models in a zero-shot manner, is model-agnostic and demonstrates superior performance on both seen and unseen data.
arXiv Detail & Related papers (2024-08-16T07:52:00Z) - iSeg: Interactive 3D Segmentation via Interactive Attention [14.036050263210182]
We present iSeg, a new interactive technique for segmenting 3D shapes.
We propose a novel interactive attention module capable of processing different numbers and types of clicks.
We apply iSeg to a myriad of shapes from different domains, demonstrating its versatility and faithfulness to the user's specifications.
arXiv Detail & Related papers (2024-04-04T05:54:19Z) - SERF: Fine-Grained Interactive 3D Segmentation and Editing with Radiance Fields [92.14328581392633]
We introduce a novel fine-grained interactive 3D segmentation and editing algorithm with radiance fields, which we refer to as SERF.
Our method entails creating a neural mesh representation by integrating multi-view algorithms with pre-trained 2D models.
Building upon this representation, we introduce a novel surface rendering technique that preserves local information and is robust to deformation.
arXiv Detail & Related papers (2023-12-26T02:50:42Z) - ALSTER: A Local Spatio-Temporal Expert for Online 3D Semantic
Reconstruction [62.599588577671796]
We propose an online 3D semantic segmentation method that incrementally reconstructs a 3D semantic map from a stream of RGB-D frames.
Unlike offline methods, ours is directly applicable to scenarios with real-time constraints, such as robotics or mixed reality.
arXiv Detail & Related papers (2023-11-29T20:30:18Z) - Geometry Aware Field-to-field Transformations for 3D Semantic
Segmentation [48.307734886370014]
We present a novel approach to perform 3D semantic segmentation solely from 2D supervision by leveraging Neural Radiance Fields (NeRFs).
By extracting features along a surface point cloud, we achieve a compact representation of the scene which is sample-efficient and conducive to 3D reasoning.
arXiv Detail & Related papers (2023-10-08T11:48:19Z) - ONeRF: Unsupervised 3D Object Segmentation from Multiple Views [59.445957699136564]
ONeRF is a method that automatically segments and reconstructs object instances in 3D from multi-view RGB images without any additional manual annotations.
The segmented 3D objects are represented using separate Neural Radiance Fields (NeRFs) which allow for various 3D scene editing and novel view rendering.
arXiv Detail & Related papers (2022-11-22T06:19:37Z) - Unsupervised Multi-View Object Segmentation Using Radiance Field
Propagation [55.9577535403381]
We present a novel approach to segmenting objects in 3D during reconstruction given only unlabeled multi-view images of a scene.
The core of our method is a novel propagation strategy for individual objects' radiance fields with a bidirectional photometric loss.
To the best of our knowledge, RFP is the first unsupervised approach for tackling 3D scene object segmentation for neural radiance fields (NeRF).
arXiv Detail & Related papers (2022-10-02T11:14:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.