SAM-guided Graph Cut for 3D Instance Segmentation
- URL: http://arxiv.org/abs/2312.08372v3
- Date: Fri, 2 Aug 2024 09:56:14 GMT
- Title: SAM-guided Graph Cut for 3D Instance Segmentation
- Authors: Haoyu Guo, He Zhu, Sida Peng, Yuang Wang, Yujun Shen, Ruizhen Hu, Xiaowei Zhou
- Abstract summary: This paper addresses the challenge of 3D instance segmentation by simultaneously leveraging 3D geometric and multi-view image information.
We introduce a novel 3D-to-2D query framework to effectively exploit 2D segmentation models for 3D instance segmentation.
Our method achieves robust segmentation performance and can generalize across different types of scenes.
- Score: 60.75119991853605
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper addresses the challenge of 3D instance segmentation by simultaneously leveraging 3D geometric and multi-view image information. Many previous works have applied deep learning techniques to 3D point clouds for instance segmentation. However, these methods often fail to generalize to various types of scenes due to the scarcity and low diversity of labeled 3D point cloud data. Some recent works have attempted to lift 2D instance segmentations to 3D within a bottom-up framework. The inconsistency of 2D instance segmentations across views can substantially degrade the performance of 3D segmentation. In this work, we introduce a novel 3D-to-2D query framework to effectively exploit 2D segmentation models for 3D instance segmentation. Specifically, we pre-segment the scene into several superpoints in 3D, formulating the task as a graph cut problem. The superpoint graph is constructed based on 2D segmentation models, where node features are obtained from multi-view image features and edge weights are computed from multi-view segmentation results, enabling better generalization. To process the graph, we train a graph neural network using pseudo 3D labels from 2D segmentation models. Experimental results on the ScanNet, ScanNet++ and KITTI-360 datasets demonstrate that our method achieves robust segmentation performance and can generalize across different types of scenes. Our project page is available at https://zju3dv.github.io/sam_graph.
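The abstract outlines a pipeline: pre-segment the scene into superpoints, build a graph whose edge weights come from multi-view 2D segmentation results, then cut the graph into instances. The sketch below illustrates that idea in a heavily simplified form; the co-occurrence heuristic, function names, and thresholded union-find merge are illustrative assumptions, not the paper's actual formulation (the paper uses learned node features and a trained graph neural network rather than a fixed threshold).

```python
# Illustrative sketch (NOT the paper's method): score superpoint pairs by
# how often a 2D segmenter such as SAM places them in the same mask across
# views, then merge strongly connected pairs into 3D instances.
from collections import defaultdict

def edge_weights(view_masks):
    """view_masks: one dict per view, mapping superpoint id -> the 2D mask id
    that superpoint projects into in that view (hypothetical representation)."""
    same = defaultdict(int)  # views where the pair shares a 2D mask
    seen = defaultdict(int)  # views where both superpoints are visible
    for masks in view_masks:
        ids = sorted(masks)
        for i, a in enumerate(ids):
            for b in ids[i + 1:]:
                seen[(a, b)] += 1
                if masks[a] == masks[b]:
                    same[(a, b)] += 1
    return {pair: same[pair] / seen[pair] for pair in seen}

def merge_instances(weights, num_superpoints, thresh=0.5):
    """Union-find merge of superpoints joined by edges above thresh,
    standing in for the learned graph cut in the paper."""
    parent = list(range(num_superpoints))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    for (a, b), w in weights.items():
        if w > thresh:
            parent[find(a)] = find(b)
    return [find(i) for i in range(num_superpoints)]
```

For example, two superpoints that fall into the same SAM mask in every view get edge weight 1.0 and end up in the same instance, while a superpoint that never shares a mask with them stays separate.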
Related papers
- 3x2: 3D Object Part Segmentation by 2D Semantic Correspondences [33.99493183183571]
We propose to leverage a few annotated 3D shapes or richly annotated 2D datasets to perform 3D object part segmentation.
We present our novel approach, termed 3-By-2, which achieves SOTA performance on different benchmarks with various granularity levels.
arXiv Detail & Related papers (2024-07-12T19:08:00Z)
- PanopticRecon: Leverage Open-vocabulary Instance Segmentation for Zero-shot Panoptic Reconstruction [23.798691661418253]
We propose a novel zero-shot panoptic reconstruction method from RGB-D images of scenes.
We tackle both challenges by propagating partial labels with the aid of dense generalized features.
Our method outperforms state-of-the-art methods on the indoor dataset ScanNet V2 and the outdoor dataset KITTI-360.
arXiv Detail & Related papers (2024-07-01T15:06:04Z)
- View-Consistent Hierarchical 3D Segmentation Using Ultrametric Feature Fields [52.08335264414515]
We learn a novel feature field within a Neural Radiance Field (NeRF) representing a 3D scene.
Our method takes view-inconsistent multi-granularity 2D segmentations as input and produces a hierarchy of 3D-consistent segmentations as output.
We evaluate our method and several baselines on synthetic datasets with multi-view images and multi-granular segmentation, showcasing improved accuracy and viewpoint-consistency.
arXiv Detail & Related papers (2024-05-30T04:14:58Z)
- Segment3D: Learning Fine-Grained Class-Agnostic 3D Segmentation without Manual Labels [141.23836433191624]
Current 3D scene segmentation methods are heavily dependent on manually annotated 3D training datasets.
We propose Segment3D, a method for class-agnostic 3D scene segmentation that produces high-quality 3D segmentation masks.
arXiv Detail & Related papers (2023-12-28T18:57:11Z)
- SAI3D: Segment Any Instance in 3D Scenes [68.57002591841034]
We introduce SAI3D, a novel zero-shot 3D instance segmentation approach.
Our method partitions a 3D scene into geometric primitives, which are then progressively merged into 3D instance segmentations.
Empirical evaluations on ScanNet, Matterport3D and the more challenging ScanNet++ datasets demonstrate the superiority of our approach.
arXiv Detail & Related papers (2023-12-17T09:05:47Z)
- Segment Any 3D Gaussians [85.93694310363325]
This paper presents SAGA, a highly efficient 3D promptable segmentation method based on 3D Gaussian Splatting (3D-GS).
Given 2D visual prompts as input, SAGA can segment the corresponding 3D target represented by 3D Gaussians within 4 ms.
We show that SAGA achieves real-time multi-granularity segmentation with quality comparable to state-of-the-art methods.
arXiv Detail & Related papers (2023-12-01T17:15:24Z)
- UnScene3D: Unsupervised 3D Instance Segmentation for Indoor Scenes [35.38074724231105]
UnScene3D is a fully unsupervised 3D learning approach for class-agnostic 3D instance segmentation of indoor scans.
We operate on the basis of geometric oversegmentation, enabling efficient representation and learning on high-resolution 3D data.
Our approach improves over state-of-the-art unsupervised 3D instance segmentation methods by more than 300% in Average Precision.
arXiv Detail & Related papers (2023-03-25T19:15:16Z)
- MvDeCor: Multi-view Dense Correspondence Learning for Fine-grained 3D Segmentation [91.6658845016214]
We propose to utilize self-supervised techniques in the 2D domain for fine-grained 3D shape segmentation tasks.
We render a 3D shape from multiple views, and set up a dense correspondence learning task within the contrastive learning framework.
As a result, the learned 2D representations are view-invariant and geometrically consistent.
arXiv Detail & Related papers (2022-08-18T00:48:15Z)
- Interactive Object Segmentation in 3D Point Clouds [27.88495480980352]
We present an interactive 3D object segmentation method in which the user interacts directly with the 3D point cloud.
Our model does not require training data from the target domain.
It performs well on several other datasets with different data characteristics as well as different object classes.
arXiv Detail & Related papers (2022-04-14T18:31:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.