SAI3D: Segment Any Instance in 3D Scenes
- URL: http://arxiv.org/abs/2312.11557v2
- Date: Sun, 24 Mar 2024 07:01:37 GMT
- Title: SAI3D: Segment Any Instance in 3D Scenes
- Authors: Yingda Yin, Yuzheng Liu, Yang Xiao, Daniel Cohen-Or, Jingwei Huang, Baoquan Chen,
- Abstract summary: We introduce SAI3D, a novel zero-shot 3D instance segmentation approach.
Our method partitions a 3D scene into geometric primitives, which are then progressively merged into 3D instance segmentations.
Empirical evaluations on ScanNet, Matterport3D and the more challenging ScanNet++ datasets demonstrate the superiority of our approach.
- Score: 68.57002591841034
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Advancements in 3D instance segmentation have traditionally been tethered to the availability of annotated datasets, limiting their application to a narrow spectrum of object categories. Recent efforts have sought to harness vision-language models like CLIP for open-set semantic reasoning, yet these methods struggle to distinguish between objects of the same categories and rely on specific prompts that are not universally applicable. In this paper, we introduce SAI3D, a novel zero-shot 3D instance segmentation approach that synergistically leverages geometric priors and semantic cues derived from Segment Anything Model (SAM). Our method partitions a 3D scene into geometric primitives, which are then progressively merged into 3D instance segmentations that are consistent with the multi-view SAM masks. Moreover, we design a hierarchical region-growing algorithm with a dynamic thresholding mechanism, which largely improves the robustness of finegrained 3D scene parsing.Empirical evaluations on ScanNet, Matterport3D and the more challenging ScanNet++ datasets demonstrate the superiority of our approach. Notably, SAI3D outperforms existing open-vocabulary baselines and even surpasses fully-supervised methods in class-agnostic segmentation on ScanNet++. Our project page is at https://yd-yin.github.io/SAI3D.
Related papers
- SA3DIP: Segment Any 3D Instance with Potential 3D Priors [41.907914881608995]
We propose SA3DIP, a novel method for Segmenting Any 3D Instances via exploiting potential 3D Priors.
Specifically, on one hand, we generate complementary 3D primitives based on both geometric and textural priors.
On the other hand, we introduce supplemental constraints from the 3D space by using a 3D detector to guide a further merging process.
arXiv Detail & Related papers (2024-11-06T10:39:00Z) - Search3D: Hierarchical Open-Vocabulary 3D Segmentation [78.47704793095669]
Open-vocabulary 3D segmentation enables the exploration of 3D spaces using free-form text descriptions.
We introduce Search3D, an approach that builds a hierarchical open-vocabulary 3D scene representation.
Our method aims to expand the capabilities of open vocabulary instance-level 3D segmentation by shifting towards a more flexible open-vocabulary 3D search setting.
arXiv Detail & Related papers (2024-09-27T03:44:07Z) - Instance Consistency Regularization for Semi-Supervised 3D Instance Segmentation [50.51125319374404]
We propose a novel self-training network InsTeacher3D to explore and exploit pure instance knowledge from unlabeled data.
Experimental results on multiple large-scale datasets show that the InsTeacher3D significantly outperforms prior state-of-the-art semi-supervised approaches.
arXiv Detail & Related papers (2024-06-24T16:35:58Z) - Segment Any 3D Object with Language [58.471327490684295]
We introduce Segment any 3D Object with LanguagE (SOLE), a semantic geometric and-aware visual-language learning framework with strong generalizability.
Specifically, we propose a multimodal fusion network to incorporate multimodal semantics in both backbone and decoder.
Our SOLE outperforms previous methods by a large margin on ScanNetv2, ScanNet200, and Replica benchmarks.
arXiv Detail & Related papers (2024-04-02T17:59:10Z) - Open3DIS: Open-Vocabulary 3D Instance Segmentation with 2D Mask Guidance [49.14140194332482]
We introduce Open3DIS, a novel solution designed to tackle the problem of Open-Vocabulary Instance within 3D scenes.
Objects within 3D environments exhibit diverse shapes, scales, and colors, making precise instance-level identification a challenging task.
arXiv Detail & Related papers (2023-12-17T10:07:03Z) - SAM-guided Graph Cut for 3D Instance Segmentation [60.75119991853605]
This paper addresses the challenge of 3D instance segmentation by simultaneously leveraging 3D geometric and multi-view image information.
We introduce a novel 3D-to-2D query framework to effectively exploit 2D segmentation models for 3D instance segmentation.
Our method achieves robust segmentation performance and can generalize across different types of scenes.
arXiv Detail & Related papers (2023-12-13T18:59:58Z) - UnScene3D: Unsupervised 3D Instance Segmentation for Indoor Scenes [35.38074724231105]
UnScene3D is a fully unsupervised 3D learning approach for class-agnostic 3D instance segmentation of indoor scans.
We operate on a basis of geometric oversegmentation, enabling efficient representation and learning on high-resolution 3D data.
Our approach improves over state-of-the-art unsupervised 3D instance segmentation methods by more than 300% Average Precision score.
arXiv Detail & Related papers (2023-03-25T19:15:16Z) - Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with
Deep Metric Learning [5.699350798684963]
We propose a simple, yet efficient algorithm for 3D instance segmentation using deep metric learning.
For high-level intelligent tasks from a large scale scene, 3D instance segmentation recognizes individual instances of objects.
We demonstrate the state-of-the-art performance of our algorithm in the ScanNet 3D instance segmentation benchmark on AP score.
arXiv Detail & Related papers (2020-07-07T02:17:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.