SAMPro3D: Locating SAM Prompts in 3D for Zero-Shot Scene Segmentation
- URL: http://arxiv.org/abs/2311.17707v1
- Date: Wed, 29 Nov 2023 15:11:03 GMT
- Title: SAMPro3D: Locating SAM Prompts in 3D for Zero-Shot Scene Segmentation
- Authors: Mutian Xu, Xingyilang Yin, Lingteng Qiu, Yang Liu, Xin Tong, Xiaoguang Han
- Abstract summary: We introduce SAMPro3D for zero-shot 3D indoor scene segmentation.
Our approach segments 3D scenes by applying the pretrained Segment Anything Model (SAM) to 2D frames.
Our method consistently achieves higher quality and more diverse segmentation than previous zero-shot or fully supervised approaches.
- Score: 26.207530327673748
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce SAMPro3D for zero-shot 3D indoor scene segmentation. Given the
3D point cloud and multiple posed 2D frames of 3D scenes, our approach segments
3D scenes by applying the pretrained Segment Anything Model (SAM) to 2D frames.
Our key idea involves locating 3D points in scenes as natural 3D prompts to
align their projected pixel prompts across frames, ensuring frame-consistency
in both pixel prompts and their SAM-predicted masks. Moreover, we suggest
filtering out low-quality 3D prompts based on feedback from all 2D frames to
enhance segmentation quality. We also propose consolidating different 3D
prompts that segment the same object, yielding more comprehensive
segmentation. Notably, our method does not require any additional training on
domain-specific data, enabling us to preserve the zero-shot power of SAM.
Extensive qualitative and quantitative results show that our method
consistently achieves higher quality and more diverse segmentation than
previous zero-shot or fully supervised approaches, and in many cases even
surpasses human-level annotations. The project page can be accessed at
https://mutianxu.github.io/sampro3d/.
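To make the pipeline concrete, below is a minimal sketch of the prompt-projection step the abstract describes, written against our own assumptions (the function name, the use of per-frame depth maps for the visibility test, and the `occ_thresh` tolerance are invented for illustration and are not the authors' code):

```python
import numpy as np

def project_prompt(p_world, w2c, K, depth, occ_thresh=0.05):
    """Project one 3D prompt point into a posed RGB-D frame.

    p_world:    (3,) prompt location in world coordinates.
    w2c:        (4, 4) world-to-camera extrinsic matrix of the frame.
    K:          (3, 3) camera intrinsics.
    depth:      (H, W) depth map of the frame, used as a visibility check.
    occ_thresh: depth tolerance for the occlusion test (assumed value).

    Returns the (u, v) pixel prompt for SAM, or None when the point is
    behind the camera, outside the image, or occluded in this frame.
    """
    p_cam = w2c[:3, :3] @ p_world + w2c[:3, 3]
    if p_cam[2] <= 0:                       # behind the camera
        return None
    u, v = (K @ p_cam)[:2] / p_cam[2]       # perspective projection
    H, W = depth.shape
    if not (0 <= u < W and 0 <= v < H):     # outside the frame
        return None
    if abs(depth[int(v), int(u)] - p_cam[2]) > occ_thresh:
        return None                         # occluded by closer geometry
    return np.array([u, v])
```

Each surviving (u, v) is then fed to SAM as a point prompt in that frame; because every pixel prompt of a given 3D point marks the same physical location, the per-frame masks can be compared against each other, low-quality prompts filtered out, and prompts whose masks cover the same object consolidated.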
Related papers
- SAM2Point: Segment Any 3D as Videos in Zero-shot and Promptable Manners [87.76470518069338]
We introduce SAM2Point, a preliminary exploration adapting Segment Anything Model 2 (SAM 2) for promptable 3D segmentation.
Our framework supports various prompt types, including 3D points, boxes, and masks, and can generalize across diverse scenarios, such as 3D objects, indoor scenes, sparse outdoor environments, and raw LiDAR.
To the best of our knowledge, we present the most faithful implementation of SAM in 3D, which may serve as a starting point for future research in promptable 3D segmentation.
arXiv Detail & Related papers (2024-08-29T17:59:45Z)
- Point-SAM: Promptable 3D Segmentation Model for Point Clouds [25.98791840584803]
We propose a 3D promptable segmentation model (Point-SAM) focusing on point clouds.
Our approach utilizes a transformer-based method, extending SAM to the 3D domain.
Our model outperforms state-of-the-art models on several indoor and outdoor benchmarks.
arXiv Detail & Related papers (2024-06-25T17:28:03Z)
- Segment3D: Learning Fine-Grained Class-Agnostic 3D Segmentation without Manual Labels [141.23836433191624]
Current 3D scene segmentation methods are heavily dependent on manually annotated 3D training datasets.
We propose Segment3D, a method for class-agnostic 3D scene segmentation that produces high-quality 3D segmentation masks.
arXiv Detail & Related papers (2023-12-28T18:57:11Z)
- SAI3D: Segment Any Instance in 3D Scenes [68.57002591841034]
We introduce SAI3D, a novel zero-shot 3D instance segmentation approach.
Our method partitions a 3D scene into geometric primitives, which are then progressively merged into 3D instance segmentations.
Empirical evaluations on ScanNet, Matterport3D and the more challenging ScanNet++ datasets demonstrate the superiority of our approach.
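As a rough illustration of such progressive merging (not SAI3D's actual region-growing algorithm; the `affinity` matrix and threshold are our assumptions), a union-find pass over pairwise primitive affinities could look like this:

```python
import numpy as np

def merge_primitives(affinity, thresh=0.9):
    """Merge scene primitives whose pairwise affinity exceeds `thresh`.

    affinity: (P, P) symmetric matrix in [0, 1], e.g. how consistently
              2D masks assign two primitives to the same instance.
    Returns a (P,) array of instance IDs, one per primitive.
    """
    parent = np.arange(affinity.shape[0])

    def find(i):                      # union-find with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    P = len(parent)
    for i in range(P):
        for j in range(i + 1, P):
            if affinity[i, j] >= thresh:
                parent[find(i)] = find(j)

    roots = np.array([find(i) for i in range(P)])
    _, ids = np.unique(roots, return_inverse=True)
    return ids
```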
arXiv Detail & Related papers (2023-12-17T09:05:47Z)
- SAM-guided Graph Cut for 3D Instance Segmentation [60.75119991853605]
This paper addresses the challenge of 3D instance segmentation by simultaneously leveraging 3D geometric and multi-view image information.
We introduce a novel 3D-to-2D query framework to effectively exploit 2D segmentation models for 3D instance segmentation.
Our method achieves robust segmentation performance and can generalize across different types of scenes.
arXiv Detail & Related papers (2023-12-13T18:59:58Z)
- NTO3D: Neural Target Object 3D Reconstruction with Segment Anything [44.45486364580724]
We propose NTO3D, a novel method for high-quality neural target object 3D reconstruction.
We first propose a novel strategy to lift the multi-view 2D segmentation masks of SAM into a unified 3D occupancy field.
The 3D occupancy field is then projected into 2D space and generates the new prompts for SAM.
NTO3D lifts the 2D masks and features of SAM into the 3D neural field for high-quality neural target object 3D reconstruction.
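A minimal sketch of that lifting step, under our own simplifications: a dense voxel grid with per-voxel voting stands in for NTO3D's learned occupancy field, and all argument names are hypothetical:

```python
import numpy as np

def lift_masks_to_occupancy(masks, w2cs, Ks, origin, voxel_size,
                            grid_shape, vote_frac=0.5):
    """Vote multi-view 2D target masks into a binary occupancy grid.

    masks:    list of (H, W) boolean masks of the target, one per view.
    w2cs/Ks:  per-view (4, 4) extrinsics and (3, 3) intrinsics.
    A voxel counts as occupied when it projects inside the mask in at
    least `vote_frac` of the views where it lands on the image.
    """
    idx = np.stack(np.meshgrid(*[np.arange(s) for s in grid_shape],
                               indexing="ij"), axis=-1).reshape(-1, 3)
    centers = origin + (idx + 0.5) * voxel_size       # (N, 3) world coords
    hits = np.zeros(len(centers))
    seen = np.zeros(len(centers))
    for mask, w2c, K in zip(masks, w2cs, Ks):
        H, W = mask.shape
        cam = w2c[:3, :3] @ centers.T + w2c[:3, 3:4]  # (3, N) camera coords
        uv = (K @ cam)[:2] / np.clip(cam[2], 1e-6, None)
        u = uv[0].round().astype(int)
        v = uv[1].round().astype(int)
        inside = (cam[2] > 0) & (u >= 0) & (u < W) & (v >= 0) & (v < H)
        seen[inside] += 1
        hits[inside] += mask[v[inside], u[inside]]
    occ = (seen > 0) & (hits >= vote_frac * seen)
    return occ.reshape(grid_shape)
```

Projecting the occupied region back into each view then produces the refreshed SAM prompts the summary mentions, closing the loop.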
arXiv Detail & Related papers (2023-09-22T11:02:57Z)
- SAM3D: Segment Anything in 3D Scenes [33.57040455422537]
We propose a novel framework that predicts masks in 3D point clouds by leveraging the Segment Anything Model (SAM) on RGB images, without further training or fine-tuning.
For a point cloud of a 3D scene with posed RGB images, we first predict segmentation masks of RGB images with SAM, and then project the 2D masks into the 3D points.
We evaluate our approach on the ScanNet dataset; qualitative results demonstrate that SAM3D achieves reasonable and fine-grained 3D segmentation without any training or fine-tuning.
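For a single frame, the mask-projection step could be sketched as follows (a simplification with invented names; SAM3D additionally merges the per-frame labels across frames, which is omitted here):

```python
import numpy as np

def paint_points_with_masks(points, mask_ids, w2c, K):
    """Carry per-frame SAM mask IDs over to 3D points (single frame;
    no occlusion test, for brevity).

    points:   (N, 3) scene points in world coordinates.
    mask_ids: (H, W) integer map of SAM mask IDs for the frame (-1 = none).
    Returns an (N,) array of mask IDs; -1 where a point is not covered.
    """
    H, W = mask_ids.shape
    cam = points @ w2c[:3, :3].T + w2c[:3, 3]         # (N, 3) camera coords
    z = cam[:, 2]
    uv = (cam @ K.T)[:, :2] / np.clip(z[:, None], 1e-6, None)
    u = uv[:, 0].round().astype(int)
    v = uv[:, 1].round().astype(int)
    labels = np.full(len(points), -1, dtype=np.int64)
    ok = (z > 0) & (u >= 0) & (u < W) & (v >= 0) & (v < H)
    labels[ok] = mask_ids[v[ok], u[ok]]
    return labels
```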
arXiv Detail & Related papers (2023-06-06T17:59:51Z)
- Segment Anything in 3D with Radiance Fields [83.14130158502493]
This paper generalizes the Segment Anything Model (SAM) to segment 3D objects.
We refer to the proposed solution as SA3D, short for Segment Anything in 3D.
We show in experiments that SA3D adapts to various scenes and achieves 3D segmentation within seconds.
arXiv Detail & Related papers (2023-04-24T17:57:15Z)
- PartSLIP: Low-Shot Part Segmentation for 3D Point Clouds via Pretrained Image-Language Models [56.324516906160234]
Generalizable 3D part segmentation is important but challenging in vision and robotics.
This paper explores an alternative way for low-shot part segmentation of 3D point clouds by leveraging a pretrained image-language model, GLIP.
We transfer the rich knowledge from 2D to 3D through GLIP-based part detection on point cloud rendering and a novel 2D-to-3D label lifting algorithm.
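One plausible form of such label lifting is simple box-containment voting across rendered views, sketched below; the data layout and names are our assumptions, not PartSLIP's actual algorithm, which is more involved:

```python
import numpy as np

def lift_box_labels(points_uv, visible, boxes, num_parts):
    """Vote 2D part-detection boxes onto 3D points across rendered views.

    points_uv: (V, N, 2) pixel coordinates of N points in V views.
    visible:   (V, N) boolean visibility of each point in each view.
    boxes:     per-view list of (part_label, (u0, v0, u1, v1)) detections.
    Returns (N,) part labels; -1 where no box ever covered the point.
    """
    V, N, _ = points_uv.shape
    votes = np.zeros((N, num_parts), dtype=np.int32)
    for vi in range(V):
        u = points_uv[vi, :, 0]
        v = points_uv[vi, :, 1]
        for label, (u0, v0, u1, v1) in boxes[vi]:
            inside = (visible[vi] & (u >= u0) & (u <= u1)
                      & (v >= v0) & (v <= v1))
            votes[inside, label] += 1
    labels = votes.argmax(axis=1)
    labels[votes.sum(axis=1) == 0] = -1   # points no detection covered
    return labels
```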
arXiv Detail & Related papers (2022-12-03T06:59:01Z)