SAM3D: Segment Anything in 3D Scenes
- URL: http://arxiv.org/abs/2306.03908v1
- Date: Tue, 6 Jun 2023 17:59:51 GMT
- Title: SAM3D: Segment Anything in 3D Scenes
- Authors: Yunhan Yang, Xiaoyang Wu, Tong He, Hengshuang Zhao, Xihui Liu
- Abstract summary: We propose a novel framework that predicts masks in 3D point clouds by leveraging the Segment Anything Model (SAM) on RGB images, without further training or finetuning.
For a point cloud of a 3D scene with posed RGB images, we first predict segmentation masks on the RGB images with SAM, and then project the 2D masks onto the 3D points.
We evaluate our approach on the ScanNet dataset, and qualitative results demonstrate that SAM3D achieves reasonable and fine-grained 3D segmentation results without any training or finetuning.
- Score: 33.57040455422537
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we propose SAM3D, a novel framework that predicts
masks in 3D point clouds by leveraging the Segment Anything Model (SAM) on RGB
images, without further training or finetuning. For a point cloud of a 3D scene
with posed RGB images, we first predict segmentation masks on the RGB images
with SAM, and then project the 2D masks onto the 3D points. We then merge the
3D masks iteratively with a bottom-up merging approach: at each step, the
point-cloud masks of two adjacent frames are merged with a bidirectional
merging approach. In this way, the 3D masks predicted from different frames are
gradually merged into the 3D masks of the whole 3D scene. Finally, we can
optionally ensemble the result of SAM3D with over-segmentation results based on
the geometric information of the 3D scene. We evaluate our approach on the
ScanNet dataset, and qualitative results demonstrate that SAM3D achieves
reasonable and fine-grained 3D segmentation results without any training or
finetuning of SAM.
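To make the pipeline concrete, below is a minimal Python sketch of the two core steps described in the abstract: back-projecting SAM's 2D masks into 3D point sets using posed RGB-D frames, and bidirectionally merging the masks of two adjacent frames. The voxel-overlap criterion, the 0.3 threshold, and the function names are illustrative assumptions, not details taken from the paper.

```python
# A minimal sketch of the SAM3D pipeline described above, assuming per-frame
# depth maps, camera intrinsics K, and 4x4 camera-to-world poses are available
# (as in ScanNet). The overlap measure and threshold are illustrative only.
import numpy as np

def masks_to_3d(sam_masks, depth, K, cam_to_world):
    """Lift binary 2D SAM masks into 3D point sets via back-projection."""
    h, w = depth.shape
    v, u = np.indices((h, w))                       # pixel grid (rows, cols)
    z = depth                                       # metric depth per pixel
    x = (u - K[0, 2]) * z / K[0, 0]
    y = (v - K[1, 2]) * z / K[1, 1]
    pts_cam = np.stack([x, y, z, np.ones_like(z)], axis=-1)   # (h, w, 4)
    pts_world = pts_cam @ cam_to_world.T                       # (h, w, 4)
    masks_3d = []
    for m in sam_masks:                             # m: (h, w) boolean mask
        valid = m & (z > 0)                         # drop pixels without depth
        masks_3d.append(pts_world[valid][:, :3])    # (N, 3) world-space points
    return masks_3d

def overlap(a, b, voxel=0.05):
    """Fraction of mask a's occupied voxels that are also occupied by mask b."""
    va = {tuple(p) for p in np.floor(a / voxel).astype(int)}
    vb = {tuple(p) for p in np.floor(b / voxel).astype(int)}
    return len(va & vb) / max(len(va), 1)

def bidirectional_merge(masks_a, masks_b, thresh=0.3):
    """Merge the 3D masks of two adjacent frames when they overlap both ways."""
    merged, used_b = [], set()
    for ma in masks_a:
        scores = [overlap(ma, mb) for mb in masks_b]
        j = int(np.argmax(scores)) if scores else -1
        # merge only if the overlap is mutual (a->b and b->a) and large enough
        if j >= 0 and scores[j] > thresh and overlap(masks_b[j], ma) > thresh:
            merged.append(np.concatenate([ma, masks_b[j]], axis=0))
            used_b.add(j)
        else:
            merged.append(ma)
    merged += [mb for j, mb in enumerate(masks_b) if j not in used_b]
    return merged
```

Applying bidirectional_merge to pairs of adjacent frames and then repeating on the merged results gradually yields scene-level masks; the optional ensemble with geometric over-segmentation is omitted from this sketch.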
Related papers
- SAM2Point: Segment Any 3D as Videos in Zero-shot and Promptable Manners [87.76470518069338]
We introduce SAM2Point, a preliminary exploration adapting Segment Anything Model 2 (SAM 2) for promptable 3D segmentation.
Our framework supports various prompt types, including 3D points, boxes, and masks, and can generalize across diverse scenarios, such as 3D objects, indoor scenes, sparse outdoor environments, and raw LiDAR.
To the best of our knowledge, we present the most faithful implementation of SAM in 3D, which may serve as a starting point for future research in promptable 3D segmentation.
arXiv Detail & Related papers (2024-08-29T17:59:45Z) - EmbodiedSAM: Online Segment Any 3D Thing in Real Time [61.2321497708998]
Embodied tasks require the agent to fully understand 3D scenes while simultaneously exploring them.
An online, real-time, fine-grained and highly generalized 3D perception model is urgently needed.
arXiv Detail & Related papers (2024-08-21T17:57:06Z) - SAI3D: Segment Any Instance in 3D Scenes [68.57002591841034]
We introduce SAI3D, a novel zero-shot 3D instance segmentation approach.
Our method partitions a 3D scene into geometric primitives, which are then progressively merged into 3D instance segmentations.
Empirical evaluations on ScanNet, Matterport3D and the more challenging ScanNet++ datasets demonstrate the superiority of our approach.
arXiv Detail & Related papers (2023-12-17T09:05:47Z) - SAMPro3D: Locating SAM Prompts in 3D for Zero-Shot Scene Segmentation [26.207530327673748]
We introduce SAMPro3D for zero-shot 3D indoor scene segmentation.
Our approach segments 3D scenes by applying the pretrained Segment Anything Model (SAM) to 2D frames.
Our method consistently achieves higher quality and more diverse segmentation than previous zero-shot or fully supervised approaches.
arXiv Detail & Related papers (2023-11-29T15:11:03Z) - NTO3D: Neural Target Object 3D Reconstruction with Segment Anything [44.45486364580724]
We propose NTO3D, a novel high-quality neural target object 3D reconstruction method.
We first propose a novel strategy to lift the multi-view 2D segmentation masks of SAM into a unified 3D occupancy field.
The 3D occupancy field is then projected back into 2D space to generate new prompts for SAM.
NTO3D lifts the 2D masks and features of SAM into the 3D neural field for high-quality neural target object 3D reconstruction.
arXiv Detail & Related papers (2023-09-22T11:02:57Z) - TomoSAM: a 3D Slicer extension using SAM for tomography segmentation [62.997667081978825]
TomoSAM has been developed to integrate the cutting-edge Segment Anything Model (SAM) into 3D Slicer.
SAM is a promptable deep learning model that is able to identify objects and create image masks in a zero-shot manner.
The synergy between these tools aids in the segmentation of complex 3D datasets from tomography or other imaging techniques.
arXiv Detail & Related papers (2023-06-14T16:13:27Z) - Segment Anything in 3D with Radiance Fields [83.14130158502493]
This paper generalizes the Segment Anything Model (SAM) to segment 3D objects.
We refer to the proposed solution as SA3D, short for Segment Anything in 3D.
We show in experiments that SA3D adapts to various scenes and achieves 3D segmentation within seconds.
arXiv Detail & Related papers (2023-04-24T17:57:15Z) - Joint-MAE: 2D-3D Joint Masked Autoencoders for 3D Point Cloud Pre-training [65.75399500494343]
Masked Autoencoders (MAE) have shown promising performance in self-supervised learning for 2D and 3D computer vision.
We propose Joint-MAE, a 2D-3D joint MAE framework for self-supervised 3D point cloud pre-training.
arXiv Detail & Related papers (2023-02-27T17:56:18Z) - 3D Instance Segmentation of MVS Buildings [5.2517244720510305]
We present a novel framework for instance segmentation of 3D buildings from Multi-view Stereo (MVS) urban scenes.
The emphasis of this work lies in detecting and segmenting 3D building instances even if they are attached and embedded in a large and imprecise 3D surface model.
arXiv Detail & Related papers (2021-12-18T11:12:38Z)