Segment Any Mesh: Zero-shot Mesh Part Segmentation via Lifting Segment Anything 2 to 3D
- URL: http://arxiv.org/abs/2408.13679v1
- Date: Sat, 24 Aug 2024 22:05:04 GMT
- Title: Segment Any Mesh: Zero-shot Mesh Part Segmentation via Lifting Segment Anything 2 to 3D
- Authors: George Tang, William Zhao, Logan Ford, David Benhaim, Paul Zhang
- Abstract summary: We propose Segment Any Mesh (SAMesh), a novel zero-shot method for mesh part segmentation.
SAMesh operates in two phases: multimodal rendering and 2D-to-3D lifting.
We compare our method with a robust, well-evaluated shape analysis method, the Shape Diameter Function (ShapeDiam), and show that our method matches or exceeds its performance.
- Score: 1.6427658855248815
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose Segment Any Mesh (SAMesh), a novel zero-shot method for mesh part segmentation that overcomes the limitations of shape analysis-based, learning-based, and current zero-shot approaches. SAMesh operates in two phases: multimodal rendering and 2D-to-3D lifting. In the first phase, multiview renders of the mesh are individually processed through Segment Anything 2 (SAM2) to generate 2D masks. These masks are then lifted into a mesh part segmentation by associating masks that refer to the same mesh part across the multiview renders. We find that applying SAM2 to multimodal feature renders of normals and shape diameter scalars achieves better results than using only untextured renders of meshes. By building our method on top of SAM2, we seamlessly inherit any future improvements made to 2D segmentation. We compare our method with a robust, well-evaluated shape analysis method, Shape Diameter Function (ShapeDiam), and show our method is comparable to or exceeds its performance. Since current benchmarks contain limited object diversity, we also curate and release a dataset of generated meshes and use it to demonstrate our method's improved generalization over ShapeDiam via human evaluation. We release the code and dataset at https://github.com/gtangg12/samesh
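To make the pipeline concrete, here is a minimal sketch of the 2D-to-3D lifting phase. It assumes each view has already been run through SAM2 (yielding 2D masks) and rendered to a face-ID map in which every pixel stores the index of the visible mesh face; the function `lift_masks_to_faces` and its greedy IoU-based association are illustrative assumptions, not the released SAMesh code.

```python
import numpy as np

def lift_masks_to_faces(face_id_maps, mask_sets, num_faces, iou_thresh=0.5):
    """Lift per-view 2D masks to a per-face mesh segmentation (illustrative sketch).

    face_id_maps: list of (H, W) int arrays; each pixel holds the visible
                  face index, or -1 for background.
    mask_sets:    list of lists of (H, W) bool arrays (SAM2 masks per view).
    """
    # 1) Convert each 2D mask into the set of mesh faces it covers.
    face_sets = []
    for face_ids, masks in zip(face_id_maps, mask_sets):
        for mask in masks:
            faces = np.unique(face_ids[mask])
            faces = faces[faces >= 0]
            if faces.size:
                face_sets.append(set(faces.tolist()))

    # 2) Greedily associate masks across views: merge two face sets
    #    when their intersection-over-union exceeds a threshold.
    merged = []
    for fs in face_sets:
        for group in merged:
            iou = len(group & fs) / len(group | fs)
            if iou > iou_thresh:
                group |= fs
                break
        else:
            merged.append(set(fs))

    # 3) Each face takes the label of the group that claims it.
    labels = np.full(num_faces, -1, dtype=int)
    for label, group in enumerate(merged):
        labels[list(group)] = label
    return labels
```

In the full method the views would be the multimodal feature renders of normals and shape diameter scalars described above, and the association step is what reconciles masks that refer to the same part across renders.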
Related papers
- LUDVIG: Learning-free Uplifting of 2D Visual features to Gaussian Splatting scenes [39.687526103092445]
We show that a simple yet effective aggregation technique for uplifting 2D visual features into 3D Gaussian Splatting scenes yields excellent results.
We extend this method to generic DINOv2 features, integrating 3D scene geometry through graph diffusion, and achieve competitive segmentation results.
arXiv Detail & Related papers (2024-10-18T13:44:29Z)
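A hedged sketch of the kind of learning-free aggregation LUDVIG describes: each Gaussian averages the 2D features of the pixels it contributes to, weighted by its rendering weight. The exact weighting scheme and the helper `aggregate_features` are assumptions for illustration, not the paper's formulation.

```python
import numpy as np

def aggregate_features(pixel_feats, gauss_pixel_weights, feat_dim):
    """Average 2D pixel features onto 3D Gaussians (illustrative sketch).

    pixel_feats:         (V, H, W, D) per-view 2D feature maps (e.g. DINOv2).
    gauss_pixel_weights: list over Gaussians of [(view, y, x, w), ...] giving
                         each Gaussian's rendering weight at the pixels it hits.
    """
    num_gauss = len(gauss_pixel_weights)
    feats = np.zeros((num_gauss, feat_dim))
    totals = np.zeros(num_gauss)
    for g, hits in enumerate(gauss_pixel_weights):
        for view, y, x, w in hits:
            feats[g] += w * pixel_feats[view, y, x]
            totals[g] += w
    # Normalize by total weight; unseen Gaussians keep zero features.
    return feats / np.maximum(totals[:, None], 1e-8)
```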
- MeshSegmenter: Zero-Shot Mesh Semantic Segmentation via Texture Synthesis [27.703204488877038]
MeshSegmenter is a framework designed for zero-shot 3D semantic segmentation.
It delivers accurate 3D segmentation across diverse meshes and segment descriptions.
arXiv Detail & Related papers (2024-07-18T16:50:59Z)
- SAI3D: Segment Any Instance in 3D Scenes [68.57002591841034]
We introduce SAI3D, a novel zero-shot 3D instance segmentation approach.
Our method partitions a 3D scene into geometric primitives, which are then progressively merged into 3D instance segmentations.
Empirical evaluations on ScanNet, Matterport3D and the more challenging ScanNet++ datasets demonstrate the superiority of our approach.
arXiv Detail & Related papers (2023-12-17T09:05:47Z)
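The progressive merging SAI3D describes resembles greedy region growing over primitives. The sketch below assumes a precomputed pairwise affinity (e.g. how often two primitives land in the same 2D mask); `merge_primitives` and its threshold are illustrative, not the paper's actual criterion.

```python
def merge_primitives(affinity, num_prims, thresh=0.9):
    """Greedily merge geometric primitives into instances (illustrative sketch).

    affinity: dict mapping a primitive pair (i, j) to a score in [0, 1].
    """
    # Union-find over primitives; merge the strongest pairs first.
    parent = list(range(num_prims))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for (i, j), score in sorted(affinity.items(), key=lambda kv: -kv[1]):
        if score < thresh:
            break
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj  # merge the two primitive groups

    # Each primitive's root is its instance ID.
    return [find(i) for i in range(num_prims)]
```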
- Propagating Semantic Labels in Video Data [0.0]
This work presents a method for performing segmentation of objects in video.
Once an object has been found in a frame of video, the segment can then be propagated to future frames.
The method works by combining SAM with Structure from Motion.
arXiv Detail & Related papers (2023-10-01T20:32:26Z)
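One plausible reading of the SAM-plus-Structure-from-Motion combination: take the SfM points that fall inside a mask in one frame, then reproject them into a later frame to seed the label there. This pinhole-projection sketch, including the helper `propagate_mask`, is an assumption about the mechanism rather than the paper's exact pipeline.

```python
import numpy as np

def propagate_mask(points3d, in_mask, K, R, t, image_size):
    """Reproject a segmented object's 3D points into a new frame
    (illustrative sketch assuming a calibrated pinhole camera).

    points3d: (N, 3) SfM points; in_mask: (N,) bool, True if the point fell
    inside the object's SAM mask in the source frame.
    K: (3, 3) intrinsics; R, t: rotation/translation of the target frame.
    """
    pts = points3d[in_mask]
    cam = pts @ R.T + t              # world -> camera coordinates
    cam = cam[cam[:, 2] > 0]         # keep points in front of the camera
    uv = cam @ K.T
    uv = uv[:, :2] / uv[:, 2:3]      # perspective divide

    w, h = image_size
    valid = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    return uv[valid]                 # pixel locations to label in the new frame
```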
- Semantic-SAM: Segment and Recognize Anything at Any Granularity [83.64686655044765]
We introduce Semantic-SAM, a universal image segmentation model that can segment and recognize anything at any desired granularity.
We consolidate multiple datasets across three granularities and introduce decoupled classification for objects and parts.
For the multi-granularity capability, we propose a multi-choice learning scheme during training, enabling each click to generate masks at multiple levels.
arXiv Detail & Related papers (2023-07-10T17:59:40Z)
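The multi-choice scheme can be pictured as each click emitting several candidate masks that are matched against ground-truth masks at different granularities. The toy objective below, with its soft-IoU cost and min-matching, is a guess at the flavor of such a loss, not Semantic-SAM's actual training objective.

```python
import numpy as np

def multi_choice_loss(pred_masks, gt_masks):
    """Match K predicted masks per click to ground-truth masks at multiple
    granularities (illustrative sketch).

    pred_masks: (K, H, W) float in [0, 1]; gt_masks: (G, H, W) bool,
    e.g. object-level and part-level masks for the same click.
    """
    eps = 1e-8
    losses = np.zeros((len(pred_masks), len(gt_masks)))
    for i, p in enumerate(pred_masks):
        for j, g in enumerate(gt_masks):
            inter = (p * g).sum()
            union = p.sum() + g.sum() - inter
            losses[i, j] = 1.0 - (inter + eps) / (union + eps)  # soft-IoU loss

    # Each ground-truth granularity is explained by its best-matching
    # prediction, so different heads can specialize to different levels.
    return losses.min(axis=0).mean()
```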
- SAM3D: Segment Anything in 3D Scenes [33.57040455422537]
We propose a novel framework that predicts masks in 3D point clouds by leveraging the Segment Anything Model (SAM) on RGB images, without further training or finetuning.
For a point cloud of a 3D scene with posed RGB images, we first predict segmentation masks of RGB images with SAM, and then project the 2D masks into the 3D points.
We evaluate our approach on the ScanNet dataset; qualitative results demonstrate that SAM3D achieves reasonable and fine-grained 3D segmentation without any training or finetuning.
arXiv Detail & Related papers (2023-06-06T17:59:51Z)
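Projecting 2D masks onto 3D points is the one step the summary spells out, and it reduces to a pinhole projection plus a per-point mask lookup. A minimal sketch, with the hypothetical helper `label_points_from_mask`:

```python
import numpy as np

def label_points_from_mask(points3d, mask, K, R, t):
    """Assign each 3D point the 2D mask ID of the pixel it projects to
    (illustrative sketch assuming a posed pinhole camera).

    points3d: (N, 3); mask: (H, W) int array of SAM mask IDs (-1 = none).
    """
    cam = points3d @ R.T + t                       # world -> camera
    uv = cam @ K.T
    uv = uv[:, :2] / np.maximum(uv[:, 2:3], 1e-8)  # perspective divide
    u = np.round(uv[:, 0]).astype(int)
    v = np.round(uv[:, 1]).astype(int)

    h, w = mask.shape
    labels = np.full(len(points3d), -1, dtype=int)
    visible = (cam[:, 2] > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    labels[visible] = mask[v[visible], u[visible]]
    return labels  # per-point mask IDs; merging across frames comes next
```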
- Segment Anything in 3D with Radiance Fields [83.14130158502493]
This paper generalizes the Segment Anything Model (SAM) to segment 3D objects.
We refer to the proposed solution as SA3D, short for Segment Anything in 3D.
We show in experiments that SA3D adapts to various scenes and achieves 3D segmentation within seconds.
arXiv Detail & Related papers (2023-04-24T17:57:15Z)
- Mask3D: Mask Transformer for 3D Semantic Instance Segmentation [89.41640045953378]
We show that we can leverage generic Transformer building blocks to directly predict instance masks from 3D point clouds.
Using Transformer decoders, the instance queries are learned by iteratively attending to point cloud features at multiple scales.
Mask3D sets a new state-of-the-art on ScanNet test (+6.2 mAP), S3DIS 6-fold (+10.1 mAP), STPLS3D (+11.2 mAP) and ScanNet200 test (+12.4 mAP).
arXiv Detail & Related papers (2022-10-06T17:55:09Z)
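The query-based decoding can be sketched with standard Transformer blocks: a fixed set of instance queries cross-attends to point features scale by scale, and each refined query is dotted against the point features to produce a soft mask. This PyTorch sketch mirrors the general mechanism only; it is not Mask3D's actual architecture.

```python
import torch
import torch.nn as nn

class QueryMaskDecoder(nn.Module):
    """Instance queries attend to multi-scale point features (illustrative sketch)."""

    def __init__(self, dim=256, num_queries=100, num_heads=8):
        super().__init__()
        self.queries = nn.Embedding(num_queries, dim)
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(dim, dim * 4), nn.ReLU(),
                                 nn.Linear(dim * 4, dim))

    def forward(self, point_feats_per_scale):
        # point_feats_per_scale: list of (B, N_s, dim) features, coarse to fine.
        batch = point_feats_per_scale[0].size(0)
        q = self.queries.weight.unsqueeze(0).expand(batch, -1, -1)
        for feats in point_feats_per_scale:
            # Queries iteratively refine by cross-attending to each scale.
            attn_out, _ = self.cross_attn(q, feats, feats)
            q = q + attn_out
            q = q + self.ffn(q)
        # Dot product between queries and the finest point features yields
        # one (soft) instance mask per query.
        masks = torch.einsum('bqd,bnd->bqn', q, point_feats_per_scale[-1])
        return masks.sigmoid()
```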
- RefineMask: Towards High-Quality Instance Segmentation with Fine-Grained Features [53.71163467683838]
RefineMask is a new method for high-quality instance segmentation of objects and scenes.
It incorporates fine-grained features during the instance-wise segmenting process in a multi-stage manner.
It succeeds in segmenting hard cases such as bent parts of objects that are over-smoothed by most previous methods.
arXiv Detail & Related papers (2021-04-17T15:09:20Z)
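Multi-stage use of fine-grained features can be pictured as repeatedly upsampling a coarse instance mask and fusing it with higher-resolution backbone features. The module below is a generic stand-in for that idea, not RefineMask's actual components.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskRefineStage(nn.Module):
    """One refinement stage: upsample the coarse mask, fuse fine features (sketch)."""

    def __init__(self, feat_dim=64):
        super().__init__()
        self.fuse = nn.Conv2d(feat_dim + 1, feat_dim, kernel_size=3, padding=1)
        self.predict = nn.Conv2d(feat_dim, 1, kernel_size=1)

    def forward(self, coarse_mask, fine_feats):
        # coarse_mask: (B, 1, h, w); fine_feats: (B, C, 2h, 2w) from the backbone.
        up = F.interpolate(coarse_mask, scale_factor=2, mode='bilinear',
                           align_corners=False)
        fused = F.relu(self.fuse(torch.cat([fine_feats, up], dim=1)))
        return self.predict(fused)  # higher-resolution mask logits

# Stacking a few stages turns a low-res ROI mask into a crisp high-res mask:
# mask = stage3(stage2(stage1(mask, f1), f2), f3)
```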
- AutoSweep: Recovering 3D Editable Objects from a Single Photograph [54.701098964773756]
We aim to recover 3D objects that have semantic parts and can be directly edited.
Our work attempts to recover two types of primitive-shaped objects: generalized cuboids and generalized cylinders.
Our algorithm recovers high-quality 3D models and outperforms existing methods in both instance segmentation and 3D reconstruction.
arXiv Detail & Related papers (2020-05-27T12:16:24Z)