PartSLIP++: Enhancing Low-Shot 3D Part Segmentation via Multi-View
Instance Segmentation and Maximum Likelihood Estimation
- URL: http://arxiv.org/abs/2312.03015v1
- Date: Tue, 5 Dec 2023 01:33:04 GMT
- Title: PartSLIP++: Enhancing Low-Shot 3D Part Segmentation via Multi-View
Instance Segmentation and Maximum Likelihood Estimation
- Authors: Yuchen Zhou and Jiayuan Gu and Xuanlin Li and Minghua Liu and Yunhao
Fang and Hao Su
- Abstract summary: PartSLIP, a recent advancement, has made significant strides in zero- and few-shot 3D part segmentation.
We introduce PartSLIP++, an enhanced version designed to overcome the limitations of its predecessor.
We show that PartSLIP++ demonstrates better performance than PartSLIP in both low-shot 3D semantic and instance-based object part segmentation tasks.
- Score: 32.2861030554128
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Open-world 3D part segmentation is pivotal in diverse applications such as
robotics and AR/VR. Traditional supervised methods often grapple with limited
3D data availability and struggle to generalize to unseen object categories.
PartSLIP, a recent advancement, has made significant strides in zero- and
few-shot 3D part segmentation. This is achieved by harnessing the capabilities
of the 2D open-vocabulary detection module, GLIP, and introducing a heuristic
method for converting and lifting multi-view 2D bounding box predictions into
3D segmentation masks. In this paper, we introduce PartSLIP++, an enhanced
version designed to overcome the limitations of its predecessor. Our approach
incorporates two major improvements. First, we utilize a pre-trained 2D
segmentation model, SAM, to produce pixel-wise 2D segmentations, yielding more
precise and accurate annotations than the 2D bounding boxes used in PartSLIP.
Second, PartSLIP++ replaces the heuristic 3D conversion process with an
innovative modified Expectation-Maximization algorithm. This algorithm
treats the 3D instance segmentations as unobserved latent variables and iteratively
refines them by alternating 2D-3D matching with gradient-descent optimization.
Through extensive evaluations, we show that PartSLIP++ demonstrates better
performance than PartSLIP in both low-shot 3D semantic and instance-based object
part segmentation tasks. Code released at
https://github.com/zyc00/PartSLIP2.
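
The first improvement, prompting SAM with GLIP's multi-view box detections to obtain pixel-wise part masks, can be illustrated with a short sketch. It uses only the public segment-anything API; the function name boxes_to_part_masks, the checkpoint path, and the box layout are illustrative assumptions, not the released PartSLIP++ code.

```python
# Minimal sketch: turn GLIP 2D box detections for one rendered view into
# pixel-wise SAM masks. Assumes the public `segment_anything` package;
# names and shapes here are illustrative, not the released implementation.
import numpy as np
from segment_anything import SamPredictor, sam_model_registry

def boxes_to_part_masks(image: np.ndarray, glip_boxes: np.ndarray,
                        sam_checkpoint: str = "sam_vit_h.pth") -> np.ndarray:
    """image: HxWx3 uint8 rendering; glip_boxes: Mx4 array of [x0, y0, x1, y1]."""
    sam = sam_model_registry["vit_h"](checkpoint=sam_checkpoint)
    predictor = SamPredictor(sam)
    predictor.set_image(image)                 # encode the view once
    masks = []
    for box in glip_boxes:
        # One binary mask per GLIP box; multimask_output=False keeps the best one.
        m, _, _ = predictor.predict(box=box, multimask_output=False)
        masks.append(m[0])                     # HxW boolean mask
    return (np.stack(masks) if masks
            else np.zeros((0, *image.shape[:2]), dtype=bool))
```

Compared with keeping the raw boxes, these per-pixel masks give tighter supervision when detections from different views are fused in 3D.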
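The second improvement, the modified EM procedure that treats per-point 3D instance assignments as latent variables, could look roughly like the sketch below. The Hungarian matching on IoU, the binary cross-entropy objective, and all hyperparameters are assumptions made for illustration; the actual matching rule and loss are defined in the paper and repository.

```python
# Rough EM-style sketch of lifting multi-view 2D masks to 3D instance labels.
# E-step: match projected 3D instances to SAM masks per view (Hungarian on IoU).
# M-step: gradient descent on per-point instance logits against the matched masks.
# Shapes, losses, and hyperparameters are illustrative assumptions.
import torch
from scipy.optimize import linear_sum_assignment

def em_lift(point_logits: torch.Tensor, views: list,
            n_iters: int = 50, lr: float = 0.1) -> torch.Tensor:
    """point_logits: N_points x K latent instance logits.
    views: each a dict with 'pix_idx' (indices of the P visible points) and
    'masks' (M x P binary SAM mask values sampled at those points)."""
    point_logits = point_logits.clone().requires_grad_(True)
    opt = torch.optim.Adam([point_logits], lr=lr)
    for _ in range(n_iters):
        loss = 0.0
        for view in views:
            assign = point_logits.softmax(-1)[view["pix_idx"]]   # P x K soft masks
            masks = view["masks"].float()                        # M x P targets
            inter = masks @ assign                               # M x K overlap
            iou = inter / (masks.sum(1, keepdim=True)
                           + assign.sum(0, keepdim=True) - inter + 1e-6)
            rows, cols = linear_sum_assignment(-iou.detach().cpu().numpy())
            # Pull each matched soft 3D mask toward its 2D target.
            pred = assign[:, torch.as_tensor(cols)].T.clamp(1e-6, 1 - 1e-6)
            loss = loss + torch.nn.functional.binary_cross_entropy(
                pred, masks[torch.as_tensor(rows)])
        opt.zero_grad()
        loss.backward()
        opt.step()
    return point_logits.argmax(-1)             # hard 3D instance label per point
```

In this sketch, the masks produced per view (e.g., by a function like boxes_to_part_masks above) are sampled at the pixels of the visible points, packed into the views list, and em_lift returns one instance id per point.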
Related papers
- SA3DIP: Segment Any 3D Instance with Potential 3D Priors [41.907914881608995]
We propose SA3DIP, a novel method for Segmenting Any 3D Instances via exploiting potential 3D Priors.
Specifically, on one hand, we generate complementary 3D primitives based on both geometric and textural priors.
On the other hand, we introduce supplemental constraints from the 3D space by using a 3D detector to guide a further merging process.
arXiv Detail & Related papers (2024-11-06T10:39:00Z)
- 3x2: 3D Object Part Segmentation by 2D Semantic Correspondences [33.99493183183571]
We propose to leverage a few annotated 3D shapes or richly annotated 2D datasets to perform 3D object part segmentation.
We present our novel approach, termed 3-By-2 that achieves SOTA performance on different benchmarks with various granularity levels.
arXiv Detail & Related papers (2024-07-12T19:08:00Z)
- Segment3D: Learning Fine-Grained Class-Agnostic 3D Segmentation without Manual Labels [141.23836433191624]
Current 3D scene segmentation methods are heavily dependent on manually annotated 3D training datasets.
We propose Segment3D, a method for class-agnostic 3D scene segmentation that produces high-quality 3D segmentation masks.
arXiv Detail & Related papers (2023-12-28T18:57:11Z)
- SAI3D: Segment Any Instance in 3D Scenes [68.57002591841034]
We introduce SAI3D, a novel zero-shot 3D instance segmentation approach.
Our method partitions a 3D scene into geometric primitives, which are then progressively merged into 3D instance segmentations.
Empirical evaluations on ScanNet, Matterport3D and the more challenging ScanNet++ datasets demonstrate the superiority of our approach.
arXiv Detail & Related papers (2023-12-17T09:05:47Z)
- SAM-guided Graph Cut for 3D Instance Segmentation [60.75119991853605]
This paper addresses the challenge of 3D instance segmentation by simultaneously leveraging 3D geometric and multi-view image information.
We introduce a novel 3D-to-2D query framework to effectively exploit 2D segmentation models for 3D instance segmentation.
Our method achieves robust segmentation performance and can generalize across different types of scenes.
arXiv Detail & Related papers (2023-12-13T18:59:58Z)
- Segment Any 3D Gaussians [85.93694310363325]
This paper presents SAGA, a highly efficient 3D promptable segmentation method based on 3D Gaussian Splatting (3D-GS).
Given 2D visual prompts as input, SAGA can segment the corresponding 3D target represented by 3D Gaussians within 4 ms.
We show that SAGA achieves real-time multi-granularity segmentation with quality comparable to state-of-the-art methods.
arXiv Detail & Related papers (2023-12-01T17:15:24Z)
- A One Stop 3D Target Reconstruction and multilevel Segmentation Method [0.0]
We propose an open-source, one-stop 3D target reconstruction and multilevel segmentation framework (OSTRA).
OSTRA performs segmentation on 2D images, tracks multiple instances with segmentation labels in the image sequence, and then reconstructs labelled 3D objects or multiple parts with Multi-View Stereo (MVS) or RGBD-based 3D reconstruction methods.
Our method opens up a new avenue for reconstructing 3D targets embedded with rich multi-scale segmentation information in complex scenes.
arXiv Detail & Related papers (2023-08-14T07:12:31Z)
- PartSLIP: Low-Shot Part Segmentation for 3D Point Clouds via Pretrained Image-Language Models [56.324516906160234]
Generalizable 3D part segmentation is important but challenging in vision and robotics.
This paper explores an alternative way for low-shot part segmentation of 3D point clouds by leveraging a pretrained image-language model, GLIP.
We transfer the rich knowledge from 2D to 3D through GLIP-based part detection on point cloud rendering and a novel 2D-to-3D label lifting algorithm.
arXiv Detail & Related papers (2022-12-03T06:59:01Z)
- MvDeCor: Multi-view Dense Correspondence Learning for Fine-grained 3D Segmentation [91.6658845016214]
We propose to utilize self-supervised techniques in the 2D domain for fine-grained 3D shape segmentation tasks.
We render a 3D shape from multiple views, and set up a dense correspondence learning task within the contrastive learning framework.
As a result, the learned 2D representations are view-invariant and geometrically consistent.
arXiv Detail & Related papers (2022-08-18T00:48:15Z)