Generalizable Articulated Object Perception with Superpoints
- URL: http://arxiv.org/abs/2412.16656v1
- Date: Sat, 21 Dec 2024 14:57:24 GMT
- Title: Generalizable Articulated Object Perception with Superpoints
- Authors: Qiaojun Yu, Ce Hao, Xibin Yuan, Li Zhang, Liu Liu, Yukang Huo, Rohit Agarwal, Cewu Lu
- Abstract summary: We introduce a novel superpoint-based perception method designed to improve part segmentation in 3D point clouds of articulated objects.
We propose a learnable, part-aware superpoint generation technique that efficiently groups points based on their geometric and semantic similarities.
Experimental results on the GAPartNet dataset show that our method outperforms existing state-of-the-art approaches in cross-category part segmentation.
- Score: 42.52926364769424
- License:
- Abstract: Manipulating articulated objects with robotic arms is challenging due to the complex kinematic structure, which requires precise part segmentation for efficient manipulation. In this work, we introduce a novel superpoint-based perception method designed to improve part segmentation in 3D point clouds of articulated objects. We propose a learnable, part-aware superpoint generation technique that efficiently groups points based on their geometric and semantic similarities, resulting in clearer part boundaries. Furthermore, by leveraging the segmentation capabilities of the 2D foundation model SAM, we identify the centers of pixel regions and select corresponding superpoints as candidate query points. Integrating a query-based transformer decoder further enhances our method's ability to achieve precise part segmentation. Experimental results on the GAPartNet dataset show that our method outperforms existing state-of-the-art approaches in cross-category part segmentation, achieving AP50 scores of 77.9% for seen categories (4.4% improvement) and 39.3% for unseen categories (11.6% improvement), with superior results in 5 out of 9 part categories for seen objects and outperforming all previous methods across all part categories for unseen objects.
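The query-selection step described in the abstract (find the center of each SAM pixel region, then pick the corresponding superpoint) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that every 3D point already has a projected pixel coordinate and a superpoint id, that SAM masks are given as 2D boolean grids, and that "corresponding" means the superpoint of the point projected nearest to the mask centroid. All function and parameter names are hypothetical.

```python
from collections import defaultdict


def mask_center(mask):
    """Return the centroid (row, col) of the True pixels in a 2D boolean grid."""
    row_sum = col_sum = count = 0
    for r, row in enumerate(mask):
        for c, on in enumerate(row):
            if on:
                row_sum += r
                col_sum += c
                count += 1
    if count == 0:
        return None  # empty mask: no center
    return (row_sum / count, col_sum / count)


def select_query_superpoints(masks, point_pixels, superpoint_ids):
    """Pick one candidate superpoint per mask (assumed rule: nearest projected point)."""
    selected = []
    for mask in masks:
        center = mask_center(mask)
        if center is None:
            continue
        nearest = min(
            range(len(point_pixels)),
            key=lambda i: (point_pixels[i][0] - center[0]) ** 2
            + (point_pixels[i][1] - center[1]) ** 2,
        )
        sp = superpoint_ids[nearest]
        if sp not in selected:  # keep each superpoint at most once
            selected.append(sp)
    return selected


def superpoint_features(point_features, superpoint_ids):
    """Mean-pool per-point feature vectors into one vector per superpoint."""
    sums = {}
    counts = defaultdict(int)
    for feat, sp in zip(point_features, superpoint_ids):
        if sp not in sums:
            sums[sp] = [0.0] * len(feat)
        sums[sp] = [s + f for s, f in zip(sums[sp], feat)]
        counts[sp] += 1
    return {sp: [s / counts[sp] for s in total] for sp, total in sums.items()}


# Toy example: two square masks on a 4x4 image and four projected points.
mask_a = [[r < 2 and c < 2 for c in range(4)] for r in range(4)]
mask_b = [[r >= 2 and c >= 2 for c in range(4)] for r in range(4)]
pixels = [(0, 0), (1, 0), (3, 3), (0, 3)]
sp_ids = [5, 5, 7, 9]
queries = select_query_superpoints([mask_a, mask_b], pixels, sp_ids)
pooled = superpoint_features([[1.0, 2.0], [3.0, 4.0], [0.0, 0.0], [2.0, 2.0]], sp_ids)
```

Note that for a non-convex mask the centroid can fall outside the region, so a practical system would likely need a more robust notion of "center" than the plain average used here.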
Related papers
- From Pixels to Objects: A Hierarchical Approach for Part and Object Segmentation Using Local and Global Aggregation [24.51617545483278]
We introduce a hierarchical transformer-based model designed for sophisticated image segmentation tasks.
At the heart of our approach is a multi-level representation strategy, which systematically advances from individual pixels to superpixels.
This architecture is underpinned by two pivotal aggregation strategies: local aggregation and global aggregation.
arXiv Detail & Related papers (2024-09-02T16:13:26Z)
- Multi-body SE(3) Equivariance for Unsupervised Rigid Segmentation and Motion Estimation [49.56131393810713]
We present an SE(3) equivariant architecture and a training strategy to tackle this task in an unsupervised manner.
Our method excels in both model performance and computational efficiency, with only 0.25M parameters and 0.92G FLOPs.
arXiv Detail & Related papers (2023-06-08T22:55:32Z)
- Towards Open-World Segmentation of Parts [16.056921233445784]
We propose to explore a class-agnostic part segmentation task.
We argue that models trained without part classes can better localize parts and segment them on objects unseen in training.
We show notable and consistent gains by our approach, essentially a critical step towards open-world part segmentation.
arXiv Detail & Related papers (2023-05-26T10:34:58Z)
- Semi-Weakly Supervised Object Kinematic Motion Prediction [56.282759127180306]
Given a 3D object, kinematic motion prediction aims to identify the mobile parts as well as the corresponding motion parameters.
We propose a graph neural network to learn the map between hierarchical part-level segmentation and mobile parts parameters.
The network predictions yield a large set of 3D objects with pseudo-labeled mobility information.
arXiv Detail & Related papers (2023-03-31T02:37:36Z)
- GAPartNet: Cross-Category Domain-Generalizable Object Perception and Manipulation via Generalizable and Actionable Parts [28.922958261132475]
We learn cross-category skills via Generalizable and Actionable Parts (GAParts).
Based on GAPartNet, we investigate three cross-category tasks: part segmentation, part pose estimation, and part-based object manipulation.
Our method outperforms all existing methods by a large margin on both seen and unseen categories.
arXiv Detail & Related papers (2022-11-10T00:30:22Z)
- Part-level Action Parsing via a Pose-guided Coarse-to-Fine Framework [108.70949305791201]
Part-level Action Parsing (PAP) aims to not only predict the video-level action but also recognize the frame-level fine-grained actions or interactions of body parts for each person in the video.
In particular, our framework first predicts the video-level class of the input video, then localizes the body parts and predicts the part-level action.
Our framework achieves state-of-the-art performance and outperforms existing methods over a 31.10% ROC score.
arXiv Detail & Related papers (2022-03-09T01:30:57Z)
- 3D Compositional Zero-shot Learning with DeCompositional Consensus [102.7571947144639]
We argue that part knowledge should be composable beyond the observed object classes.
We present 3D Compositional Zero-shot Learning as a problem of part generalization from seen to unseen object classes.
arXiv Detail & Related papers (2021-11-29T16:34:53Z)
- LRGNet: Learnable Region Growing for Class-Agnostic Point Cloud Segmentation [19.915593390338337]
This research proposes a learnable region growing method for class-agnostic point cloud segmentation.
The proposed method is able to segment any class of objects using a single deep neural network without any assumptions about their shapes and sizes.
arXiv Detail & Related papers (2021-03-16T15:58:01Z)
- Interpretable and Accurate Fine-grained Recognition via Region Grouping [14.28113520947247]
We present an interpretable deep model for fine-grained visual recognition.
At the core of our method lies the integration of region-based part discovery and attribution within a deep neural network.
Our results compare favorably to state-of-the-art methods on classification tasks.
arXiv Detail & Related papers (2020-05-21T01:18:26Z)
- PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation [111.7241018610573]
We present PointGroup, a new end-to-end bottom-up architecture for instance segmentation.
We design a two-branch network to extract point features and predict semantic labels and offsets, for shifting each point towards its respective instance centroid.
A clustering component follows, using both the original and offset-shifted point coordinate sets to take advantage of their complementary strengths.
We conduct extensive experiments on two challenging datasets, ScanNet v2 and S3DIS, on which our method achieves the highest performance, 63.6% and 64.0%, compared to 54.9% and 54.4% achieved by former best
arXiv Detail & Related papers (2020-04-03T16:26:37Z)
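The dual-set grouping idea summarized in the PointGroup entry above (shift each point by a predicted offset toward its instance centroid, then cluster on both coordinate sets) can be sketched as follows. This is an illustrative toy, not the PointGroup code: the offsets are hand-written rather than predicted by a network, the clustering is a brute-force radius search with breadth-first expansion, and all names are hypothetical.

```python
from collections import deque


def shift_points(points, offsets):
    """Add each (assumed, normally network-predicted) offset to its point."""
    return [tuple(p + o for p, o in zip(pt, off)) for pt, off in zip(points, offsets)]


def cluster(points, labels, radius):
    """Group same-label points that are linked by neighbors within `radius` (BFS)."""
    r2 = radius * radius
    visited = [False] * len(points)
    clusters = []
    for seed in range(len(points)):
        if visited[seed]:
            continue
        visited[seed] = True
        queue = deque([seed])
        members = [seed]
        while queue:
            i = queue.popleft()
            for j in range(len(points)):
                if visited[j] or labels[j] != labels[i]:
                    continue
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                if d2 <= r2:
                    visited[j] = True
                    queue.append(j)
                    members.append(j)
        clusters.append(sorted(members))
    return clusters


# Toy scene: four evenly spaced points that belong to two instances.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
labels = [0, 0, 0, 0]  # same semantic class everywhere
offsets = [(0.4, 0.0, 0.0), (-0.4, 0.0, 0.0), (0.4, 0.0, 0.0), (-0.4, 0.0, 0.0)]

original_clusters = cluster(points, labels, radius=1.0)
shifted_clusters = cluster(shift_points(points, offsets), labels, radius=1.0)
```

In this toy scene the original coordinates chain all four points into one cluster, while the shifted coordinates separate the two instances. That contrast is one reason for clustering on both sets.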
This list is automatically generated from the titles and abstracts of the papers in this site.