Where2Explore: Few-shot Affordance Learning for Unseen Novel Categories of Articulated Objects
- URL: http://arxiv.org/abs/2309.07473v2
- Date: Fri, 15 Dec 2023 13:36:46 GMT
- Title: Where2Explore: Few-shot Affordance Learning for Unseen Novel Categories of Articulated Objects
- Authors: Chuanruo Ning, Ruihai Wu, Haoran Lu, Kaichun Mo, Hao Dong
- Abstract summary: 'Where2Explore' is a framework that effectively explores novel categories with minimal interactions on a limited number of instances.
Our framework explicitly estimates the geometric similarity across different categories, identifying local areas that differ from shapes in the training categories for efficient exploration.
- Score: 15.989258402792755
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Articulated object manipulation is a fundamental yet challenging task in
robotics. Due to significant geometric and semantic variations across object
categories, previous manipulation models struggle to generalize to novel
categories. Few-shot learning is a promising solution for alleviating this
issue by allowing robots to perform a few interactions with unseen objects.
However, extant approaches often necessitate costly and inefficient test-time
interactions with each unseen instance. Recognizing this limitation, we observe
that despite their distinct shapes, different categories often share similar
local geometries essential for manipulation, such as pullable handles and
graspable edges - a factor typically underutilized in previous few-shot
learning works. To harness this commonality, we introduce 'Where2Explore', an
affordance learning framework that effectively explores novel categories with
minimal interactions on a limited number of instances. Our framework explicitly
estimates the geometric similarity across different categories, identifying
local areas that differ from shapes in the training categories for efficient
exploration while concurrently transferring affordance knowledge to similar
parts of the objects. Extensive experiments in simulated and real-world
environments demonstrate our framework's capacity for efficient few-shot
exploration and generalization.
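The core mechanism the abstract describes can be pictured with a short sketch. This is a minimal illustration, not the authors' implementation: the per-point local features, the bank of training-category features, and the downstream affordance head are all assumptions here, stubbed with random arrays.

```python
# Similarity-guided exploration, sketched: measure how close each local region
# of a novel object is to geometries seen in training, probe where similarity
# is low, and trust/transfer affordance predictions where it is high.
import numpy as np

def per_point_similarity(point_feats: np.ndarray, train_bank: np.ndarray) -> np.ndarray:
    """Cosine similarity of each local feature to its nearest training feature."""
    a = point_feats / np.linalg.norm(point_feats, axis=1, keepdims=True)
    b = train_bank / np.linalg.norm(train_bank, axis=1, keepdims=True)
    return (a @ b.T).max(axis=1)          # shape (N,), values in [-1, 1]

def select_exploration_points(similarity: np.ndarray, k: int) -> np.ndarray:
    """Interact where local geometry is least like the training categories."""
    return np.argsort(similarity)[:k]     # indices of the k least similar points

# Toy usage: 1024 points with 128-d local features, a bank of 4096 training features.
rng = np.random.default_rng(0)
feats = rng.normal(size=(1024, 128))
bank = rng.normal(size=(4096, 128))
sim = per_point_similarity(feats, bank)
probe_idx = select_exploration_points(sim, k=8)  # points to actually interact with
# High-similarity points would inherit affordance predictions from training;
# low-similarity points are queried with real interactions instead.
```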
Related papers
- Kinematic-aware Prompting for Generalizable Articulated Object Manipulation with LLMs [53.66070434419739]
Generalizable articulated object manipulation is essential for home-assistant robots.
We propose a kinematic-aware prompting framework that prompts Large Language Models with kinematic knowledge of objects to generate low-level motion waypoints.
Our framework outperforms traditional methods on 8 seen categories and shows powerful zero-shot capability on 8 unseen articulated object categories.
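As a rough illustration of the prompting idea above, here is how a kinematic-aware prompt might be assembled; the field names and wording are hypothetical, not the paper's actual template.

```python
# Sketch: encode an object's kinematic structure as text and ask an LLM for
# low-level motion waypoints that respect the joint constraints.
def build_kinematic_prompt(obj_name: str, joints: list[dict], task: str) -> str:
    joint_lines = "\n".join(
        f"- {j['name']}: type={j['type']}, axis={j['axis']}, range={j['range']}"
        for j in joints
    )
    return (
        f"Object: {obj_name}\n"
        f"Kinematic structure:\n{joint_lines}\n"
        f"Task: {task}\n"
        "Output a sequence of low-level end-effector waypoints "
        "(x, y, z, roll, pitch, yaw) that respects the joints above."
    )

prompt = build_kinematic_prompt(
    "cabinet",
    [{"name": "door_hinge", "type": "revolute", "axis": "z", "range": "[0, 1.57] rad"}],
    "open the left door by 90 degrees",
)
# `prompt` would then be sent to any chat-style LLM endpoint.
```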
arXiv Detail & Related papers (2023-11-06T03:26:41Z)
- Endogenous Macrodynamics in Algorithmic Recourse [52.87956177581998]
Existing work on Counterfactual Explanations (CE) and Algorithmic Recourse (AR) has largely focused on single individuals in a static environment.
We show that many of the existing methodologies can be collectively described by a generalized framework.
We then argue that the existing framework does not account for a hidden external cost of recourse that reveals itself only when studying the endogenous dynamics of recourse at the group level.
arXiv Detail & Related papers (2023-08-16T07:36:58Z)
- Grasp Transfer based on Self-Aligning Implicit Representations of Local Surfaces [10.602143478315861]
This work addresses the problem of transferring a grasp experience or a demonstration to a novel object that shares shape similarities with objects the robot has previously encountered.
We employ a single expert grasp demonstration to learn an implicit local surface representation model from a small dataset of object meshes.
At inference time, this model is used to transfer grasps to novel objects by identifying the most geometrically similar surfaces to the one on which the expert grasp is demonstrated.
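The transfer step described above amounts to a nearest-neighbor lookup in an embedding space of local surfaces. A toy sketch, with random placeholder embeddings standing in for the learned implicit representation:

```python
# Pick the candidate surface on the novel object whose embedding is closest
# to the embedding of the surface the expert grasp was demonstrated on.
import numpy as np

def most_similar_surface(demo_emb: np.ndarray, candidate_embs: np.ndarray) -> int:
    """Index of the candidate local surface most similar to the demo surface."""
    d = demo_emb / np.linalg.norm(demo_emb)
    c = candidate_embs / np.linalg.norm(candidate_embs, axis=1, keepdims=True)
    return int(np.argmax(c @ d))

rng = np.random.default_rng(1)
demo = rng.normal(size=64)               # embedding of the demonstrated surface
candidates = rng.normal(size=(200, 64))  # local surfaces on the novel object
target = most_similar_surface(demo, candidates)
# The expert grasp pose would then be aligned to candidate surface `target`.
```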
arXiv Detail & Related papers (2023-08-15T14:33:17Z)
- PartManip: Learning Cross-Category Generalizable Part Manipulation Policy from Point Cloud Observations [12.552149411655355]
We build the first large-scale, part-based cross-category object manipulation benchmark, PartManip.
We train a state-based expert with our proposed part-based canonicalization and part-aware rewards, and then distill the knowledge to a vision-based student.
For cross-category generalization, we introduce domain adversarial learning for domain-invariant feature extraction.
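Domain adversarial learning for domain-invariant features is conventionally built around a gradient-reversal layer; the sketch below shows that generic building block, not PartManip's actual code.

```python
# Gradient reversal: identity in the forward pass, negated gradient in the
# backward pass, so the feature extractor learns to fool a domain classifier.
import torch

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam: float):
        ctx.lam = lam
        return x.view_as(x)              # identity in the forward pass

    @staticmethod
    def backward(ctx, grad_output):
        # Flip the gradient flowing into the feature extractor, which pushes
        # its features toward being indistinguishable across domains/categories.
        return -ctx.lam * grad_output, None

def grad_reverse(x: torch.Tensor, lam: float = 1.0) -> torch.Tensor:
    return GradReverse.apply(x, lam)

# Usage: features -> grad_reverse -> domain classifier.
feats = torch.randn(8, 256, requires_grad=True)
domain_head = torch.nn.Linear(256, 2)
loss = torch.nn.functional.cross_entropy(
    domain_head(grad_reverse(feats)), torch.randint(0, 2, (8,))
)
loss.backward()  # feats.grad now carries the reversed domain gradient
```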
arXiv Detail & Related papers (2023-03-29T18:29:30Z)
- Efficient Representations of Object Geometry for Reinforcement Learning of Interactive Grasping Policies [29.998917158604694]
We present a reinforcement learning framework that learns the interactive grasping of various geometrically distinct real-world objects.
Videos of learned interactive policies are available at https://maltemosbach.org/io/geometry_aware_grasping_policies.
arXiv Detail & Related papers (2022-11-20T11:47:33Z)
- Inferring Versatile Behavior from Demonstrations by Matching Geometric Descriptors [72.62423312645953]
Humans intuitively solve tasks in versatile ways, varying their behavior both in trajectory-level planning and in individual steps.
Current Imitation Learning algorithms often only consider unimodal expert demonstrations and act in a state-action-based setting.
Instead, we combine a mixture of movement primitives with a distribution matching objective to learn versatile behaviors that match the expert's behavior and versatility.
arXiv Detail & Related papers (2022-10-17T16:42:59Z)
- Contrastive Object Detection Using Knowledge Graph Embeddings [72.17159795485915]
We compare the error statistics of the class embeddings learned from a one-hot approach with semantically structured embeddings from natural language processing or knowledge graphs.
We propose a knowledge-embedded design for keypoint-based and transformer-based object detection architectures.
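The core swap this suggests can be sketched briefly: score region features against fixed, semantically structured class embeddings instead of a learned one-hot classifier head. The random embeddings below are placeholders for word-vector or knowledge-graph embeddings, and the head is a generic sketch, not the paper's design.

```python
# Classification head whose class prototypes are frozen semantic embeddings;
# logits are temperature-scaled cosine similarities.
import torch
import torch.nn.functional as F

class SemanticClassHead(torch.nn.Module):
    def __init__(self, feat_dim: int, class_embs: torch.Tensor, temp: float = 0.07):
        super().__init__()
        self.proj = torch.nn.Linear(feat_dim, class_embs.shape[1])
        self.register_buffer("class_embs", F.normalize(class_embs, dim=1))
        self.temp = temp

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        z = F.normalize(self.proj(feats), dim=1)
        return z @ self.class_embs.T / self.temp  # cosine logits per class

head = SemanticClassHead(feat_dim=256, class_embs=torch.randn(80, 300))
logits = head(torch.randn(4, 256))  # (4 proposals, 80 classes)
```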
arXiv Detail & Related papers (2021-12-21T17:10:21Z)
- AdaAfford: Learning to Adapt Manipulation Affordance for 3D Articulated Objects via Few-shot Interactions [13.802675708793014]
Perceiving and interacting with 3D articulated objects, such as cabinets, doors, and faucets, pose particular challenges for future home-assistant robots.
We propose a novel framework, named AdaAfford, that learns to perform very few test-time interactions for quickly adapting the affordance priors to more accurate instance-specific posteriors.
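One way to picture the prior-to-posterior adaptation described above is a kernel-weighted update around each probed point; the update rule and kernel width below are illustrative choices, not AdaAfford's actual method.

```python
# Start from an affordance prior over points, perform a few probing
# interactions, and pull predictions toward observed outcomes nearby.
import numpy as np

def adapt_affordance(points, prior, probes, outcomes, sigma=0.05):
    """points: (N,3); prior: (N,) in [0,1]; probes: (K,3); outcomes: (K,) 0/1."""
    post = prior.copy()
    for p, y in zip(probes, outcomes):
        # Gaussian weight: points near the probe are updated strongly.
        w = np.exp(-np.sum((points - p) ** 2, axis=1) / (2 * sigma**2))
        post = (1 - w) * post + w * y   # blend toward the observed outcome
    return post

rng = np.random.default_rng(2)
pts = rng.uniform(size=(1024, 3))
prior = rng.uniform(size=1024)
posterior = adapt_affordance(pts, prior, probes=pts[:3], outcomes=np.array([1, 0, 1]))
```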
arXiv Detail & Related papers (2021-12-01T03:00:05Z)
- Simultaneous Multi-View Object Recognition and Grasping in Open-Ended Domains [0.0]
We propose a deep learning architecture with augmented memory capacities to handle open-ended object recognition and grasping simultaneously.
We demonstrate the ability of our approach to grasp never-seen-before objects and to rapidly learn new object categories using very few examples on-site in both simulation and real-world settings.
arXiv Detail & Related papers (2021-06-03T14:12:11Z)
- Closing the Generalization Gap in One-Shot Object Detection [92.82028853413516]
We show that the key to strong few-shot detection models may not lie in sophisticated metric learning approaches, but instead in scaling the number of categories.
Future data annotation efforts should therefore focus on wider datasets and annotate a larger number of categories.
arXiv Detail & Related papers (2020-11-09T09:31:17Z)
- Dynamic Semantic Matching and Aggregation Network for Few-shot Intent Detection [69.2370349274216]
Few-shot Intent Detection is challenging due to the scarcity of available annotated utterances.
Semantic components are distilled from utterances via multi-head self-attention.
Our method provides a comprehensive matching measure to enhance representations of both labeled and unlabeled instances.
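As a bare-bones illustration of the self-attention step mentioned above (the dimensions, head count, and pooling are assumptions, not the paper's architecture):

```python
# Distill a pooled "semantic component" representation from an utterance's
# token embeddings via multi-head self-attention.
import torch

tokens = torch.randn(1, 12, 128)                  # (batch, seq, dim) utterance
mhsa = torch.nn.MultiheadAttention(embed_dim=128, num_heads=4, batch_first=True)
attended, weights = mhsa(tokens, tokens, tokens)  # self-attention over tokens
components = attended.mean(dim=1)                 # pooled utterance representation
# `components` would then be matched against labeled instances of each intent.
```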
arXiv Detail & Related papers (2020-10-06T05:16:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.