AdaAfford: Learning to Adapt Manipulation Affordance for 3D Articulated
Objects via Few-shot Interactions
- URL: http://arxiv.org/abs/2112.00246v6
- Date: Thu, 4 May 2023 14:47:16 GMT
- Authors: Yian Wang, Ruihai Wu, Kaichun Mo, Jiaqi Ke, Qingnan Fan, Leonidas
Guibas, Hao Dong
- Abstract summary: Perceiving and interacting with 3D articulated objects, such as cabinets, doors, and faucets, pose particular challenges for future home-assistant robots.
We propose a novel framework, named AdaAfford, that learns to perform very few test-time interactions for quickly adapting the affordance priors to more accurate instance-specific posteriors.
- Score: 13.802675708793014
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Perceiving and interacting with 3D articulated objects, such as cabinets,
doors, and faucets, pose particular challenges for future home-assistant robots
performing daily tasks in human environments. Besides parsing the articulated
parts and joint parameters, researchers recently advocate learning manipulation
affordance over the input shape geometry which is more task-aware and
geometrically fine-grained. However, taking only passive observations as
inputs, these methods ignore many hidden but important kinematic constraints
(e.g., joint location and limits) and dynamic factors (e.g., joint friction and
restitution), therefore losing significant accuracy for test cases with such
uncertainties. In this paper, we propose a novel framework, named AdaAfford,
that learns to perform very few test-time interactions for quickly adapting the
affordance priors to more accurate instance-specific posteriors. We conduct
large-scale experiments using the PartNet-Mobility dataset and prove that our
system performs better than baselines.
Related papers
- Deep Learning-Based Object Pose Estimation: A Comprehensive Survey [73.74933379151419]
We discuss the recent advances in deep learning-based object pose estimation.
Our survey also covers multiple input data modalities, degrees-of-freedom of output poses, object properties, and downstream tasks.
arXiv Detail & Related papers (2024-05-13T14:44:22Z)
- Ins-HOI: Instance Aware Human-Object Interactions Recovery [44.02128629239429]
We propose an end-to-end Instance-aware Human-Object Interactions recovery (Ins-HOI) framework.
Ins-HOI supports instance-level reconstruction and provides reasonable and realistic invisible contact surfaces.
We collect a large-scale, high-fidelity 3D scan dataset, including 5.2k high-quality scans with real-world human-chair and hand-object interactions.
arXiv Detail & Related papers (2023-12-15T09:30:47Z)
- Where2Explore: Few-shot Affordance Learning for Unseen Novel Categories
of Articulated Objects [15.989258402792755]
'Where2Explore' is a framework that effectively explores novel categories with minimal interactions on a limited number of instances.
Our framework explicitly estimates the geometric similarity across different categories, identifying local areas that differ from shapes in the training categories for efficient exploration.
arXiv Detail & Related papers (2023-09-14T07:11:58Z)
- InterDiff: Generating 3D Human-Object Interactions with Physics-Informed
Diffusion [29.25063155767897]
This paper addresses a novel task of anticipating 3D human-object interactions (HOIs).
Our task is significantly more challenging, as it requires modeling dynamic objects with various shapes, capturing whole-body motion, and ensuring physically valid interactions.
Experiments on multiple human-object interaction datasets demonstrate the effectiveness of our method for this task; it is capable of producing realistic, vivid, and remarkably long-term 3D HOI predictions.
arXiv Detail & Related papers (2023-08-31T17:59:08Z)
- Generalized Few-Shot 3D Object Detection of LiDAR Point Cloud for
Autonomous Driving [91.39625612027386]
We propose a novel task, called generalized few-shot 3D object detection, where we have a large amount of training data for common (base) objects, but only a few data for rare (novel) classes.
Specifically, we analyze in-depth differences between images and point clouds, and then present a practical principle for the few-shot setting in the 3D LiDAR dataset.
To solve this task, we propose an incremental fine-tuning method to extend existing 3D detection models to recognize both common and rare objects.
arXiv Detail & Related papers (2023-02-08T07:11:36Z)
- H-SAUR: Hypothesize, Simulate, Act, Update, and Repeat for Understanding
Object Articulations from Interactions [62.510951695174604]
"Hypothesize, Simulate, Act, Update, and Repeat" (H-SAUR) is a probabilistic generative framework that generates hypotheses about how objects articulate given input observations.
We show that the proposed model significantly outperforms the current state-of-the-art articulated object manipulation framework.
We further improve the test-time efficiency of H-SAUR by integrating a learned prior from learning-based vision models.
arXiv Detail & Related papers (2022-10-22T18:39:33Z)
- Chairs Can be Stood on: Overcoming Object Bias in Human-Object
Interaction Detection [22.3445174577181]
Detecting Human-Object Interaction (HOI) in images is an important step towards high-level visual comprehension.
We propose a novel plug-and-play Object-wise Debiasing Memory (ODM) method for re-balancing the distribution of interactions under detected objects.
Our method brings consistent and significant improvements over baselines, especially on rare interactions under each object.
arXiv Detail & Related papers (2022-07-06T01:55:28Z)
- FS6D: Few-Shot 6D Pose Estimation of Novel Objects [116.34922994123973]
6D object pose estimation networks are limited in their capability to scale to large numbers of object instances.
In this work, we study a new open-set problem, few-shot 6D object pose estimation: estimating the 6D pose of an unknown object from a few support views without extra training.
arXiv Detail & Related papers (2022-03-28T10:31:29Z)
- RandomRooms: Unsupervised Pre-training from Synthetic Shapes and
Randomized Layouts for 3D Object Detection [138.2892824662943]
A promising solution is to make better use of the synthetic dataset, which consists of CAD object models, to boost the learning on real datasets.
Recent work on 3D pre-training fails when transferring features learned on synthetic objects to other real-world applications.
In this work, we put forward a new method called RandomRooms to accomplish this objective.
arXiv Detail & Related papers (2021-08-17T17:56:12Z)
- Locally Aware Piecewise Transformation Fields for 3D Human Mesh
Registration [67.69257782645789]
We propose piecewise transformation fields that learn 3D translation vectors to map any query point in posed space to its corresponding position in rest-pose space.
We show that fitting parametric models with poses by our network results in much better registration quality, especially for extreme poses.
arXiv Detail & Related papers (2021-04-16T15:16:09Z)
- 3D_DEN: Open-ended 3D Object Recognition using Dynamically Expandable
Networks [0.0]
We propose a new deep transfer learning approach based on a dynamic architectural method to make robots capable of open-ended learning about new 3D object categories.
Experimental results showed that the proposed model outperformed state-of-the-art approaches with regard to accuracy and also substantially reduced computational overhead.
arXiv Detail & Related papers (2020-09-15T16:44:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.