Cross-Embodied Affordance Transfer through Learning Affordance Equivalences
- URL: http://arxiv.org/abs/2404.15648v2
- Date: Thu, 10 Oct 2024 01:18:06 GMT
- Title: Cross-Embodied Affordance Transfer through Learning Affordance Equivalences
- Authors: Hakan Aktas, Yukie Nagai, Minoru Asada, Matteo Saveriano, Erhan Oztop, Emre Ugur
- Abstract summary: We propose a deep neural network model that unifies objects, actions, and effects into a single latent vector in a common latent space that we call the affordance space.
Rather than learning the behavior of individual objects acted upon by a single agent, our model forms a shared affordance representation spanning multiple agents and objects.
Affordance Equivalence facilitates not only action generalization over objects but also cross-embodiment transfer, linking the actions of different robots.
- Score: 6.828097734917722
- Abstract: Affordances represent the inherent effect and action possibilities that objects offer to the agents within a given context. From a theoretical viewpoint, affordances bridge the gap between effect and action, providing a functional understanding of the connections between the actions of an agent and its environment in terms of the effects it can cause. In this study, we propose a deep neural network model that unifies objects, actions, and effects into a single latent vector in a common latent space that we call the affordance space. Using the affordance space, our system can generate effect trajectories when action and object are given and can generate action trajectories when effect trajectories and objects are given. Our model does not learn the behavior of individual objects acted upon by a single agent; rather, it forms a 'shared affordance representation' spanning multiple agents and objects, which we call Affordance Equivalence. Affordance Equivalence facilitates not only action generalization over objects but also cross-embodiment transfer, linking the actions of different robots. In addition to simulation experiments that demonstrate the proposed model's range of capabilities, we also showcase that our model can be used for direct imitation in real-world settings.
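As a concrete illustration of this bidirectional use of the affordance space, here is a minimal PyTorch sketch: object, action, and effect encoders map into one shared latent, and decoders recover the missing modality in either direction. All dimensions, layer sizes, and the additive fusion of modality codes are illustrative assumptions, not the paper's actual architecture.

```python
# A minimal sketch of a shared "affordance space", assuming fixed-length
# feature vectors for objects and fixed-length trajectories for actions
# and effects. Sizes are illustrative only.
import torch
import torch.nn as nn

LATENT = 32  # dimensionality of the shared affordance space (assumed)

class AffordanceModel(nn.Module):
    def __init__(self, obj_dim=16, act_dim=64, eff_dim=64):
        super().__init__()
        # Encoders map each modality into the common latent space.
        self.enc_obj = nn.Sequential(nn.Linear(obj_dim, 64), nn.ReLU(), nn.Linear(64, LATENT))
        self.enc_act = nn.Sequential(nn.Linear(act_dim, 64), nn.ReLU(), nn.Linear(64, LATENT))
        self.enc_eff = nn.Sequential(nn.Linear(eff_dim, 64), nn.ReLU(), nn.Linear(64, LATENT))
        # Decoders generate the missing modality from the shared latent.
        self.dec_act = nn.Sequential(nn.Linear(LATENT, 64), nn.ReLU(), nn.Linear(64, act_dim))
        self.dec_eff = nn.Sequential(nn.Linear(LATENT, 64), nn.ReLU(), nn.Linear(64, eff_dim))

    def effect_from(self, obj, act):
        # Fuse object and action codes into one affordance vector,
        # then decode the expected effect trajectory.
        z = self.enc_obj(obj) + self.enc_act(act)
        return self.dec_eff(z)

    def action_from(self, obj, eff):
        # The reverse direction: infer the action that would cause
        # the desired effect on the given object.
        z = self.enc_obj(obj) + self.enc_eff(eff)
        return self.dec_act(z)

model = AffordanceModel()
obj, act = torch.randn(1, 16), torch.randn(1, 64)
predicted_effect = model.effect_from(obj, act)  # (1, 64) effect trajectory
```

Training such a model on (object, action, effect) triples collected from several robots is what would let a single shared latent act as the paper's affordance equivalence across embodiments.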
Related papers
- Visual-Geometric Collaborative Guidance for Affordance Learning [63.038406948791454]
We propose a visual-geometric collaborative guided affordance learning network that incorporates visual and geometric cues.
Our method outperforms representative models in both objective metrics and visual quality.
arXiv Detail & Related papers (2024-10-15T07:35:51Z)
- Articulated Object Manipulation using Online Axis Estimation with SAM2-Based Tracking [59.87033229815062]
Articulated object manipulation requires precise object interaction, where the object's axis must be carefully considered.
Previous research employed interactive perception for manipulating articulated objects, but such open-loop approaches often overlook the interaction dynamics.
We present a closed-loop pipeline integrating interactive perception with online axis estimation from segmented 3D point clouds.
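As an illustration of the online axis-estimation step, the following numpy sketch fits a revolute axis to two frames of a segmented, tracked point cloud via a standard Kabsch alignment. This is a generic textbook estimate, not necessarily the paper's exact method.

```python
# A hedged sketch of revolute-axis estimation from two frames of a
# tracked point cloud; assumes the part actually rotated between frames.
import numpy as np

def estimate_rotation_axis(p0, p1):
    """p0, p1: (N, 3) corresponding 3D points of the moving part."""
    c0, c1 = p0.mean(axis=0), p1.mean(axis=0)
    # Kabsch: best-fit rotation between the centred point sets.
    H = (p0 - c0).T @ (p1 - c1)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    # The rotation axis is the eigenvector of R with eigenvalue 1.
    w, v = np.linalg.eig(R)
    axis = np.real(v[:, np.argmin(np.abs(w - 1.0))])
    return axis / np.linalg.norm(axis)

# In a closed loop, this estimate is refreshed as each newly tracked
# frame arrives, so the manipulation can correct its motion online.
```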
arXiv Detail & Related papers (2024-09-24T17:59:56Z)
- ROAM: Robust and Object-Aware Motion Generation Using Neural Pose Descriptors [73.26004792375556]
This paper shows that robustness and generalisation to novel scene objects in 3D object-aware character synthesis can be achieved by training a motion model with as few as one reference object.
We leverage an implicit feature representation trained on object-only datasets, which encodes an SE(3)-equivariant descriptor field around the object.
We demonstrate substantial improvements in 3D virtual character motion and interaction quality and robustness to scenarios with unseen objects.
arXiv Detail & Related papers (2023-08-24T17:59:51Z)
- Fine-grained Affordance Annotation for Egocentric Hand-Object Interaction Videos [27.90643693526274]
Object affordance provides information on action possibilities based on human motor capacity and objects' physical properties.
This paper proposes an efficient annotation scheme that combines goal-irrelevant motor actions and grasp types as affordance labels.
We provide new annotations by applying this scheme to the EPIC-KITCHENS dataset and test our annotation with tasks such as affordance recognition, hand-object interaction hotspots prediction, and cross-domain evaluation of affordance.
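A minimal sketch of what such a combined label could look like as a data structure; the field names and vocabulary entries below are illustrative, not the paper's actual taxonomy.

```python
# An affordance is annotated as a (motor action, grasp type) pair,
# following the scheme described above. Values are placeholders.
from dataclasses import dataclass

@dataclass(frozen=True)
class AffordanceLabel:
    motor_action: str  # goal-irrelevant motor action, e.g. "rotate"
    grasp_type: str    # e.g. "power-cylindrical"

# One annotated hand-object interaction segment (illustrative fields).
example = {
    "video": "EPIC-KITCHENS clip id (placeholder)",
    "object": "jar lid",
    "affordance": AffordanceLabel(motor_action="rotate",
                                  grasp_type="power-cylindrical"),
}
```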
arXiv Detail & Related papers (2023-02-07T07:05:00Z)
- Object-Centric Scene Representations using Active Inference [4.298360054690217]
Representing a scene and its constituent objects from raw sensory data is a core ability for enabling robots to interact with their environment.
We propose a novel approach for scene understanding, leveraging a hierarchical object-centric generative model that enables an agent to infer object categories.
For evaluating the behavior of an active vision agent, we also propose a new benchmark where, given a target viewpoint of a particular object, the agent needs to find the best matching viewpoint.
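A hedged sketch of the benchmark's core operation, assuming viewpoints are compared through some learned embedding; the embedding itself is hypothetical here.

```python
# Given an embedding of the target viewpoint, score candidate viewpoints
# by cosine similarity and pick the best match. How the embeddings are
# produced is left abstract; this only shows the matching step.
import numpy as np

def best_matching_viewpoint(target_embedding, candidate_embeddings):
    """candidate_embeddings: (K, D) array, one row per candidate view."""
    t = target_embedding / np.linalg.norm(target_embedding)
    c = candidate_embeddings / np.linalg.norm(candidate_embeddings,
                                              axis=1, keepdims=True)
    return int(np.argmax(c @ t))  # index of the most similar viewpoint
```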
arXiv Detail & Related papers (2023-02-07T06:45:19Z)
- H-SAUR: Hypothesize, Simulate, Act, Update, and Repeat for Understanding Object Articulations from Interactions [62.510951695174604]
"Hypothesize, Simulate, Act, Update, and Repeat" (H-SAUR) is a probabilistic generative framework that generates hypotheses about how objects articulate given input observations.
We show that the proposed model significantly outperforms the current state-of-the-art articulated object manipulation framework.
We further improve the test-time efficiency of H-SAUR by integrating a learned prior from learning-based vision models.
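The named control flow can be sketched as follows; the toy "physics" and likelihood update below are stand-ins, and only the loop structure mirrors the hypothesize-simulate-act-update-repeat idea.

```python
# A self-contained toy: maintain weighted hypotheses about how an object
# articulates, probe it, and reweight from the observed outcome.
import random

TRUE_MECHANISM = "revolute"  # hidden ground truth for the toy

def simulate(mechanism, push):
    # Toy forward model: revolute parts trace arcs, prismatic parts slide.
    return ("arc" if mechanism == "revolute" else "slide", push)

def execute(push):
    # Toy environment: the real object's response to a push.
    return simulate(TRUE_MECHANISM, push)

def h_saur(n_rounds=3):
    # Hypothesize: start with uniform weights over candidate mechanisms.
    weights = {"revolute": 0.5, "prismatic": 0.5}
    for _ in range(n_rounds):
        push = random.uniform(0.1, 1.0)   # Act: probe the object.
        observed = execute(push)
        for m in weights:
            # Simulate and Update: keep hypotheses matching the observation.
            weights[m] *= 1.0 if simulate(m, push) == observed else 0.1
        total = sum(weights.values())     # renormalise, then Repeat
        weights = {m: w / total for m, w in weights.items()}
    return max(weights, key=weights.get)

print(h_saur())  # -> "revolute"
```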
arXiv Detail & Related papers (2022-10-22T18:39:33Z)
- Improving Object Permanence using Agent Actions and Reasoning [8.847502932609737]
Existing approaches learn object permanence from low-level perception.
We argue that object permanence can be improved when the robot uses knowledge about executed actions.
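A toy sketch of this idea: a belief store over object locations that is updated from the robot's own executed actions, so objects stay tracked even when perception loses sight of them. All names are illustrative.

```python
# Object permanence from action knowledge rather than perception alone.
beliefs = {"ball": "table"}  # object -> believed location

def on_action_executed(action, obj, target):
    # The robot's own executed action updates the belief directly,
    # even when the action itself occludes the object.
    if action == "place_inside":
        beliefs[obj] = f"inside {target}"

def locate(obj, visible):
    # Prefer current perception; fall back on the action-derived belief
    # when the object is out of sight (object permanence).
    return visible.get(obj, beliefs.get(obj, "unknown"))

on_action_executed("place_inside", "ball", "box")
print(locate("ball", visible={}))  # -> "inside box"
```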
arXiv Detail & Related papers (2021-10-01T07:09:49Z)
- Plug and Play, Model-Based Reinforcement Learning [60.813074750879615]
We introduce an object-based representation that allows zero-shot integration of new objects from known object classes.
This is achieved by representing the global transition dynamics as a union of local transition functions.
Experiments show that our representation achieves sample efficiency in a variety of setups.
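A minimal sketch of a global transition assembled as a union of per-class local transition functions, which is what lets a new object of a known class plug in with no retraining; the toy dynamics are stand-ins.

```python
# Each object class registers its own local transition function; the
# global step simply applies each object's local function.
def ball_step(state, action):
    # Local dynamics for the "ball" class: constant-velocity motion.
    return {**state, "x": state["x"] + state["vx"]}

def wall_step(state, action):
    return dict(state)  # walls are static

LOCAL_DYNAMICS = {"ball": ball_step, "wall": wall_step}

def global_step(objects, action):
    # The global transition is the union of local transitions.
    return [LOCAL_DYNAMICS[o["class"]](o, action) for o in objects]

scene = [{"class": "ball", "x": 0.0, "vx": 1.0},
         {"class": "wall", "x": 5.0}]
print(global_step(scene, action=None))
# Adding another ball needs no new model: zero-shot integration.
```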
arXiv Detail & Related papers (2021-08-20T01:20:15Z)
- Property-Aware Robot Object Manipulation: a Generative Approach [57.70237375696411]
In this work, we focus on how to generate robot motion adapted to the hidden properties of the manipulated objects.
We explore the possibility of leveraging Generative Adversarial Networks to synthesize new actions coherent with the properties of the object.
Our results show that Generative Adversarial Nets can be a powerful tool for the generation of novel and meaningful transportation actions.
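A minimal PyTorch sketch of a conditional generator in this spirit: noise plus an object-property vector is mapped to an action trajectory. Sizes are illustrative, and the discriminator and training loop are omitted.

```python
# Conditional generator: noise z and object properties -> trajectory.
import torch
import torch.nn as nn

class ActionGenerator(nn.Module):
    def __init__(self, noise_dim=8, prop_dim=4, traj_len=50, dof=7):
        super().__init__()
        self.traj_len, self.dof = traj_len, dof
        self.net = nn.Sequential(
            nn.Linear(noise_dim + prop_dim, 128), nn.ReLU(),
            nn.Linear(128, traj_len * dof))

    def forward(self, z, properties):
        # Condition the generated motion on the object's hidden properties.
        out = self.net(torch.cat([z, properties], dim=-1))
        return out.view(-1, self.traj_len, self.dof)

gen = ActionGenerator()
z = torch.randn(1, 8)
props = torch.tensor([[1.0, 0.0, 0.0, 0.0]])  # e.g. "fragile" one-hot (assumed)
trajectory = gen(z, props)  # (1, 50, 7) joint-space trajectory
```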
arXiv Detail & Related papers (2021-06-08T14:15:36Z)
- Object and Relation Centric Representations for Push Effect Prediction [18.990827725752496]
Pushing is an essential non-prehensile manipulation skill used for tasks ranging from pre-grasp manipulation to scene rearrangement.
We propose a graph neural network based framework for effect prediction and parameter estimation of pushing actions.
Our framework is validated both in real and simulated environments containing different shaped multi-part objects connected via different types of joints and objects with different masses.
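A hedged sketch of one message-passing step for this setting, with objects as nodes and contacts or joints as edges; the single round of message passing and the feature sizes are simplifications, not the paper's exact model.

```python
# Graph network: per-object displacement prediction from a push.
import torch
import torch.nn as nn

class PushEffectGNN(nn.Module):
    def __init__(self, node_dim=8, edge_dim=4, hidden=64):
        super().__init__()
        self.edge_mlp = nn.Sequential(
            nn.Linear(2 * node_dim + edge_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden))
        self.node_mlp = nn.Sequential(
            nn.Linear(node_dim + hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3))  # predicted (dx, dy, dtheta) per object

    def forward(self, nodes, edges, edge_feats):
        # nodes: (N, node_dim); edges: (E, 2) index pairs (long tensor);
        # edge_feats: (E, edge_dim), e.g. joint type or contact features.
        msgs = self.edge_mlp(torch.cat(
            [nodes[edges[:, 0]], nodes[edges[:, 1]], edge_feats], dim=-1))
        agg = torch.zeros(nodes.size(0), msgs.size(-1))
        agg.index_add_(0, edges[:, 1], msgs)  # sum messages at receivers
        return self.node_mlp(torch.cat([nodes, agg], dim=-1))

nodes = torch.randn(3, 8)
edges = torch.tensor([[0, 1], [1, 2]])
effects = PushEffectGNN()(nodes, edges, torch.randn(2, 4))  # (3, 3)
```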
arXiv Detail & Related papers (2021-02-03T15:09:12Z)
- Human and Machine Action Prediction Independent of Object Information [1.0806206850043696]
We study the role of inter-object relations that change during an action.
On average, actions are predicted after observing less than 64% of their duration.
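A toy sketch of prediction from changing inter-object relations alone, with anonymous object ids standing in for object identity; the relation vocabulary and the templates are illustrative.

```python
# Each frame is a set of (i, j, relation) triples; an ongoing action is
# matched to stored relation-sequence templates by longest common prefix.
TEMPLATES = {
    "pick_and_place": [{(0, 1, "touching")}, {(0, 1, "apart")},
                       {(0, 2, "touching")}],
    "push_together":  [{(0, 1, "apart")}, {(0, 1, "touching")}],
}

def predict(observed_frames):
    def prefix_len(template):
        n = 0
        for obs, tmpl in zip(observed_frames, template):
            if obs != tmpl:
                break
            n += 1
        return n
    # Predict the action whose template best matches the partial observation.
    return max(TEMPLATES, key=lambda a: prefix_len(TEMPLATES[a]))

# Two frames in, the action is identified -- i.e. predicted well before
# it completes, using only relation changes and no object information.
print(predict([{(0, 1, "touching")}, {(0, 1, "apart")}]))  # -> "pick_and_place"
```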
arXiv Detail & Related papers (2020-04-22T12:13:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.