Affordance Learning from Play for Sample-Efficient Policy Learning
- URL: http://arxiv.org/abs/2203.00352v1
- Date: Tue, 1 Mar 2022 11:00:35 GMT
- Title: Affordance Learning from Play for Sample-Efficient Policy Learning
- Authors: Jessica Borja-Diaz, Oier Mees, Gabriel Kalweit, Lukas Hermann, Joschka
Boedecker, Wolfram Burgard
- Abstract summary: We use a self-supervised visual affordance model from human teleoperated play data to enable efficient policy learning and motion planning.
We combine model-based planning with model-free deep reinforcement learning to learn policies that favor the same object regions favored by people.
We find that our policies train 4x faster than the baselines and generalize better to novel objects because our visual affordance model can anticipate their affordance regions.
- Score: 30.701546777177555
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Robots operating in human-centered environments should have the ability to
understand how objects function: what can be done with each object, where this
interaction may occur, and how the object is used to achieve a goal. To this
end, we propose a novel approach that extracts a self-supervised visual
affordance model from human teleoperated play data and leverages it to enable
efficient policy learning and motion planning. We combine model-based planning
with model-free deep reinforcement learning (RL) to learn policies that favor
the same object regions favored by people, while requiring minimal robot
interactions with the environment. We evaluate our algorithm, Visual
Affordance-guided Policy Optimization (VAPO), with both diverse simulation
manipulation tasks and real world robot tidy-up experiments to demonstrate the
effectiveness of our affordance-guided policies. We find that our policies
train 4x faster than the baselines and generalize better to novel objects
because our visual affordance model can anticipate their affordance regions.
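To make the division of labor in the abstract concrete, below is a minimal Python sketch of one affordance-guided episode in the spirit of VAPO: a model-based planner handles the reach toward the predicted affordance region, and a model-free RL policy takes over only for the contact-rich interaction. All names here (affordance_model, motion_planner, rl_policy, env.deproject, the obs keys) are hypothetical stand-ins for illustration, not the authors' published API.

```python
import numpy as np

def affordance_guided_episode(env, affordance_model, motion_planner, rl_policy,
                              switch_radius=0.05):
    """Sketch of the two-stage scheme: model-based reaching toward the
    predicted affordance region, then model-free RL for the interaction.
    All components are illustrative placeholders."""
    obs = env.reset()

    # 1. Predict where interaction is likely: pixel heatmap -> best pixel.
    heatmap = affordance_model(obs["rgb"])                  # (H, W) scores
    v, u = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    target = env.deproject(u, v, obs["depth"])              # pixel -> world point (hypothetical helper)

    # 2. Model-based phase: a planner drives the end effector near the
    #    affordance region without spending any RL samples.
    while np.linalg.norm(obs["ee_pos"] - target) > switch_radius:
        obs, _, _, _ = env.step(motion_planner(obs["ee_pos"], target))

    # 3. Model-free phase: a local RL policy finishes the manipulation.
    done = False
    while not done:
        obs, reward, done, info = env.step(rl_policy(obs))
    return info
```

Because the RL policy only ever acts in a small neighborhood of the affordance region, its exploration problem shrinks, which is consistent with the reported 4x training speed-up over end-to-end baselines.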
Related papers
- Learning Goal-oriented Bimanual Dough Rolling Using Dynamic Heterogeneous Graph Based on Human Demonstration [19.74767906744719]
Soft object manipulation poses significant challenges for robots, requiring effective techniques for state representation and manipulation policy learning.
This research paper introduces a novel approach: a dynamic heterogeneous graph-based model for learning goal-oriented soft object manipulation policies.
arXiv Detail & Related papers (2024-10-15T16:12:00Z)
- Learning active tactile perception through belief-space control [21.708391958446274]
We propose a method that autonomously learns tactile exploration policies by developing a generative world model.
We evaluate our method on three simulated tasks where the goal is to estimate a desired object property.
We find that our method is able to discover policies that efficiently gather information about the desired property in an intuitive manner.
arXiv Detail & Related papers (2023-11-30T21:54:42Z)
- Robot Fine-Tuning Made Easy: Pre-Training Rewards and Policies for Autonomous Real-World Reinforcement Learning [58.3994826169858]
We introduce RoboFuME, a reset-free fine-tuning system for robotic reinforcement learning.
Our key insight is to utilize offline reinforcement learning techniques to ensure efficient online fine-tuning of a pre-trained policy.
Our method can incorporate data from an existing robot dataset and improve on a target task within as little as 3 hours of autonomous real-world experience.
arXiv Detail & Related papers (2023-10-23T17:50:08Z)
- Transferring Foundation Models for Generalizable Robotic Manipulation [82.12754319808197]
We propose a novel paradigm that effectively leverages the language-reasoning segmentation masks generated by internet-scale foundation models.
Our approach can effectively and robustly perceive object pose and enable sample-efficient generalization learning.
Demos can be found in our submitted video, with more comprehensive ones in link1 or link2.
arXiv Detail & Related papers (2023-06-09T07:22:12Z)
- Model-Based Visual Planning with Self-Supervised Functional Distances [104.83979811803466]
We present a self-supervised method for model-based visual goal reaching.
Our approach learns entirely using offline, unlabeled data.
We find that this approach substantially outperforms both model-free and model-based prior methods.
arXiv Detail & Related papers (2020-12-30T23:59:09Z)
- A Framework for Efficient Robotic Manipulation [79.10407063260473]
We show that, given only 10 demonstrations, a single robotic arm can learn sparse-reward manipulation policies from pixels.
arXiv Detail & Related papers (2020-12-14T22:18:39Z)
- Efficient Robotic Object Search via HIEM: Hierarchical Policy Learning with Intrinsic-Extrinsic Modeling [33.89793938441333]
We present a novel policy learning paradigm for the object search task, based on hierarchical and interpretable modeling with an intrinsic-extrinsic reward setting.
Experiments conducted in the House3D environment show that a robot trained with our model performs the object search task in a more efficient and interpretable way.
arXiv Detail & Related papers (2020-10-16T19:21:38Z)
- Goal-Aware Prediction: Learning to Model What Matters [105.43098326577434]
One of the fundamental challenges in using a learned forward dynamics model is the mismatch between the objective of the learned model and that of the downstream planner or policy.
We propose to direct prediction towards task relevant information, enabling the model to be aware of the current task and encouraging it to only model relevant quantities of the state space.
We find that our method more effectively models the relevant parts of the scene conditioned on the goal, and as a result outperforms standard task-agnostic dynamics models and model-free reinforcement learning.
arXiv Detail & Related papers (2020-07-14T16:42:59Z)
- Guided Uncertainty-Aware Policy Optimization: Combining Learning and Model-Based Strategies for Sample-Efficient Policy Learning [75.56839075060819]
Traditional robotic approaches rely on an accurate model of the environment, a detailed description of how to perform the task, and a robust perception system to keep track of the current state.
In contrast, reinforcement learning approaches can operate directly from raw sensory inputs with only a reward signal to describe the task, but they are extremely sample-inefficient and brittle.
In this work, we combine the strengths of model-based methods with the flexibility of learning-based methods to obtain a general method that is able to overcome inaccuracies in the robotics perception/actuation pipeline.
arXiv Detail & Related papers (2020-05-21T19:47:05Z)
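The entry above describes combining the two paradigms to tolerate perception inaccuracies; one natural reading is an uncertainty-gated switch between a model-based controller and a learned policy, sketched below. The gating rule and all names (pose_estimator, model_based_controller, rl_policy) are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def uncertainty_gated_action(obs, pose_estimator, model_based_controller,
                             rl_policy, uncertainty_threshold=0.1):
    """Illustrative gating rule: trust the model-based controller while the
    pose estimate is confident; hand control to the learned policy inside
    the uncertain region near the object. All names are hypothetical."""
    pose, sigma = pose_estimator(obs["rgb"])   # pose estimate plus its uncertainty
    if np.max(sigma) < uncertainty_threshold:
        # Perception is reliable: servo toward the estimated object pose.
        return model_based_controller(obs["ee_pos"], pose)
    # Perception is unreliable: act from raw inputs with the RL policy.
    return rl_policy(obs)
```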
This list is automatically generated from the titles and abstracts of the papers on this site.