Contact Energy Based Hindsight Experience Prioritization
- URL: http://arxiv.org/abs/2312.02677v2
- Date: Fri, 23 Feb 2024 14:30:57 GMT
- Title: Contact Energy Based Hindsight Experience Prioritization
- Authors: Erdi Sayar, Zhenshan Bing, Carlo D'Eramo, Ozgur S. Oguz, Alois Knoll
- Abstract summary: Multi-goal robot manipulation tasks with sparse rewards are difficult for reinforcement learning (RL) algorithms.
Recent algorithms such as Hindsight Experience Replay (HER) expedite learning by taking advantage of failed trajectories.
We propose a novel approach, Contact Energy Based Prioritization (CEBP), which selects samples from the replay buffer based on the rich information carried by contact.
- Score: 19.42106651692228
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Multi-goal robot manipulation tasks with sparse rewards are difficult for
reinforcement learning (RL) algorithms due to the inefficiency in collecting
successful experiences. Recent algorithms such as Hindsight Experience Replay
(HER) expedite learning by taking advantage of failed trajectories and
replacing the desired goal with one of the achieved states, so that any failed trajectory can contribute to learning. However, HER
uniformly chooses failed trajectories, without taking into account which ones
might be the most valuable for learning. In this paper, we address this problem and propose a novel approach, Contact Energy Based Prioritization (CEBP), which selects samples from the replay buffer based on the rich information due to
contact, leveraging the touch sensors in the gripper of the robot and object
displacement. Our prioritization scheme favors sampling of contact-rich
experiences, which are arguably the ones providing the largest amount of
information. We evaluate our proposed approach on various sparse reward robotic
tasks and compare it with state-of-the-art methods. We show that our
method surpasses or performs on par with those methods on robot manipulation
tasks. Finally, we deploy the trained policy from our method to a real Franka
robot for a pick-and-place task. We observe that the robot can solve the task
successfully. The videos and code are publicly available at:
https://erdiphd.github.io/HER_force
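As a rough illustration of the prioritization idea, the sketch below scores each replay-buffer trajectory by a contact-energy value computed from touch-sensor forces and object displacement, then samples trajectories through a softmax over those scores. The function names, the exact energy definition, and the softmax weighting are assumptions for illustration, not the authors' implementation:

```python
import numpy as np

def contact_energy(forces, displacements):
    """Score a trajectory by its gripper touch-sensor forces and object motion.

    forces:        (T, n_sensors) array of touch-sensor readings per step
    displacements: (T,) array of per-step object displacement magnitudes
    """
    # Assumption: a step is contact-rich when large sensed forces coincide
    # with actual object motion; the paper's exact definition may differ.
    return float(np.sum(forces.sum(axis=1) * displacements))

def sample_trajectory(buffer, rng):
    """Sample one trajectory from the replay buffer, favoring contact-rich ones."""
    energies = np.array([contact_energy(t["forces"], t["disp"]) for t in buffer])
    # Softmax over trajectory energies gives the sampling distribution;
    # a temperature term could tune how strongly contact is favored.
    probs = np.exp(energies - energies.max())
    probs /= probs.sum()
    return buffer[rng.choice(len(buffer), p=probs)]
```

In a HER-style pipeline, the sampled trajectory's desired goal would then be relabeled with one of its achieved states before the transitions are replayed.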
Related papers
- Robot Fine-Tuning Made Easy: Pre-Training Rewards and Policies for Autonomous Real-World Reinforcement Learning [58.3994826169858]
We introduce RoboFuME, a reset-free fine-tuning system for robotic reinforcement learning.
Our insight is to utilize offline reinforcement learning techniques to ensure efficient online fine-tuning of a pre-trained policy.
Our method can incorporate data from an existing robot dataset and improve on a target task within as little as 3 hours of autonomous real-world experience.
arXiv Detail & Related papers (2023-10-23T17:50:08Z)
- REBOOT: Reuse Data for Bootstrapping Efficient Real-World Dexterous Manipulation [61.7171775202833]
We introduce an efficient system for learning dexterous manipulation skills with reinforcement learning.
The main idea of our approach is the integration of recent advances in sample-efficient RL and replay buffer bootstrapping.
Our system completes the real-world training cycle by incorporating learned resets via an imitation-based pickup policy.
arXiv Detail & Related papers (2023-09-06T19:05:31Z)
- Few-Shot Preference Learning for Human-in-the-Loop RL [13.773589150740898]
Motivated by the success of meta-learning, we pre-train preference models on prior task data and quickly adapt them for new tasks using only a handful of queries.
We reduce the amount of online feedback needed to train manipulation policies in Meta-World by 20×, and demonstrate the effectiveness of our method on a real Franka Panda robot.
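The "preference model" here can be made concrete with the Bradley-Terry objective that preference-based RL methods commonly minimize; the sketch below (with a hypothetical reward_net) shows only this generic objective, not the paper's meta-learning pre-training or few-shot adaptation:

```python
import torch

def preference_loss(reward_net, seg_a, seg_b, label):
    """Bradley-Terry loss commonly used in preference-based RL.

    seg_a, seg_b: (T, obs_dim) tensors for two trajectory segments
    label:        1.0 if the human preferred seg_a, else 0.0
    """
    # Sum predicted per-step rewards over each segment.
    r_a = reward_net(seg_a).sum()
    r_b = reward_net(seg_b).sum()
    # P(seg_a preferred) = exp(r_a) / (exp(r_a) + exp(r_b)),
    # i.e. a sigmoid of the reward difference.
    logit = r_a - r_b
    return torch.nn.functional.binary_cross_entropy_with_logits(
        logit.unsqueeze(0), torch.tensor([label]))
```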
arXiv Detail & Related papers (2022-12-06T23:12:26Z)
- Learning Reward Functions for Robotic Manipulation by Observing Humans [92.30657414416527]
We use unlabeled videos of humans solving a wide range of manipulation tasks to learn a task-agnostic reward function for robotic manipulation policies.
The learned rewards are based on distances to a goal in an embedding space learned using a time-contrastive objective.
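A minimal sketch of such a reward, assuming a hypothetical encoder trained with a time-contrastive objective; the Euclidean metric and negative sign are illustrative choices:

```python
import torch

def embedding_reward(encoder, obs_frame, goal_frame):
    """Reward as negative distance to the goal in a learned embedding space.

    `encoder` stands for a network trained with a time-contrastive objective,
    so temporally close frames land near each other in the embedding.
    """
    with torch.no_grad():
        z_obs = encoder(obs_frame.unsqueeze(0))
        z_goal = encoder(goal_frame.unsqueeze(0))
    # The summary specifies distances to a goal in embedding space; using the
    # Euclidean norm and a negative sign here is an assumption.
    return -torch.norm(z_obs - z_goal).item()
```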
arXiv Detail & Related papers (2022-11-16T16:26:48Z)
- Robot Learning of Mobile Manipulation with Reachability Behavior Priors [38.49783454634775]
Mobile Manipulation (MM) systems are ideal candidates for taking up the role of a personal assistant in unstructured real-world environments.
Among other challenges, MM requires effective coordination of the robot's embodiments for executing tasks that require both mobility and manipulation.
We study the integration of robotic reachability priors in actor-critic RL methods for accelerating the learning of MM for reaching and fetching tasks.
arXiv Detail & Related papers (2022-03-08T12:44:42Z)
- Accelerating Robotic Reinforcement Learning via Parameterized Action Primitives [92.0321404272942]
Reinforcement learning can be used to build general-purpose robotic systems.
However, training RL agents to solve robotics tasks remains challenging.
In this work, we manually specify a library of robot action primitives (RAPS), parameterized with arguments that are learned by an RL policy.
We find that our simple change to the action interface substantially improves both the learning efficiency and task performance.
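A hypothetical sketch of what such an action interface could look like: the policy emits a primitive index plus continuous arguments, and each primitive expands into a short sequence of low-level commands. The two primitives and the interpolation controller below are illustrative assumptions, not the paper's library:

```python
import numpy as np

def reach(ee_pos, target, steps=5):
    """Interpolate waypoints from the current end-effector pose to a target."""
    return [ee_pos + (target - ee_pos) * (i + 1) / steps for i in range(steps)]

def grasp(ee_pos, width, steps=3):
    """Hold position while commanding a gripper width (appended as a 4th dim)."""
    return [np.append(ee_pos, width) for _ in range(steps)]

PRIMITIVES = {0: reach, 1: grasp}  # illustrative two-primitive library

def step_with_primitive(ee_pos, primitive_id, args):
    """One RL 'action' is (primitive id, arguments), not a raw motor command;
    it expands into several low-level commands for the controller."""
    return PRIMITIVES[primitive_id](ee_pos, args)

# Example: the policy selected `reach` with a 3-D target argument.
waypoints = step_with_primitive(np.zeros(3), 0, np.array([0.1, 0.0, 0.2]))
```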
arXiv Detail & Related papers (2021-10-28T17:59:30Z)
- Diversity-based Trajectory and Goal Selection with Hindsight Experience Replay [8.259694128526112]
We propose diversity-based trajectory and goal selection with HER (DTGSH).
We show that our method can learn more quickly and reach higher performance than other state-of-the-art approaches on all tasks.
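As a stand-in for the paper's diversity model, the sketch below selects spread-out achieved states as relabeling goals using a greedy farthest-point heuristic; it illustrates the idea of diverse goal selection, not DTGSH's actual mechanism:

```python
import numpy as np

def select_diverse_goals(achieved_states, k, rng=None):
    """Greedily pick k spread-out achieved states to use as relabeling goals."""
    if rng is None:
        rng = np.random.default_rng()
    states = np.asarray(achieved_states, dtype=float)
    chosen = [int(rng.integers(len(states)))]
    while len(chosen) < k:
        # Distance from every state to its nearest already-chosen state.
        dists = np.linalg.norm(states[:, None] - states[chosen][None], axis=-1)
        chosen.append(int(np.argmax(dists.min(axis=1))))
    return states[chosen]
```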
arXiv Detail & Related papers (2021-08-17T21:34:24Z)
- Actionable Models: Unsupervised Offline Reinforcement Learning of Robotic Skills [93.12417203541948]
We propose the objective of learning a functional understanding of the environment by learning to reach any goal state in a given dataset.
We find that our method can operate on high-dimensional camera images and learn a variety of skills on real robots that generalize to previously unseen scenes and objects.
arXiv Detail & Related papers (2021-04-15T20:10:11Z)
- COG: Connecting New Skills to Past Experience with Offline Reinforcement Learning [78.13740204156858]
We show that we can reuse prior data to extend new skills simply through dynamic programming.
We demonstrate the effectiveness of our approach by chaining together several behaviors seen in prior datasets for solving a new task.
We train our policies in an end-to-end fashion, mapping high-dimensional image observations to low-level robot control commands.
arXiv Detail & Related papers (2020-10-27T17:57:29Z)
- Reward Engineering for Object Pick and Place Training [3.4806267677524896]
We have used the Pick and Place environment provided by OpenAI's Gym to engineer rewards.
In the default configuration of the OpenAI baseline and environment, the reward function is calculated from the distance between the target location and the robot end-effector.
We were also able to introduce certain user desired trajectories in the learnt policies.
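A minimal sketch of such a distance-based reward in the style of Gym's Fetch tasks; the 0.05 m threshold mirrors the usual Fetch default, but the signature is an illustration rather than the exact Gym API:

```python
import numpy as np

def compute_reward(achieved_pos, target_pos, sparse=True, threshold=0.05):
    """Distance-based reward in the style of Gym's Fetch environments.

    Sparse (the default the summary refers to): -1 until the achieved
    position is within `threshold` of the target, then 0. The dense variant
    returns the negative Euclidean distance, the smoother signal that
    reward engineering typically reshapes.
    """
    d = np.linalg.norm(np.asarray(achieved_pos) - np.asarray(target_pos))
    return -float(d > threshold) if sparse else -d
```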
arXiv Detail & Related papers (2020-01-11T20:13:28Z)