Relay Hindsight Experience Replay: Continual Reinforcement Learning for
Robot Manipulation Tasks with Sparse Rewards
- URL: http://arxiv.org/abs/2208.00843v1
- Date: Mon, 1 Aug 2022 13:30:01 GMT
- Title: Relay Hindsight Experience Replay: Continual Reinforcement Learning for
Robot Manipulation Tasks with Sparse Rewards
- Authors: Yongle Luo, Yuxin Wang, Kun Dong, Qiang Zhang, Erkang Cheng, Zhiyong
Sun and Bo Song
- Abstract summary: We propose a novel model-free continual RL algorithm called Relay-HER (RHER).
The proposed method first decomposes and rearranges the original long-horizon task into new sub-tasks with incremental complexity.
Experimental results indicate significant improvements in the sample efficiency of RHER compared with vanilla HER on five typical robot manipulation tasks.
- Score: 26.998587654269873
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning with sparse rewards is usually inefficient in Reinforcement Learning
(RL). Hindsight Experience Replay (HER) has been shown to be an effective solution to
the low sample efficiency that results from sparse rewards, via goal relabeling.
However, HER still suffers from an implicit virtual-positive sparse reward problem
caused by invariant achieved goals, especially in robot manipulation tasks. To solve
this problem, we propose a novel model-free
continual RL algorithm, called Relay-HER (RHER). The proposed method first
decomposes and rearranges the original long-horizon task into new sub-tasks
with incremental complexity. Subsequently, a multi-task network is designed to
learn the sub-tasks in ascending order of complexity. To solve the
virtual-positive sparse reward problem, we propose a Random-Mixed Exploration
Strategy (RMES), in which the achieved goals of the sub-task with higher
complexity are quickly changed under the guidance of the one with lower
complexity. Experimental results indicate significant improvements in the sample
efficiency of RHER compared with vanilla HER on five typical robot manipulation
tasks, including Push, PickAndPlace, Drawer, Insert, and
ObstaclePush. The proposed RHER method has also been applied to learn a
contact-rich push task on a physical robot from scratch, and the success rate
reached 10/10 with only 250 episodes.
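For concreteness, here is a minimal sketch of the two mechanisms the abstract combines: standard HER "future" goal relabeling and RMES-style mixed action selection. This is an illustration, not the authors' implementation; the function names, the episode layout (obs, action, achieved goal, desired goal), and the mixing probability `eps` are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sparse_reward(achieved, goal, tol=0.05):
    """Typical sparse reward: 0 if the goal is reached, -1 otherwise."""
    return 0.0 if np.linalg.norm(achieved - goal) < tol else -1.0

def her_relabel(episode, k=4):
    """Standard HER 'future' relabeling: replace the desired goal with
    achieved goals sampled from later steps of the same episode."""
    relabeled = []
    for t, (obs, action, achieved, _goal) in enumerate(episode):
        future = rng.integers(t, len(episode), size=k)
        for i in future:
            new_goal = episode[i][2]              # a later achieved goal
            r = sparse_reward(achieved, new_goal)
            relabeled.append((obs, action, new_goal, r))
    return relabeled

def rmes_action(sub_policies, stage, obs, goal, eps=0.3):
    """Random-Mixed Exploration Strategy (sketch): while training the
    sub-task at `stage`, occasionally act with the already-learned
    lower-complexity policy so the achieved goal changes quickly
    instead of staying invariant."""
    if stage > 0 and rng.random() < eps:
        return sub_policies[stage - 1](obs, goal)  # guidance policy
    return sub_policies[stage](obs, goal)
```

The point of the mixed selection is that the lower-complexity policy keeps moving the achieved goal early in training, which counters the invariant-achieved-goal (virtual-positive sparse reward) failure mode described above.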
Related papers
- REBOOT: Reuse Data for Bootstrapping Efficient Real-World Dexterous
Manipulation [61.7171775202833]
We introduce an efficient system for learning dexterous manipulation skills with reinforcement learning.
The main idea of our approach is the integration of recent advances in sample-efficient RL and replay buffer bootstrapping.
Our system completes the real-world training cycle by incorporating learned resets via an imitation-based pickup policy.
arXiv Detail & Related papers (2023-09-06T19:05:31Z)
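A minimal sketch of the replay-buffer bootstrapping idea this entry describes: a new task's off-policy buffer is seeded with data reused from prior tasks. The names and buffer layout are illustrative, not the paper's API.

```python
from collections import deque

def bootstrap_buffer(prior_transitions, capacity=100_000):
    """Seed a fresh replay buffer with transitions collected on prior
    tasks/policies so off-policy RL does not start from an empty buffer."""
    buffer = deque(maxlen=capacity)
    buffer.extend(prior_transitions)  # reused data from earlier runs
    return buffer

# New on-task experience is then appended on top of the reused data:
# buffer.append((obs, action, reward, next_obs, done))
```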
- MRHER: Model-based Relay Hindsight Experience Replay for Sequential Object Manipulation Tasks with Sparse Rewards [11.79027801942033]
We propose a novel model-based RL framework called Model-based Relay Hindsight Experience Replay (MRHER).
MRHER breaks down a continuous task into subtasks with increasing complexity and utilizes the previous subtask to guide the learning of the subsequent one.
We show that MRHER exhibits state-of-the-art sample efficiency in benchmark tasks, outperforming RHER by 13.79% and 14.29%.
arXiv Detail & Related papers (2023-06-28T09:51:25Z)
- Semantically Aligned Task Decomposition in Multi-Agent Reinforcement Learning [56.26889258704261]
We propose a novel "disentangled" decision-making method, Semantically Aligned task decomposition in MARL (SAMA).
SAMA prompts pretrained language models with chain-of-thought to suggest potential goals, provide suitable goal decomposition and subgoal allocation, and perform self-reflection-based replanning.
SAMA demonstrates considerable advantages in sample efficiency compared to state-of-the-art ASG methods.
arXiv Detail & Related papers (2023-05-18T10:37:54Z)
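A rough sketch of the prompting step this entry describes: asking a pretrained language model, chain-of-thought style, to decompose a task into subgoals and allocate them to agents. The wording below is entirely illustrative; SAMA's actual prompts and its self-reflection-based replanning loop are more elaborate.

```python
def decomposition_prompt(task_description, agents):
    """Build a chain-of-thought style prompt asking a pretrained language
    model to propose subgoals, allocate them to agents, and self-check
    the resulting plan (hypothetical wording)."""
    return (
        f"Task: {task_description}\n"
        f"Agents: {', '.join(agents)}\n"
        "Think step by step. 1) List the subgoals needed to complete the "
        "task. 2) Assign each subgoal to exactly one agent. 3) Review the "
        "plan and revise any subgoal an agent cannot achieve."
    )
```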
- Reinforcement learning with Demonstrations from Mismatched Task under Sparse Reward [7.51772160511614]
Reinforcement learning often suffers from the sparse reward issue in real-world robotics problems.
Prior works often assume that the learning agent and the expert aim to accomplish the same task, which requires collecting new data for every new task.
In this paper, we consider the case where the target task is mismatched from, but similar to, that of the expert.
Existing learning-from-demonstration (LfD) methods cannot effectively guide learning in mismatched new tasks with sparse rewards.
arXiv Detail & Related papers (2022-12-03T02:24:59Z)
- New Tight Relaxations of Rank Minimization for Multi-Task Learning [161.23314844751556]
We propose two novel multi-task learning formulations based on two regularization terms.
We show that our methods can correctly recover the low-rank structure shared across tasks, and outperform related multi-task learning methods.
arXiv Detail & Related papers (2021-12-09T07:29:57Z)
- MHER: Model-based Hindsight Experience Replay [33.00149668905828]
We propose Model-based Hindsight Experience Replay (MHER) to solve multi-goal reinforcement learning problems.
Replacing original goals with virtual goals generated from interaction with a trained dynamics model leads to a novel relabeling method.
MHER exploits experiences more efficiently by leveraging environmental dynamics to generate virtual achieved goals.
arXiv Detail & Related papers (2021-07-01T08:52:45Z)
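A minimal sketch of the virtual-goal idea in the MHER entry above: roll a learned dynamics model forward under the current policy and treat the predicted achieved goals as relabeling targets. The names `dynamics_model`, `achieved_fn`, and `horizon` are assumptions, not the paper's API.

```python
def model_virtual_goals(obs, policy, dynamics_model, achieved_fn, horizon=3):
    """MHER-style relabeling (sketch): generate virtual achieved goals
    from model rollouts instead of only from goals observed in real
    episodes."""
    state, goals = obs, []
    for _ in range(horizon):
        action = policy(state)
        state = dynamics_model(state, action)  # predicted next state
        goals.append(achieved_fn(state))       # predicted achieved goal
    return goals
```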
- Reset-Free Reinforcement Learning via Multi-Task Learning: Learning Dexterous Manipulation Behaviors without Human Intervention [67.1936055742498]
We show that multi-task learning can effectively scale reset-free learning schemes to much more complex problems.
This work shows the ability to learn dexterous manipulation behaviors in the real world with RL without any human intervention.
arXiv Detail & Related papers (2021-04-22T17:38:27Z)
- Soft Hindsight Experience Replay [77.99182201815763]
Soft Hindsight Experience Replay (SHER) is a novel approach based on HER and Maximum Entropy Reinforcement Learning (MERL).
We evaluate SHER on OpenAI robotic manipulation tasks with sparse rewards.
arXiv Detail & Related papers (2020-02-06T03:57:04Z)
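To make the SHER entry above concrete, here is the maximum-entropy (SAC-style) TD target that this kind of method pairs with HER relabeling; the entropy bonus rewards stochastic, exploratory policies under sparse rewards. The values of `alpha` and `gamma` are illustrative defaults, not values from the paper.

```python
def soft_td_target(reward, next_q, next_logp, alpha=0.2, gamma=0.98):
    """Entropy-augmented TD target: the -alpha * log-probability term adds
    an entropy bonus for the next action on top of the usual value."""
    return reward + gamma * (next_q - alpha * next_logp)
```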
- Gradient Surgery for Multi-Task Learning [119.675492088251]
Multi-task learning has emerged as a promising approach for sharing structure across multiple tasks.
The reasons why multi-task learning is so challenging compared to single-task learning are not fully understood.
We propose a form of gradient surgery that projects a task's gradient onto the normal plane of the gradient of any other task that has a conflicting gradient.
arXiv Detail & Related papers (2020-01-19T06:33:47Z)
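The projection step the gradient-surgery entry describes is compact enough to sketch directly; this is a minimal single-pair version (PCGrad applies it over all task pairs in random order).

```python
import numpy as np

def project_conflicting(grad_i, grad_j):
    """One gradient-surgery step: if task i's gradient conflicts with
    task j's (negative dot product), project grad_i onto the normal
    plane of grad_j, removing the component that fights task j."""
    dot = float(grad_i @ grad_j)
    if dot < 0.0:  # conflicting gradients
        grad_i = grad_i - (dot / float(grad_j @ grad_j)) * grad_j
    return grad_i

# Example: two conflicting task gradients
g1, g2 = np.array([1.0, 1.0]), np.array([-1.0, 0.5])
print(project_conflicting(g1, g2))  # [0.6, 1.2], orthogonal to g2
```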
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.