Soft Hindsight Experience Replay
- URL: http://arxiv.org/abs/2002.02089v1
- Date: Thu, 6 Feb 2020 03:57:04 GMT
- Title: Soft Hindsight Experience Replay
- Authors: Qiwei He, Liansheng Zhuang, Houqiang Li
- Abstract summary: Soft Hindsight Experience Replay (SHER) is a novel approach based on HER and Maximum Entropy Reinforcement Learning (MERL).
We evaluate SHER on OpenAI robotic manipulation tasks with sparse rewards.
- Score: 77.99182201815763
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Efficient learning in the environment with sparse rewards is one of the most
important challenges in Deep Reinforcement Learning (DRL). In continuous DRL
environments such as robotic arm control, Hindsight Experience Replay (HER)
has been shown to be an effective solution. However, owing to the brittleness of
deterministic methods, HER and its variants typically struggle with stability
and convergence, which significantly affects final performance. This challenge
severely limits the applicability of such methods
to complex real-world domains. To tackle this challenge, in this paper, we
propose Soft Hindsight Experience Replay (SHER), a novel approach based on HER
and Maximum Entropy Reinforcement Learning (MERL) that combines failed-experience
reuse with a maximum-entropy probabilistic inference model. We
evaluate SHER on OpenAI robotic manipulation tasks with sparse rewards.
Experimental results show that, in contrast to HER and its variants, our
proposed SHER achieves state-of-the-art performance, especially in the
difficult HandManipulation tasks. Furthermore, our SHER method is more stable,
achieving very similar performance across different random seeds.
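The failed-experience reuse that SHER inherits from HER can be made concrete with a small sketch. The snippet below shows HER's "future" relabeling strategy under the sparse 0/-1 reward used in the OpenAI robotics tasks; the function names and the dict-based transition format are illustrative assumptions, not the paper's implementation (SHER additionally trains the policy with a SAC-style maximum-entropy objective, which is omitted here).

```python
import random

def sparse_reward(achieved, desired, eps=0.05):
    """Sparse reward as in the OpenAI robotics tasks: 0 on success, -1 otherwise."""
    return 0.0 if abs(achieved - desired) <= eps else -1.0

def her_relabel(episode, reward_fn, k=4):
    """HER 'future' relabeling (illustrative sketch): for each transition,
    store the original copy plus k copies whose desired goal is replaced by
    an achieved goal sampled from later in the same (failed) episode.

    `episode` is a list of dicts with keys: obs, action, achieved_goal, goal.
    """
    relabeled = []
    for t, tr in enumerate(episode):
        # Original transition, rewarded against the real desired goal.
        relabeled.append({**tr, "reward": reward_fn(tr["achieved_goal"], tr["goal"])})
        future = episode[t:]
        for _ in range(k):
            # Pretend a later achieved goal was the goal all along.
            g = random.choice(future)["achieved_goal"]
            relabeled.append({**tr, "goal": g,
                              "reward": reward_fn(tr["achieved_goal"], g)})
    return relabeled
```

Because the relabeled copies treat outcomes the agent actually reached as goals, some of them receive reward 0, giving the learner a dense success signal even when every real episode fails.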
Related papers
- MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization [91.80034860399677]
Reinforcement learning algorithms aim to balance exploiting the current best strategy with exploring new options that could lead to higher rewards.
We introduce a framework, MaxInfoRL, for balancing intrinsic and extrinsic exploration.
We show that our approach achieves sublinear regret in the simplified setting of multi-armed bandits.
arXiv Detail & Related papers (2024-12-16T18:59:53Z) - Random Latent Exploration for Deep Reinforcement Learning [71.88709402926415]
This paper introduces a new exploration technique called Random Latent Exploration (RLE)
RLE combines the strengths of bonus-based and noise-based exploration strategies, two popular approaches for effective exploration in deep RL.
We evaluate it on the challenging Atari and IsaacGym benchmarks and show that RLE exhibits higher overall scores across all the tasks than other approaches.
arXiv Detail & Related papers (2024-07-18T17:55:22Z) - RILe: Reinforced Imitation Learning [60.63173816209543]
RILe is a framework that combines the strengths of imitation learning and inverse reinforcement learning to learn a dense reward function efficiently.
Our framework produces high-performing policies in high-dimensional tasks where direct imitation fails to replicate complex behaviors.
arXiv Detail & Related papers (2024-06-12T17:56:31Z) - Never Explore Repeatedly in Multi-Agent Reinforcement Learning [40.35950679063337]
We propose a dynamic reward scaling approach to combat "revisitation"
We show enhanced performance in demanding environments like Google Research Football and StarCraft II micromanagement tasks.
arXiv Detail & Related papers (2023-08-19T05:27:48Z) - Handling Sparse Rewards in Reinforcement Learning Using Model Predictive Control [9.118706387430883]
Reinforcement learning (RL) has recently proven great success in various domains.
Yet, the design of the reward function requires detailed domain expertise and tedious fine-tuning to ensure that agents are able to learn the desired behaviour.
We propose to use model predictive control (MPC) as an experience source for training RL agents in sparse reward environments.
arXiv Detail & Related papers (2022-10-04T11:06:38Z) - USHER: Unbiased Sampling for Hindsight Experience Replay [12.660090786323067]
Dealing with sparse rewards is a long-standing challenge in reinforcement learning (RL)
Hindsight Experience Replay (HER) addresses this problem by reusing failed trajectories for one goal as successful trajectories for another.
This strategy is known to result in a biased value function, as the update rule underestimates the likelihood of bad outcomes in a stochastic environment.
We propose an asymptotically unbiased importance-sampling-based algorithm to address this problem without sacrificing performance in deterministic environments.
arXiv Detail & Related papers (2022-07-03T20:25:06Z) - Autonomous Reinforcement Learning: Formalism and Benchmarking [106.25788536376007]
Real-world embodied learning, such as that performed by humans and animals, is situated in a continual, non-episodic world.
Common benchmark tasks in RL are episodic, with the environment resetting between trials to provide the agent with multiple attempts.
This discrepancy presents a major challenge when attempting to take RL algorithms developed for episodic simulated environments and run them on real-world platforms.
arXiv Detail & Related papers (2021-12-17T16:28:06Z) - MHER: Model-based Hindsight Experience Replay [33.00149668905828]
We propose Model-based Hindsight Experience Replay (MHER) to solve multi-goal reinforcement learning problems.
Replacing original goals with virtual goals generated from interaction with a trained dynamics model yields a novel relabeling method.
MHER exploits experiences more efficiently by leveraging environmental dynamics to generate virtual achieved goals.
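The idea of virtual goals from a learned dynamics model can be sketched as follows. This is a toy illustration under stated assumptions, not MHER's actual implementation: the dynamics model, policy, and the assumption that the achieved goal equals the state are all hypothetical placeholders.

```python
def mher_relabel(transition, dynamics_model, policy, horizon=3, eps=0.05):
    """Illustrative model-based relabeling: roll the learned dynamics model
    forward from the current observation under the policy, and treat the
    final predicted state as a virtual achieved goal.

    `dynamics_model(state, action)` and `policy(state)` are assumed callables;
    for simplicity, states and goals here are scalars.
    """
    state = transition["obs"]
    for _ in range(horizon):
        action = policy(state)
        state = dynamics_model(state, action)
    virtual_goal = state  # toy assumption: achieved goal == state
    # Sparse 0/-1 reward against the virtual goal, as in HER-style relabeling.
    reward = 0.0 if abs(transition["achieved_goal"] - virtual_goal) <= eps else -1.0
    return {**transition, "goal": virtual_goal, "reward": reward}
```

Unlike HER, which can only relabel with goals the agent actually reached, rolling a dynamics model forward lets the relabeler propose goals the agent *could* reach, using the environment's (learned) dynamics rather than only past trajectories.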
arXiv Detail & Related papers (2021-07-01T08:52:45Z) - Learning Sparse Rewarded Tasks from Sub-Optimal Demonstrations [78.94386823185724]
Imitation learning is effective in sparse-reward tasks because it leverages existing expert demonstrations.
In practice, collecting a sufficient amount of expert demonstrations can be prohibitively expensive.
We propose Self-Adaptive Imitation Learning (SAIL), which can achieve (near-)optimal performance given only a limited number of sub-optimal demonstrations.
arXiv Detail & Related papers (2020-04-01T15:57:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.