MHER: Model-based Hindsight Experience Replay
- URL: http://arxiv.org/abs/2107.00306v1
- Date: Thu, 1 Jul 2021 08:52:45 GMT
- Title: MHER: Model-based Hindsight Experience Replay
- Authors: Rui Yang, Meng Fang, Lei Han, Yali Du, Feng Luo, Xiu Li
- Abstract summary: We propose Model-based Hindsight Experience Replay (MHER) to solve multi-goal reinforcement learning problems.
Replacing original goals with virtual goals generated from interaction with a trained dynamics model leads to a novel relabeling method.
MHER exploits experiences more efficiently by leveraging environmental dynamics to generate virtual achieved goals.
- Score: 33.00149668905828
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Solving multi-goal reinforcement learning (RL) problems with sparse rewards
is generally challenging. Existing approaches have utilized goal relabeling on
collected experiences to alleviate issues arising from sparse rewards. However,
these methods are still limited in efficiency and cannot make full use of
experiences. In this paper, we propose Model-based Hindsight Experience Replay
(MHER), which exploits experiences more efficiently by leveraging environmental
dynamics to generate virtual achieved goals. Replacing original goals with
virtual goals generated from interaction with a trained dynamics model leads to
a novel relabeling method, model-based relabeling (MBR). Based on MBR,
MHER performs both reinforcement learning and supervised learning for efficient
policy improvement. Theoretically, we also prove that the supervised part in MHER,
i.e., goal-conditioned supervised learning with MBR data, optimizes a lower
bound on the multi-goal RL objective. Experimental results in several
point-based tasks and simulated robotics environments show that MHER achieves
significantly higher sample efficiency than previous state-of-the-art methods.
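The relabeling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy dynamics, the toy policy, the `achieved_goal` mapping, the tolerance in `sparse_reward`, and the choice to roll out from the next state under the original goal are all assumptions made for illustration. The idea shown is only that a stored transition gets its goal replaced by a virtual achieved goal reached by rolling the policy forward inside a dynamics model.

```python
import numpy as np

def achieved_goal(state):
    # Hypothetical mapping from state to achieved goal: the 2-D position itself.
    return state[:2]

def toy_dynamics(state, action):
    # Stand-in for a trained dynamics model (assumption): next state = state + action.
    return state + action

def toy_policy(state, goal):
    # Stand-in for a goal-conditioned policy (assumption): bounded step toward the goal.
    return np.clip(goal - achieved_goal(state), -0.1, 0.1)

def model_based_relabel(transition, policy, dynamics, n_steps):
    """Replace the goal of a transition with a virtual achieved goal.

    The virtual goal is obtained by simulating n_steps of the policy
    inside the dynamics model, starting from the transition's next state.
    """
    state, action, next_state, goal = transition
    sim_state = next_state
    for _ in range(n_steps):
        sim_state = dynamics(sim_state, policy(sim_state, goal))
    virtual_goal = achieved_goal(sim_state)
    return (state, action, next_state, virtual_goal)

def sparse_reward(state, goal, tol=0.05):
    # Sparse reward convention assumed here: 0 on success, -1 otherwise.
    return 0.0 if np.linalg.norm(achieved_goal(state) - goal) <= tol else -1.0

# Usage: relabel one stored transition with a 3-step model rollout.
transition = (
    np.zeros(2),            # state
    np.array([0.1, 0.0]),   # action
    np.array([0.1, 0.0]),   # next state
    np.array([1.0, 0.0]),   # original goal
)
relabeled = model_based_relabel(transition, toy_policy, toy_dynamics, n_steps=3)
```

In this sketch the relabeled goal lies a few simulated steps ahead of the stored next state, so the relabeled data covers goals the agent did not reach in the real trajectory. How such data is then used for the reinforcement learning and supervised learning updates is left out here.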
Related papers
- Efficient Diversity-based Experience Replay for Deep Reinforcement Learning [14.96744975805832]
This paper proposes a novel approach, diversity-based experience replay (DBER), which leverages the determinantal point process to prioritize diverse samples in state realizations.
We conducted extensive experiments on robotic manipulation tasks in MuJoCo, Atari games, and realistic indoor environments in Habitat.
arXiv Detail & Related papers (2024-10-27T15:51:27Z)
- MRHER: Model-based Relay Hindsight Experience Replay for Sequential Object Manipulation Tasks with Sparse Rewards [11.79027801942033]
We propose a novel model-based RL framework called Model-based Relay Hindsight Experience Replay (MRHER).
MRHER breaks down a continuous task into subtasks with increasing complexity and utilizes the previous subtask to guide the learning of the subsequent one.
We show that MRHER exhibits state-of-the-art sample efficiency in benchmark tasks, outperforming RHER by 13.79% and 14.29%.
arXiv Detail & Related papers (2023-06-28T09:51:25Z)
- Predictive Experience Replay for Continual Visual Control and Forecasting [62.06183102362871]
We present a new continual learning approach for visual dynamics modeling and explore its efficacy in visual control and forecasting.
We first propose the mixture world model that learns task-specific dynamics priors with a mixture of Gaussians, and then introduce a new training strategy to overcome catastrophic forgetting.
Our model remarkably outperforms the naive combinations of existing continual learning and visual RL algorithms on DeepMind Control and Meta-World benchmarks with continual visual control tasks.
arXiv Detail & Related papers (2023-03-12T05:08:03Z)
- Meta-Learning with Self-Improving Momentum Target [72.98879709228981]
We propose Self-improving Momentum Target (SiMT) to improve the performance of a meta-learner.
SiMT generates the target model by adapting from the temporal ensemble of the meta-learner.
We show that SiMT brings a significant performance gain when combined with a wide range of meta-learning methods.
arXiv Detail & Related papers (2022-10-11T06:45:15Z)
- CostNet: An End-to-End Framework for Goal-Directed Reinforcement Learning [9.432068833600884]
Reinforcement Learning (RL) is a general framework concerned with an agent that seeks to maximize rewards in an environment.
There are two approaches, model-based and model-free reinforcement learning, that show concrete results in several disciplines.
This paper introduces a novel reinforcement learning algorithm for predicting the distance between two states in a Markov Decision Process.
arXiv Detail & Related papers (2022-10-03T21:16:14Z)
- Simplifying Model-based RL: Learning Representations, Latent-space Models, and Policies with One Objective [142.36200080384145]
We propose a single objective which jointly optimizes a latent-space model and policy to achieve high returns while remaining self-consistent.
We demonstrate that the resulting algorithm matches or improves the sample-efficiency of the best prior model-based and model-free RL methods.
arXiv Detail & Related papers (2022-09-18T03:51:58Z)
- ReIL: A Framework for Reinforced Intervention-based Imitation Learning [3.0846824529023387]
We introduce Reinforced Intervention-based Learning (ReIL), a framework consisting of a general intervention-based learning algorithm and a multi-task imitation learning model.
Experimental results from real world mobile robot navigation challenges indicate that ReIL learns rapidly from sparse supervisor corrections without suffering deterioration in performance.
arXiv Detail & Related papers (2022-03-29T09:30:26Z)
- Multitask Adaptation by Retrospective Exploration with Learned World Models [77.34726150561087]
We propose a meta-learned addressing model called RAMa that provides training samples for the MBRL agent taken from task-agnostic storage.
The model is trained to maximize the agent's expected performance by selecting promising trajectories solving prior tasks from the storage.
arXiv Detail & Related papers (2021-10-25T20:02:57Z)
- Imaginary Hindsight Experience Replay: Curious Model-based Learning for Sparse Reward Tasks [9.078290260836706]
We propose a model-based method tailored for sparse-reward tasks that foregoes the need for complicated reward engineering.
This approach, termed Imaginary Hindsight Experience Replay, minimises real-world interactions by incorporating imaginary data into policy updates.
Upon evaluation, this approach provides an order of magnitude increase in data-efficiency on average versus the state-of-the-art model-free method in the benchmark OpenAI Gym Fetch Robotics tasks.
arXiv Detail & Related papers (2021-10-05T23:38:31Z)
- Online reinforcement learning with sparse rewards through an active inference capsule [62.997667081978825]
This paper introduces an active inference agent which minimizes the novel free energy of the expected future.
Our model is capable of solving sparse-reward problems with a very high sample efficiency.
We also introduce a novel method for approximating the prior model from the reward function, which simplifies the expression of complex objectives.
arXiv Detail & Related papers (2021-06-04T10:03:36Z)
- Soft Hindsight Experience Replay [77.99182201815763]
Soft Hindsight Experience Replay (SHER) is a novel approach based on HER and Maximum Entropy Reinforcement Learning (MERL).
We evaluate SHER on OpenAI robotic manipulation tasks with sparse rewards.
arXiv Detail & Related papers (2020-02-06T03:57:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.