A Survey on Model-based Reinforcement Learning
- URL: http://arxiv.org/abs/2206.09328v1
- Date: Sun, 19 Jun 2022 05:28:03 GMT
- Title: A Survey on Model-based Reinforcement Learning
- Authors: Fan-Ming Luo, Tian Xu, Hang Lai, Xiong-Hui Chen, Weinan Zhang, Yang Yu
- Abstract summary: Reinforcement learning (RL) solves sequential decision-making problems via a trial-and-error process of interacting with the environment.
Model-based reinforcement learning (MBRL) is believed to be a promising direction; it builds environment models in which trial and error can take place without real cost.
- Score: 21.85904195671014
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reinforcement learning (RL) solves sequential decision-making problems through a trial-and-error process of interacting with the environment. While RL achieves outstanding success in complex video games that allow huge amounts of trial and error, making errors is always undesirable in the real world. To improve sample efficiency and thus reduce errors, model-based reinforcement learning (MBRL) is considered a promising direction: it builds environment models in which trial and error can take place without real cost. In this survey, we review MBRL with a focus on recent progress in deep RL. For non-tabular environments, there is always a generalization error between the learned environment model and the real environment. It is therefore important to analyze the discrepancy between policy training in the environment model and in the real environment, which in turn guides algorithm design for better model learning, model usage, and policy training. We also discuss recent advances in model-based techniques for other forms of RL, including offline RL, goal-conditioned RL, multi-agent RL, and meta-RL, as well as the applicability and advantages of MBRL in real-world tasks. Finally, we close the survey by discussing promising prospects for the future development of MBRL. We believe that MBRL has great, and so far overlooked, potential and advantages in real-world applications, and we hope this survey attracts more research on MBRL.
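To make the core MBRL idea concrete, here is a minimal Dyna-style sketch (illustrative only, not an algorithm from the survey): each real transition updates a tabular value function and fits a simple environment model, and additional trial-and-error updates are then performed on transitions replayed from that model at no real-world cost. The Gymnasium-style `env` interface, the `TabularModel` class, and all hyperparameters are assumptions made for this sketch.

```python
import numpy as np


class TabularModel:
    """Deterministic one-step model of a small discrete MDP, fit from observed transitions."""

    def __init__(self, n_states, n_actions):
        self.next_state = np.zeros((n_states, n_actions), dtype=int)
        self.reward = np.zeros((n_states, n_actions))
        self.seen = []  # (state, action) pairs observed in the real environment

    def update(self, s, a, r, s_next):
        self.next_state[s, a] = s_next
        self.reward[s, a] = r
        if (s, a) not in self.seen:
            self.seen.append((s, a))

    def sample(self, rng):
        # Replay a previously observed (state, action) pair from the learned model.
        s, a = self.seen[rng.integers(len(self.seen))]
        return s, a, self.reward[s, a], self.next_state[s, a]


def dyna_q(env, n_states, n_actions, episodes=200, planning_steps=10,
           alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Q-learning on real transitions plus extra updates on model-generated transitions."""
    rng = np.random.default_rng(seed)
    q = np.zeros((n_states, n_actions))
    model = TabularModel(n_states, n_actions)
    for _ in range(episodes):
        s, _ = env.reset()  # Gymnasium-style API assumed
        done = False
        while not done:
            # Epsilon-greedy action selection in the real environment.
            a = int(rng.integers(n_actions)) if rng.random() < epsilon else int(q[s].argmax())
            s_next, r, terminated, truncated, _ = env.step(a)
            done = terminated or truncated
            # Learn from the real transition.
            q[s, a] += alpha * (r + gamma * q[s_next].max() - q[s, a])
            model.update(s, a, r, s_next)
            # "Trial and error" inside the learned model, at no real-world cost.
            for _ in range(planning_steps):
                ms, ma, mr, ms_next = model.sample(rng)
                q[ms, ma] += alpha * (mr + gamma * q[ms_next].max() - q[ms, ma])
            s = s_next
    return q
```

In deep MBRL the tabular model is replaced by a learned neural dynamics model, which is exactly where the generalization error discussed in the abstract enters: imagined transitions are only as reliable as the model that produced them.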
Related papers
- A Benchmark Environment for Offline Reinforcement Learning in Racing Games [54.83171948184851]
Offline Reinforcement Learning (ORL) is a promising approach for reducing the high sample complexity of traditional Reinforcement Learning (RL).
This paper introduces OfflineMania, a novel environment for ORL research.
It is inspired by the iconic TrackMania series and developed using the Unity 3D game engine.
arXiv Detail & Related papers (2024-07-12T16:44:03Z)
- A Unified View on Solving Objective Mismatch in Model-Based Reinforcement Learning [10.154341066746975]
Model-based Reinforcement Learning (MBRL) aims to make agents more sample-efficient, adaptive, and explainable.
How to best learn the model is still an unresolved question.
arXiv Detail & Related papers (2023-10-10T01:58:38Z)
- A Survey of Meta-Reinforcement Learning [69.76165430793571]
We cast the development of better RL algorithms as a machine learning problem itself in a process called meta-RL.
We discuss how, at a high level, meta-RL research can be clustered based on the presence of a task distribution and the learning budget available for each individual task.
We conclude by presenting the open problems on the path to making meta-RL part of the standard toolbox for a deep RL practitioner.
arXiv Detail & Related papers (2023-01-19T12:01:41Z)
- A Validation Tool for Designing Reinforcement Learning Environments [0.0]
This study proposes a Markov-based feature analysis method to validate whether an MDP is well formulated.
We believe an MDP suitable for applying RL should contain a set of state features that are both sensitive to actions and predictive of rewards (a minimal illustrative check is sketched after this entry).
arXiv Detail & Related papers (2021-12-10T13:28:08Z)
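The criterion stated above can be illustrated with a simple, hypothetical diagnostic; this is not the paper's method, just one way to operationalize "sensitive to actions and predictive of rewards" on a logged batch of transitions. The `feature_diagnostics` helper, the scikit-learn models, and the scoring choices are assumptions made for the sketch.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.model_selection import cross_val_score


def feature_diagnostics(states, actions, rewards, next_states, cv=5):
    """Score each state feature on action sensitivity and reward predictiveness.

    states, next_states: float arrays of shape (N, d); actions: int array (N,);
    rewards: float array (N,). Returns one dict of scores per feature.
    """
    report = []
    for j in range(states.shape[1]):
        delta = (next_states[:, j] - states[:, j]).reshape(-1, 1)
        # Action sensitivity: can the feature's one-step change distinguish the action taken?
        action_acc = cross_val_score(
            LogisticRegression(max_iter=1000), delta, actions, cv=cv).mean()
        # Reward predictiveness: does the feature value explain the observed reward?
        reward_r2 = cross_val_score(
            LinearRegression(), states[:, [j]], rewards, cv=cv, scoring="r2").mean()
        report.append({"feature": j,
                       "action_sensitivity": float(action_acc),
                       "reward_r2": float(reward_r2)})
    return report
```

Features scoring near chance on action sensitivity and near zero on reward prediction would, under this reading of the criterion, be candidates for removal from the state design.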
- Pessimistic Model Selection for Offline Deep Reinforcement Learning [56.282483586473816]
Deep Reinforcement Learning (DRL) has demonstrated great potential in solving sequential decision-making problems in many applications.
One main barrier is the over-fitting issue that leads to poor generalizability of the policy learned by DRL.
We propose a pessimistic model selection (PMS) approach for offline DRL with a theoretical guarantee.
arXiv Detail & Related papers (2021-11-29T06:29:49Z)
- Multitask Adaptation by Retrospective Exploration with Learned World Models [77.34726150561087]
We propose a meta-learned addressing model, called RAMa, that provides the MBRL agent with training samples taken from task-agnostic storage.
The model is trained to maximize the agent's expected performance by selecting, from the storage, promising trajectories that solve prior tasks.
arXiv Detail & Related papers (2021-10-25T20:02:57Z)
- Combining Pessimism with Optimism for Robust and Efficient Model-Based Deep Reinforcement Learning [56.17667147101263]
In real-world tasks, reinforcement learning agents encounter situations that are not present during training.
To ensure reliable performance, RL agents need to be robust to worst-case situations.
We propose the Robust Hallucinated Upper-Confidence RL (RH-UCRL) algorithm to provably solve this problem.
arXiv Detail & Related papers (2021-03-18T16:50:17Z)
- MOReL : Model-Based Offline Reinforcement Learning [49.30091375141527]
In offline reinforcement learning (RL), the goal is to learn a highly rewarding policy based solely on a dataset of historical interactions with the environment.
We present MOReL, an algorithmic framework for model-based offline RL.
We show that MOReL matches or exceeds state-of-the-art results on widely studied offline RL benchmarks; an illustrative sketch of the pessimistic-MDP idea it builds on follows this entry.
arXiv Detail & Related papers (2020-05-12T17:52:43Z)
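For context, the entry above is built around a pessimistic MDP: an ensemble of dynamics models is learned from the logged dataset, and transitions on which the ensemble members disagree strongly are treated as "unknown" and redirected to a low-reward absorbing state, so the policy optimized inside the model is discouraged from leaving the data-covered region. The sketch below illustrates that construction only; the ensemble interface, the disagreement measure, the threshold, and the penalty value are assumptions, not MOReL's actual implementation.

```python
import numpy as np


class PessimisticModel:
    """Wrap an ensemble of learned dynamics models into a pessimistic MDP.

    `ensemble` is any list of models exposing predict(state, action) -> (next_state, reward),
    an interface assumed for this sketch. Transitions on which ensemble members disagree by
    more than `threshold` are redirected to an absorbing halt with a large negative reward,
    so policies trained inside the model stay near the offline data.
    """

    def __init__(self, ensemble, threshold=0.1, halt_penalty=-100.0):
        self.ensemble = ensemble
        self.threshold = threshold
        self.halt_penalty = halt_penalty

    def step(self, state, action):
        predictions = [m.predict(state, action) for m in self.ensemble]
        next_states = np.stack([p[0] for p in predictions])
        rewards = np.array([p[1] for p in predictions])
        # Disagreement: maximum pairwise distance between predicted next states.
        disagreement = max(
            np.linalg.norm(a - b) for a in next_states for b in next_states)
        if disagreement > self.threshold:
            # "Unknown" region: terminate with a pessimistic penalty.
            return state, self.halt_penalty, True
        # "Known" region: use the ensemble mean prediction.
        return next_states.mean(axis=0), float(rewards.mean()), False
```

A planner or policy-optimization routine can then interact with `PessimisticModel.step` exactly as it would with a simulator; the halting penalty makes it unprofitable to exploit regions where the learned model is likely wrong.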
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.