Discriminator Augmented Model-Based Reinforcement Learning
- URL: http://arxiv.org/abs/2103.12999v1
- Date: Wed, 24 Mar 2021 06:01:55 GMT
- Title: Discriminator Augmented Model-Based Reinforcement Learning
- Authors: Behzad Haghgoo, Allan Zhou, Archit Sharma, Chelsea Finn
- Abstract summary: It is common in practice for the learned model to be inaccurate, impairing planning and leading to poor performance.
This paper aims to improve planning with an importance sampling framework that accounts for discrepancy between the true and learned dynamics.
- Score: 47.094522301093775
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: By planning through a learned dynamics model, model-based reinforcement
learning (MBRL) offers the prospect of good performance with little environment
interaction. However, it is common in practice for the learned model to be
inaccurate, impairing planning and leading to poor performance. This paper aims
to improve planning with an importance sampling framework that accounts for and
corrects the discrepancy between the true and learned dynamics. This framework
also motivates an alternative objective for fitting the dynamics model: to
minimize the variance of value estimation during planning. We derive and
implement this objective, which encourages better prediction on trajectories
with larger returns. We observe empirically that our approach improves the
performance of current MBRL algorithms on two stochastic control problems, and
provide a theoretical basis for our method.
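To make the two ideas above concrete, here is a minimal, self-contained sketch (not the authors' implementation) on toy linear-Gaussian dynamics. Part (1) corrects model-based return estimates with self-normalized importance weights built from per-step density ratios between the true and learned dynamics; part (2) fits a toy model with a return-weighted objective so that high-return trajectories are predicted more accurately. The specific dynamics, the closed-form ratio standing in for a learned discriminator, and the exponential return weighting are illustrative assumptions, not details taken from the paper.

```python
# Hedged sketch: importance-corrected planning values and return-weighted
# model fitting, with toy dynamics so every density is known in closed form.
import numpy as np

rng = np.random.default_rng(0)

def true_step(s, a):
    # True stochastic dynamics p(s' | s, a): drift 0.1*a plus small noise.
    return s + 0.1 * a + rng.normal(0.0, 0.05, size=s.shape)

def model_step(s, a):
    # Learned dynamics p_hat(s' | s, a): deliberately biased and over-dispersed.
    return s + 0.12 * a + rng.normal(0.0, 0.10, size=s.shape)

def density_ratio(s, a, s_next):
    # Per-step ratio p(s'|s,a) / p_hat(s'|s,a).  Here both Gaussians are known;
    # in practice a learned estimator would have to supply this ratio.
    def log_gauss(x, mu, sigma):
        return -0.5 * np.sum(((x - mu) / sigma) ** 2 + np.log(2 * np.pi * sigma ** 2))
    return np.exp(log_gauss(s_next, s + 0.10 * a, 0.05)
                  - log_gauss(s_next, s + 0.12 * a, 0.10))

def reward(s, a):
    return -float(np.sum(s ** 2) + 0.01 * np.sum(a ** 2))

def model_rollout(s0, actions):
    # Roll an action sequence through the learned model, accumulating the return
    # and the product of per-step density ratios (the importance weight).
    s, ret, w = s0.copy(), 0.0, 1.0
    for a in actions:
        s_next = model_step(s, a)
        ret += reward(s, a)
        w *= density_ratio(s, a, s_next)
        s = s_next
    return ret, w

# (1) Importance-corrected value estimate for a candidate action sequence:
# a self-normalized weighted average of returns over model rollouts.
s0 = np.array([1.0, -0.5])
plan = [np.array([-0.5, 0.3])] * 10
rets, ws = map(np.array, zip(*(model_rollout(s0, plan) for _ in range(64))))
print("corrected value estimate:", np.sum(ws * rets) / np.sum(ws))

# (2) Return-weighted model fitting: transitions from higher-return real
# trajectories get larger weight, so the model is most accurate where the
# planner's value estimates depend on it most.  The exponential weighting with
# unit temperature is an illustrative choice, not the paper's exact objective.
def collect_trajectory(horizon=10):
    s, transitions, ret = rng.normal(size=2), [], 0.0
    for _ in range(horizon):
        a = rng.normal(0.0, 1.0, size=s.shape)      # random behavior policy
        s_next = true_step(s, a)
        transitions.append((s, a, s_next))
        ret += reward(s, a)
        s = s_next
    return transitions, ret

trajs = [collect_trajectory() for _ in range(32)]
returns = np.array([ret for _, ret in trajs])
w = np.exp(returns - returns.max())                 # softmax-style weights
w /= w.sum()

# Weighted least squares for the drift coefficient k in s' ~ s + k * a.
num = den = 0.0
for (transitions, _), wi in zip(trajs, w):
    for s, a, s_next in transitions:
        num += wi * float(a @ (s_next - s))
        den += wi * float(a @ a)
print("return-weighted drift estimate:", num / den, "(true drift is 0.1)")
```

In the paper's setting the ratio p/p_hat is not available in closed form; a natural estimator (and presumably the role of the discriminator in the title) is a classifier trained to distinguish real transitions from model-generated ones, whose output yields the density ratio.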
Related papers
- A Unified View on Solving Objective Mismatch in Model-Based Reinforcement Learning [10.154341066746975]
Model-based Reinforcement Learning (MBRL) aims to make agents more sample-efficient, adaptive, and explainable.
How to best learn the model is still an unresolved question.
arXiv Detail & Related papers (2023-10-10T01:58:38Z)
- Minimal Value-Equivalent Partial Models for Scalable and Robust Planning in Lifelong Reinforcement Learning [56.50123642237106]
Common practice in model-based reinforcement learning is to learn models that model every aspect of the agent's environment.
We argue that such models are not particularly well-suited for performing scalable and robust planning in lifelong reinforcement learning scenarios.
We propose new kinds of models that only model the relevant aspects of the environment, which we call "minimal value-equivalent partial models".
arXiv Detail & Related papers (2023-01-24T16:40:01Z)
- When to Update Your Model: Constrained Model-based Reinforcement Learning [50.74369835934703]
We propose a novel and general theoretical scheme for a non-decreasing performance guarantee of model-based RL (MBRL).
Our follow-up derived bounds reveal the relationship between model shifts and performance improvement.
A further example demonstrates that learning models from a dynamically-varying number of explorations benefits the eventual returns.
arXiv Detail & Related papers (2022-10-15T17:57:43Z)
- Planning with Diffusion for Flexible Behavior Synthesis [125.24438991142573]
We consider what it would look like to fold as much of the trajectory optimization pipeline as possible into the modeling problem.
The core of our technical approach lies in a diffusion probabilistic model that plans by iteratively denoising trajectories.
arXiv Detail & Related papers (2022-05-20T07:02:03Z)
- Value Gradient weighted Model-Based Reinforcement Learning [28.366157882991565]
Model-based reinforcement learning (MBRL) is a sample-efficient technique to obtain control policies.
VaGraM is a novel method for value-aware model learning.
arXiv Detail & Related papers (2022-04-04T13:28:31Z)
- Model-Advantage Optimization for Model-Based Reinforcement Learning [41.13567626667456]
Model-based Reinforcement Learning (MBRL) algorithms have been traditionally designed with the goal of learning accurate dynamics of the environment.
Value-aware model learning, an alternative model-learning paradigm to maximum likelihood, proposes to inform model-learning through the value function of the learnt policy.
We propose a novel value-aware objective that is an upper bound on the absolute performance difference of a policy across two models.
arXiv Detail & Related papers (2021-06-26T20:01:28Z)
- Model-based Meta Reinforcement Learning using Graph Structured Surrogate Models [40.08137765886609]
We show that our model, called a graph structured surrogate model (GSSM), outperforms state-of-the-art methods in predicting environment dynamics.
Our approach is able to obtain high returns, while allowing fast execution during deployment by avoiding test time policy gradient optimization.
arXiv Detail & Related papers (2021-02-16T17:21:55Z)
- Bridging Imagination and Reality for Model-Based Deep Reinforcement Learning [72.18725551199842]
We propose a novel model-based reinforcement learning algorithm, called BrIdging Reality and Dream (BIRD).
It maximizes the mutual information between imaginary and real trajectories so that the policy improvement learned from imaginary trajectories can be easily generalized to real trajectories.
We demonstrate that our approach improves sample efficiency of model-based planning, and achieves state-of-the-art performance on challenging visual control benchmarks.
arXiv Detail & Related papers (2020-10-23T03:22:01Z)
- Control as Hybrid Inference [62.997667081978825]
We present an implementation of CHI which naturally mediates the balance between iterative and amortised inference.
We verify the scalability of our algorithm on a continuous control benchmark, demonstrating that it outperforms strong model-free and model-based baselines.
arXiv Detail & Related papers (2020-07-11T19:44:09Z)
- Objective Mismatch in Model-based Reinforcement Learning [14.92062504466269]
Model-based reinforcement learning (MBRL) has been shown to be a powerful framework for data-efficiently learning control of continuous tasks.
We identify a fundamental issue of the standard MBRL framework -- what we call the objective mismatch issue.
We propose an initial method to mitigate the mismatch issue by re-weighting dynamics model training.
arXiv Detail & Related papers (2020-02-11T16:26:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.