Physics-informed Dyna-Style Model-Based Deep Reinforcement Learning for
Dynamic Control
- URL: http://arxiv.org/abs/2108.00128v1
- Date: Sat, 31 Jul 2021 02:19:36 GMT
- Title: Physics-informed Dyna-Style Model-Based Deep Reinforcement Learning for
Dynamic Control
- Authors: Xin-Yang Liu and Jian-Xun Wang
- Abstract summary: We propose to leverage prior knowledge of the underlying physics of the environment, whose governing laws are (partially) known.
By incorporating this prior information, the quality of the learned model can be notably improved.
- Score: 1.8275108630751844
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Model-based reinforcement learning (MBRL) is believed to achieve much
higher sample efficiency than model-free algorithms by learning a predictive
model of the environment. However, the performance of MBRL relies heavily on the
quality of the learned model, which is usually built in a black-box manner and
may have poor predictive accuracy outside of the data distribution. Deficiencies
of the learned model may prevent the policy from being fully optimized. Although
some remedies based on uncertainty analysis have been proposed to alleviate this
issue, model bias still poses a great challenge for MBRL. In this work, we
propose to leverage prior knowledge of the underlying physics of the
environment, whose governing laws are (partially) known. In particular, we
develop a physics-informed MBRL framework in which governing equations and
physical constraints are used to inform model learning and policy search. By
incorporating this prior information, the quality of the learned model can be
notably improved while the required interactions with the environment are
significantly reduced, leading to better sample efficiency and learning
performance. The effectiveness and merit of the framework are demonstrated on
several classic control problems whose environments are governed by canonical
ordinary/partial differential equations.
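To make the idea concrete, the following is a minimal sketch of the model-learning step, assuming the environment's dynamics follow a (partially) known ODE ds/dt = f(s, a): a neural dynamics model is fit to observed transitions while a physics residual keeps it consistent with the known equation at collocation points. The class and function names, the explicit-Euler step, and the weighting `lam` are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class DynamicsModel(nn.Module):
    """Neural model of the time derivative ds/dt given state and action."""
    def __init__(self, state_dim, action_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, state_dim),
        )

    def forward(self, s, a):
        return self.net(torch.cat([s, a], dim=-1))

def physics_informed_loss(model, batch, collocation, f_known, dt, lam=1.0):
    """Data-fit term on real transitions plus a residual of the known physics."""
    s, a, s_next = batch
    # Match observed transitions under a simple explicit-Euler step.
    data_loss = ((s + dt * model(s, a) - s_next) ** 2).mean()
    # At collocation points (requiring no environment interaction), pull the model
    # toward the (partially) known governing equation ds/dt ~ f_known(s, a).
    s_c, a_c = collocation
    physics_loss = ((model(s_c, a_c) - f_known(s_c, a_c)) ** 2).mean()
    return data_loss + lam * physics_loss
```

The physics term acts as a regularizer away from the data, which is what allows the learned model, and hence Dyna-style rollouts, to remain reliable with fewer real interactions.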
Related papers
- HarmonyDream: Task Harmonization Inside World Models [93.07314830304193]
Model-based reinforcement learning (MBRL) holds the promise of sample-efficient learning.
We propose a simple yet effective approach, HarmonyDream, which automatically adjusts loss coefficients to maintain task harmonization.
arXiv Detail & Related papers (2023-09-30T11:38:13Z)
- Physics-Informed Model-Based Reinforcement Learning [19.01626581411011]
One of the drawbacks of traditional reinforcement learning algorithms is their poor sample efficiency.
We learn a model of the environment, essentially its transition dynamics and reward function, use it to generate imaginary trajectories and backpropagate through them to update the policy.
We show that, in model-based RL, model accuracy mainly matters in environments that are sensitive to initial conditions.
We also show that, in challenging environments, physics-informed model-based RL achieves better average-return than state-of-the-art model-free RL algorithms.
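As a hedged sketch of this procedure (the Euler-style rollout, `reward_fn`, and the fixed horizon are assumptions rather than the paper's exact code), imagined trajectories from the learned model are differentiable, so the policy can be updated by gradient ascent on the predicted return:

```python
import torch

def update_policy_on_imagined_rollout(policy, model, reward_fn, s0, horizon,
                                      optimizer, gamma=0.99):
    """Roll the learned model forward under the policy and ascend the imagined return."""
    s = s0
    imagined_return = torch.zeros(())
    for t in range(horizon):
        a = policy(s)                       # differentiable action from the policy
        r = reward_fn(s, a)                 # learned (or known) reward model
        s = model(s, a)                     # learned transition model
        imagined_return = imagined_return + (gamma ** t) * r.mean()
    loss = -imagined_return                 # maximise the imagined return
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```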
arXiv Detail & Related papers (2022-12-05T11:26:10Z)
- When to Update Your Model: Constrained Model-based Reinforcement Learning [50.74369835934703]
We propose a novel and general theoretical scheme for a non-decreasing performance guarantee in model-based RL (MBRL).
The derived bounds reveal the relationship between model shifts and performance improvement.
A further example demonstrates that learning models from a dynamically varying number of explorations benefits the eventual returns.
arXiv Detail & Related papers (2022-10-15T17:57:43Z)
- A Unified Framework for Alternating Offline Model Training and Policy Learning [62.19209005400561]
In offline model-based reinforcement learning, we learn a dynamic model from historically collected data, and utilize the learned model and fixed datasets for policy learning.
We develop an iterative offline MBRL framework, where we maximize a lower bound of the true expected return.
With the proposed unified model-policy learning framework, we achieve competitive performance on a wide range of continuous-control offline reinforcement learning datasets.
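The alternating structure can be sketched roughly as follows; the helper names and the omitted lower-bound objective are assumptions, since the paper's actual surrogate is more involved than this loop:

```python
def offline_mbrl(model, policy, dataset, n_rounds, fit_model, improve_policy):
    """Alternate model fitting and policy improvement on a fixed offline dataset,
    rather than fitting the model once and freezing it."""
    for _ in range(n_rounds):
        fit_model(model, dataset, policy)        # refit the model, aware of the current policy
        improve_policy(policy, model, dataset)   # improve the policy with model rollouts and data
    return policy
```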
arXiv Detail & Related papers (2022-10-12T04:58:51Z)
- Should Models Be Accurate? [14.044354912031864]
We focus our investigations on Dyna-style planning in a prediction setting.
We introduce a meta-learning algorithm for training models with a focus on their usefulness to the learner instead of their accuracy in modelling the environment.
Our experiments show that our algorithm enables faster learning than even using an accurate model built with domain-specific knowledge of the non-stationarity.
arXiv Detail & Related papers (2022-05-22T04:23:54Z)
- Value Gradient weighted Model-Based Reinforcement Learning [28.366157882991565]
Model-based reinforcement learning (MBRL) is a sample-efficient technique to obtain control policies.
VaGraM is a novel method for value-aware model learning.
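A rough sketch of value-gradient weighting (the exact objective, its scaling, and the `value_fn` interface below are illustrative, not the paper's implementation): model errors are weighted by how sensitive the value function is to the next state, so model capacity is spent where it matters for control.

```python
import torch

def value_gradient_weighted_loss(model, value_fn, s, a, s_next):
    """First-order surrogate for |V(predicted next state) - V(observed next state)|^2."""
    s_req = s_next.detach().requires_grad_(True)
    (dv_ds,) = torch.autograd.grad(value_fn(s_req).sum(), s_req)
    err = model(s, a) - s_next                      # per-dimension model error
    return ((dv_ds.detach() * err).sum(dim=-1) ** 2).mean()
```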
arXiv Detail & Related papers (2022-04-04T13:28:31Z)
- Model-Advantage Optimization for Model-Based Reinforcement Learning [41.13567626667456]
Model-based Reinforcement Learning (MBRL) algorithms have been traditionally designed with the goal of learning accurate dynamics of the environment.
Value-aware model learning, an alternative model-learning paradigm to maximum likelihood, proposes to inform model-learning through the value function of the learnt policy.
We propose a novel value-aware objective that is an upper bound on the absolute performance difference of a policy across two models.
arXiv Detail & Related papers (2021-06-26T20:01:28Z)
- Discriminator Augmented Model-Based Reinforcement Learning [47.094522301093775]
It is common in practice for the learned model to be inaccurate, impairing planning and leading to poor performance.
This paper aims to improve planning with an importance sampling framework that accounts for discrepancy between the true and learned dynamics.
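One hedged way to realize such weights (the discriminator's input features, the clipping, and its training loop are assumptions): a classifier trained to separate real transitions from model-generated ones yields a density-ratio estimate that down-weights implausible model samples.

```python
import torch

def transition_importance_weights(discriminator, s, a, s_next, max_weight=10.0):
    """Estimate p_env(s'|s,a) / p_model(s'|s,a) from a real-vs-model classifier."""
    logits = discriminator(torch.cat([s, a, s_next], dim=-1))
    p_real = torch.sigmoid(logits)                  # probability the transition is real
    weights = p_real / (1.0 - p_real + 1e-6)        # density-ratio estimate D / (1 - D)
    return weights.clamp(max=max_weight).detach()   # clip and stop gradients for stability
```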
arXiv Detail & Related papers (2021-03-24T06:01:55Z)
- Trajectory-wise Multiple Choice Learning for Dynamics Generalization in Reinforcement Learning [137.39196753245105]
We present a new model-based reinforcement learning algorithm that learns a multi-headed dynamics model for dynamics generalization.
We incorporate context learning, which encodes dynamics-specific information from past experiences into the context latent vector.
Our method exhibits superior zero-shot generalization performance across a variety of control tasks, compared to state-of-the-art RL methods.
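A compact sketch of such a model (the GRU context encoder, the number of heads, and the head-selection rule are illustrative assumptions rather than the paper's exact architecture):

```python
import torch
import torch.nn as nn

class MultiHeadDynamics(nn.Module):
    """Several dynamics heads conditioned on a context vector from recent transitions."""
    def __init__(self, state_dim, action_dim, context_dim=16, n_heads=4, hidden=128):
        super().__init__()
        # Encodes a window of past (s, a, s') tuples into a dynamics-specific context.
        self.context_encoder = nn.GRU(2 * state_dim + action_dim, context_dim,
                                      batch_first=True)
        self.heads = nn.ModuleList([
            nn.Sequential(nn.Linear(state_dim + action_dim + context_dim, hidden),
                          nn.ReLU(), nn.Linear(hidden, state_dim))
            for _ in range(n_heads)
        ])

    def forward(self, s, a, past_transitions):
        _, h = self.context_encoder(past_transitions)   # h: (1, batch, context_dim)
        context = h[-1]
        x = torch.cat([s, a, context], dim=-1)
        # Each head proposes a next state; at run time one head is selected, e.g. the
        # one with the lowest prediction error on the most recent trajectory.
        return torch.stack([head(x) for head in self.heads], dim=0)
```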
arXiv Detail & Related papers (2020-10-26T03:20:42Z)
- Information Theoretic Model Predictive Q-Learning [64.74041985237105]
We present a novel theoretical connection between information theoretic MPC and entropy regularized RL.
We develop a Q-learning algorithm that can leverage biased models.
arXiv Detail & Related papers (2019-12-31T00:29:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.