Learning a model is paramount for sample efficiency in reinforcement
learning control of PDEs
- URL: http://arxiv.org/abs/2302.07160v1
- Date: Tue, 14 Feb 2023 16:14:39 GMT
- Title: Learning a model is paramount for sample efficiency in reinforcement
learning control of PDEs
- Authors: Stefan Werner and Sebastian Peitz
- Abstract summary: We show that learning an actuated model in parallel to training the RL agent significantly reduces the total amount of required data sampled from the real system.
We also show that iteratively updating the model is of major importance to avoid biases in the RL training.
- Score: 5.488334211013093
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The goal of this paper is to make a strong point for the usage of dynamical
models when using reinforcement learning (RL) for feedback control of dynamical
systems governed by partial differential equations (PDEs). To bridge the gap
between the immense promise we see in RL and its applicability to complex
engineering systems, the main challenges are the massive requirements in terms
of the training data, as well as the lack of performance guarantees. We present
a solution for the first issue using a data-driven surrogate model in the form
of a convolutional LSTM with actuation. We demonstrate that learning an
actuated model in parallel to training the RL agent significantly reduces the
total amount of required data sampled from the real system. Furthermore, we
show that iteratively updating the model is of major importance to avoid biases
in the RL training. Detailed ablation studies reveal the most important
ingredients of the modeling process. We use the chaotic Kuramoto-Sivashinsky
equation to demonstrate our findings.
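The abstract specifies only that the surrogate is a convolutional LSTM with actuation that is refit repeatedly while the agent trains; it gives no architectural details. The following is therefore a minimal sketch of what such a surrogate could look like for a one-dimensional PDE state (the Kuramoto-Sivashinsky equation is 1D). PyTorch, the class name ActuatedConvLSTMCell, the channel sizes, and the residual one-step readout are illustrative assumptions, not the authors' implementation.
```python
# A minimal sketch (not the authors' code) of an actuated convolutional LSTM
# surrogate for a 1D PDE state. Names, channel sizes, and the residual readout
# are assumptions made for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ActuatedConvLSTMCell(nn.Module):
    """One step of a 1D convolutional LSTM whose input is the discretized PDE
    state concatenated with the actuation signal mapped onto the same grid."""

    def __init__(self, state_channels=1, action_channels=1,
                 hidden_channels=32, kernel_size=5):
        super().__init__()
        in_channels = state_channels + action_channels + hidden_channels
        # a single convolution produces all four LSTM gates at once
        self.gates = nn.Conv1d(in_channels, 4 * hidden_channels,
                               kernel_size, padding=kernel_size // 2)
        self.readout = nn.Conv1d(hidden_channels, state_channels, kernel_size=1)
        self.hidden_channels = hidden_channels

    def forward(self, u, a, h, c):
        # u: (B, state_channels, N) current PDE state on N grid points
        # a: (B, action_channels, N) actuation field
        # h, c: (B, hidden_channels, N) recurrent state
        i, f, g, o = torch.chunk(self.gates(torch.cat([u, a, h], dim=1)), 4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return u + self.readout(h), h, c  # residual prediction of the next state


# Smoke test on synthetic shapes (not the paper's Kuramoto-Sivashinsky setup),
# followed by one supervised one-step update of the surrogate.
B, N = 8, 64
cell = ActuatedConvLSTMCell()
u = torch.randn(B, 1, N)
a = torch.randn(B, 1, N)
h = torch.zeros(B, cell.hidden_channels, N)
c = torch.zeros(B, cell.hidden_channels, N)
u_pred, h, c = cell(u, a, h, c)

u_next = torch.randn(B, 1, N)              # observed next state from the real system
opt = torch.optim.Adam(cell.parameters(), lr=1e-3)
loss = F.mse_loss(u_pred, u_next)
opt.zero_grad()
loss.backward()
opt.step()
```
In the iterative scheme the abstract argues for, one would alternate between collecting transitions from the real system with the current policy, refitting the surrogate on all data gathered so far with a one-step loss like the one above, and training the RL agent on cheap rollouts of the surrogate; refitting at every iteration keeps the model consistent with the states the improving policy actually visits, which is the bias-avoidance point made above.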
Related papers
- Traffic expertise meets residual RL: Knowledge-informed model-based residual reinforcement learning for CAV trajectory control [1.5361702135159845]
This paper introduces a knowledge-informed model-based residual reinforcement learning framework.
It integrates traffic expert knowledge into a virtual environment model, employing the Intelligent Driver Model (IDM) for basic dynamics and neural networks for residual dynamics.
We propose a novel strategy that combines traditional control methods with residual RL, facilitating efficient learning and policy optimization without the need to learn from scratch.
arXiv Detail & Related papers (2024-08-30T16:16:57Z)
- Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning.
Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation.
Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques.
arXiv Detail & Related papers (2024-02-28T08:34:23Z)
- Deep autoregressive density nets vs neural ensembles for model-based offline reinforcement learning [2.9158689853305693]
We consider a model-based reinforcement learning algorithm that infers the system dynamics from the available data and performs policy optimization on imaginary model rollouts.
This approach is vulnerable to exploiting model errors which can lead to catastrophic failures on the real system.
We show that better performance can be obtained with a single well-calibrated autoregressive model on the D4RL benchmark.
arXiv Detail & Related papers (2024-02-05T10:18:15Z)
- Simplified Temporal Consistency Reinforcement Learning [19.814047499837084]
We show that a simple representation learning approach relying on a latent dynamics model trained by latent temporal consistency is sufficient for high-performance RL.
Our approach outperforms model-free methods by a large margin and matches model-based methods' sample efficiency while training 2.4 times faster.
arXiv Detail & Related papers (2023-06-15T19:37:43Z)
- Towards Efficient Task-Driven Model Reprogramming with Foundation Models [52.411508216448716]
Vision foundation models exhibit impressive power, benefiting from the extremely large model capacity and broad training data.
However, in practice, downstream scenarios may only support a small model due to the limited computational resources or efficiency considerations.
This brings a critical challenge for the real-world application of foundation models: one has to transfer the knowledge of a foundation model to the downstream task.
arXiv Detail & Related papers (2023-04-05T07:28:33Z)
- MoDem: Accelerating Visual Model-Based Reinforcement Learning with Demonstrations [36.44386146801296]
Poor sample efficiency continues to be the primary challenge for deployment of deep Reinforcement Learning (RL) algorithms for real-world applications.
We find that leveraging just a handful of demonstrations can dramatically improve the sample-efficiency of model-based RL.
We empirically study three complex visuo-motor control domains and find that our method is 150%-250% more successful in completing sparse reward tasks.
arXiv Detail & Related papers (2022-12-12T04:28:50Z)
- Reinforcement Learning as One Big Sequence Modeling Problem [84.84564880157149]
Reinforcement learning (RL) is typically concerned with estimating single-step policies or single-step models.
We view RL as a sequence modeling problem, with the goal being to predict a sequence of actions that leads to a sequence of high rewards.
arXiv Detail & Related papers (2021-06-03T17:58:51Z)
- Learning to Reweight Imaginary Transitions for Model-Based Reinforcement Learning [58.66067369294337]
When the model is inaccurate or biased, imaginary trajectories may be deleterious for training the action-value and policy functions.
We adaptively reweight the imaginary transitions, so as to reduce the negative effects of poorly generated trajectories.
Our method outperforms state-of-the-art model-based and model-free RL algorithms on multiple tasks.
arXiv Detail & Related papers (2021-04-09T03:13:35Z)
- Model-Free Voltage Regulation of Unbalanced Distribution Network Based on Surrogate Model and Deep Reinforcement Learning [9.984416150031217]
This paper develops a model-free approach based on a surrogate model and deep reinforcement learning (DRL).
We have also extended it to deal with unbalanced three-phase scenarios.
arXiv Detail & Related papers (2020-06-24T18:49:41Z)
- How Training Data Impacts Performance in Learning-based Control [67.7875109298865]
This paper derives an analytical relationship between the density of the training data and the control performance.
We formulate a quality measure for the data set, which we refer to as the $\rho$-gap.
We show how the $\rho$-gap can be applied to a feedback linearizing control law.
arXiv Detail & Related papers (2020-05-25T12:13:49Z)
- Information Theoretic Model Predictive Q-Learning [64.74041985237105]
We present a novel theoretical connection between information theoretic MPC and entropy regularized RL.
We develop a Q-learning algorithm that can leverage biased models.
arXiv Detail & Related papers (2019-12-31T00:29:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.