Combining Model-Based and Model-Free Methods for Nonlinear Control: A
Provably Convergent Policy Gradient Approach
- URL: http://arxiv.org/abs/2006.07476v1
- Date: Fri, 12 Jun 2020 21:16:29 GMT
- Title: Combining Model-Based and Model-Free Methods for Nonlinear Control: A
Provably Convergent Policy Gradient Approach
- Authors: Guannan Qu, Chenkai Yu, Steven Low, Adam Wierman
- Abstract summary: We develop a novel approach that uses the linear model to define a warm start for a model-free policy gradient method.
We show this hybrid approach outperforms the model-based controller while avoiding the convergence issues associated with model-free approaches.
- Score: 10.648049177775686
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Model-free learning-based control methods have seen great success recently.
However, such methods typically suffer from poor sample complexity and limited
convergence guarantees. This is in sharp contrast to classical model-based
control, which has a rich theory but typically requires strong modeling
assumptions. In this paper, we combine the two approaches to achieve the best
of both worlds. We consider a dynamical system with both linear and non-linear
components and develop a novel approach to use the linear model to define a
warm start for a model-free policy gradient method. Via both numerical
experiments and theoretical analysis, we show that this hybrid approach
outperforms the model-based controller while avoiding the convergence issues
associated with model-free approaches; in particular, we derive sufficient
conditions on the non-linear component under which our approach is guaranteed
to converge to the (nearly) globally optimal controller.
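As a concrete illustration of the two-stage recipe described above, here is a
minimal Python sketch: solve a discrete-time LQR on the linear part of the
dynamics to obtain a warm-start gain, then refine that gain with a model-free,
zeroth-order policy gradient on the true nonlinear system. The system
matrices, the nonlinear residual f, the two-point gradient estimator, and all
hyperparameters are illustrative assumptions, not the paper's exact algorithm
or benchmark.
```python
import numpy as np
from scipy.linalg import solve_discrete_are

# Illustrative 2-state system x_{t+1} = A x + B u + f(x); A, B, Q, R and
# the nonlinear residual f are assumptions for this sketch.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q, R = np.eye(2), np.eye(1)
f = lambda x: 0.01 * np.sin(x)  # small nonlinear component

def cost(K, horizon=50):
    """Finite-horizon cost of u = -K x on the true (nonlinear) dynamics."""
    x = np.array([1.0, 0.0])
    J = 0.0
    for _ in range(horizon):
        u = -K @ x
        J += x @ Q @ x + u @ R @ u
        x = A @ x + B @ u + f(x)
    return J

# Stage 1 (model-based warm start): LQR gain for the linear part only.
P = solve_discrete_are(A, B, Q, R)
K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)  # u = -K x

# Stage 2 (model-free refinement): two-point zeroth-order policy gradient
# on the true nonlinear system, initialized at the LQR gain.
rng = np.random.default_rng(0)
lr, radius = 1e-3, 0.05
for _ in range(200):
    U = rng.standard_normal(K.shape)
    g = (cost(K + radius * U) - cost(K - radius * U)) / (2 * radius) * U
    K -= lr * g
print("refined gain:", K)
```
The point of the warm start is that the model-free search begins at a
stabilizing controller, so the gradient refinement only has to account for
the nonlinear residual rather than discover a stabilizing policy from scratch.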
Related papers
- Model-Free Active Exploration in Reinforcement Learning [53.786439742572995]
We study the problem of exploration in Reinforcement Learning and present a novel model-free solution.
Our strategy is able to identify efficient policies faster than state-of-the-art exploration approaches.
arXiv Detail & Related papers (2024-06-30T19:00:49Z)
- Data-driven Nonlinear Model Reduction using Koopman Theory: Integrated Control Form and NMPC Case Study [56.283944756315066]
We propose generic model structures combining delay-coordinate encoding of measurements and full-state decoding to integrate reduced Koopman modeling and state estimation.
A case study demonstrates that our approach provides accurate control models and enables real-time capable nonlinear model predictive control of a high-purity cryogenic distillation column.
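For readers unfamiliar with delay-coordinate Koopman modeling, here is a
minimal sketch of the underlying idea: stack delayed measurements into an
augmented state, fit a linear one-step operator by least squares, and decode
the next measurement from the predicted delay vector. The toy data, delay
depth, and plain least-squares (DMD-style) fit are assumptions; the paper's
integrated control form and NMPC machinery go well beyond this.
```python
import numpy as np

rng = np.random.default_rng(1)

# Toy measurement sequence y_0..y_{T-1} standing in for real process data.
T, d = 200, 1
y = np.cumsum(rng.standard_normal((T, d)), axis=0)

# Delay-coordinate encoding: z_t = [y_t, y_{t-1}, ..., y_{t-q+1}].
q = 5
Z = np.hstack([y[q - 1 - k : T - k] for k in range(q)])  # (T-q+1, d*q)

# Fit a linear (Koopman-style) one-step predictor z_{t+1} ~= z_t @ K
# by least squares on consecutive delay vectors.
X, Y = Z[:-1], Z[1:]
K, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Decoding is trivial here: the first d entries of the predicted delay
# vector are the next measurement.
y_next_pred = (Z[-1] @ K)[:d]
print("predicted next measurement:", y_next_pred)
```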
arXiv Detail & Related papers (2024-01-09T11:54:54Z)
- When to Update Your Model: Constrained Model-based Reinforcement Learning [50.74369835934703]
We propose a novel and general theoretical scheme that yields a non-decreasing performance guarantee for model-based RL (MBRL).
The bounds we derive reveal the relationship between model shifts and performance improvement.
A further example demonstrates that learning models from a dynamically-varying number of explorations benefits the eventual returns.
arXiv Detail & Related papers (2022-10-15T17:57:43Z)
- Planning with Diffusion for Flexible Behavior Synthesis [125.24438991142573]
We consider what it would look like to fold as much of the trajectory optimization pipeline as possible into the modeling problem.
The core of our technical approach lies in a diffusion probabilistic model that plans by iteratively denoising trajectories.
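A toy sketch of planning by iterative denoising, under an assumed DDPM-style
noise schedule: start from a pure-noise trajectory and repeatedly apply a
reverse-diffusion update. The noise-prediction network is replaced by a
placeholder function, since training a diffusion model (and conditioning it
on rewards, as the paper does) is out of scope here.
```python
import numpy as np

rng = np.random.default_rng(2)

H, d = 32, 4  # planning horizon and state-action dimension
steps = 50
betas = np.linspace(1e-4, 0.02, steps)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def eps_model(traj, t):
    """Placeholder for a trained noise-prediction network eps(x_t, t)."""
    return 0.1 * traj

# Plan by iteratively denoising a trajectory, starting from pure noise.
traj = rng.standard_normal((H, d))
for t in reversed(range(steps)):
    eps = eps_model(traj, t)
    # DDPM-style reverse mean update.
    traj = (traj - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
    if t > 0:  # noise is added on all but the final step
        traj += np.sqrt(betas[t]) * rng.standard_normal((H, d))
print("planned trajectory shape:", traj.shape)
```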
arXiv Detail & Related papers (2022-05-20T07:02:03Z)
- Model-Based Reinforcement Learning via Stochastic Hybrid Models [39.83837705993256]
This paper adopts a hybrid-system view of nonlinear modeling and control.
We consider a sequence modeling paradigm that captures the temporal structure of the data.
We show that these time-series models naturally admit a closed-loop extension that we use to extract local feedback controllers.
arXiv Detail & Related papers (2021-11-11T14:05:46Z)
- Combining Gaussian processes and polynomial chaos expansions for stochastic nonlinear model predictive control [0.0]
We introduce a new algorithm to explicitly consider time-invariant uncertainties in optimal control problems.
The main novelty in this paper is to use this combination efficiently to obtain mean and variance estimates of nonlinear transformations of the uncertain inputs.
It is shown how to formulate both chance-constraints and a probabilistic objective for the optimal control problem.
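The mean/variance bookkeeping that a polynomial chaos expansion provides is
easy to show in isolation. Below is a minimal sketch for a nonlinear
transformation of a standard Gaussian input in the probabilists' Hermite
basis; the test function, degree, and regression fit are assumptions, and the
paper's contribution is combining such estimates with Gaussian processes
inside a stochastic optimal control problem.
```python
import numpy as np
from math import factorial
from numpy.polynomial import hermite_e as He  # probabilists' Hermite basis

rng = np.random.default_rng(3)

g = lambda xi: np.exp(0.3 * xi)  # nonlinear transformation (assumed)
deg = 6

# Fit PCE coefficients g(xi) ~= sum_n c_n He_n(xi) by regression on
# samples of the standard Gaussian input xi.
xi = rng.standard_normal(2000)
V = He.hermevander(xi, deg)  # design matrix of He_0 .. He_deg
c, *_ = np.linalg.lstsq(V, g(xi), rcond=None)

# Orthogonality under N(0,1) gives E[He_n He_m] = n! * delta_nm, so the
# mean and variance fall out of the coefficients directly.
mean = c[0]
var = sum(c[n] ** 2 * factorial(n) for n in range(1, deg + 1))
print("PCE mean:", mean, "(exact:", np.exp(0.3**2 / 2), ")  variance:", var)
```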
arXiv Detail & Related papers (2021-03-09T14:25:08Z)
- COMBO: Conservative Offline Model-Based Policy Optimization [120.55713363569845]
Uncertainty estimation with complex models, such as deep neural networks, can be difficult and unreliable.
We develop a new model-based offline RL algorithm, COMBO, that regularizes the value function on out-of-support state-actions.
We find that COMBO consistently performs as well as or better than prior offline model-free and model-based methods.
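The regularizer at the heart of COMBO is simple to state: on top of a
standard Bellman loss, push Q-values down on state-actions drawn from model
rollouts (potentially out-of-support) and up on state-actions from the
dataset. Below is a schematic rendering with made-up batches and a toy linear
critic; COMBO's actual objective and its theoretical guarantees are in the
paper.
```python
import numpy as np

rng = np.random.default_rng(4)

def q_value(s, a, w):
    """Toy linear critic; a stand-in for a neural Q-network."""
    return np.concatenate([s, a], axis=-1) @ w

w = rng.standard_normal(6)
s_data, a_data = rng.standard_normal((64, 4)), rng.standard_normal((64, 2))
s_model, a_model = rng.standard_normal((64, 4)), rng.standard_normal((64, 2))
target = rng.standard_normal(64)  # stand-in for r + gamma * Q_target
beta = 1.0  # conservatism weight

# Standard Bellman error on dataset transitions ...
bellman = np.mean((q_value(s_data, a_data, w) - target) ** 2)
# ... plus the conservative term: penalize Q on model-generated
# state-actions, reward Q on dataset state-actions.
conservative = (np.mean(q_value(s_model, a_model, w))
                - np.mean(q_value(s_data, a_data, w)))
loss = bellman + beta * conservative
print("regularized critic loss:", loss)
```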
arXiv Detail & Related papers (2021-02-16T18:50:32Z)
- Policy Gradient Methods for the Noisy Linear Quadratic Regulator over a Finite Horizon [3.867363075280544]
We explore reinforcement learning methods for finding the optimal policy in the linear quadratic regulator (LQR) problem.
We establish a global linear convergence guarantee for the setting of a finite time horizon and stochastic state dynamics under weak assumptions.
We present results both for the case where a model of the underlying dynamics is assumed and for the case where the method is applied to data directly.
arXiv Detail & Related papers (2020-11-20T09:51:49Z)
- Control as Hybrid Inference [62.997667081978825]
We present an implementation of control as hybrid inference (CHI) which naturally mediates the balance between iterative and amortised inference.
We verify the scalability of our algorithm on a continuous control benchmark, demonstrating that it outperforms strong model-free and model-based baselines.
arXiv Detail & Related papers (2020-07-11T19:44:09Z)