Bridging Model-based Safety and Model-free Reinforcement Learning through System Identification of Low Dimensional Linear Models
- URL: http://arxiv.org/abs/2205.05787v1
- Date: Wed, 11 May 2022 22:03:18 GMT
- Title: Bridging Model-based Safety and Model-free Reinforcement Learning through System Identification of Low Dimensional Linear Models
- Authors: Zhongyu Li, Jun Zeng, Akshay Thirugnanam, Koushil Sreenath
- Abstract summary: We propose a new method to combine model-based safety with model-free reinforcement learning.
We show that a low-dimensional dynamical model is sufficient to capture the dynamics of the closed-loop system.
We illustrate that the identified linear model can provide safety guarantees within a safety-critical optimal control framework.
- Score: 16.511440197186918
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bridging model-based safety and model-free reinforcement learning (RL) for
dynamic robots is appealing, since model-based methods can provide formal
safety guarantees while RL-based methods can exploit the robot's agility by
learning from the full-order system dynamics. However, current approaches to
this problem are mostly restricted to simple systems. In this paper, we
propose a new method to combine model-based safety with model-free
reinforcement learning by explicitly finding a low-dimensional model of the
system controlled by an RL policy and applying stability and safety
guarantees on that simple model. We use the bipedal robot Cassie, a
high-dimensional nonlinear system with hybrid dynamics and underactuation,
together with its RL-based walking controller, as an example. We show that a
low-dimensional dynamical model is sufficient to capture the dynamics of the
closed-loop system. We demonstrate that this model is linear, asymptotically
stable, and decoupled across control inputs in all dimensions. We further
show that such linearity persists even when using different RL control
policies. These results point to an interesting question about the
relationship between RL and optimal control: whether RL tends to linearize
the nonlinear system during training in some cases. Furthermore, we
illustrate that the identified linear model can provide safety guarantees
through a safety-critical optimal control framework, e.g., Model Predictive
Control with Control Barrier Functions, on an autonomous navigation example
with Cassie that takes advantage of the agility provided by the RL-based
controller.
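
The identification step described in the abstract lends itself to a short illustration. Below is a minimal, hypothetical sketch (not the authors' code): it fits a discrete-time linear model x_{k+1} = A x_k + B u_k to logged closed-loop data by least squares and checks asymptotic stability via the spectral radius of the identified A. The state/command dimensions, the synthetic data, and all variable names are assumptions for illustration only.

```python
import numpy as np

# Hypothetical logged rollout of the RL-controlled closed-loop system:
# low-dimensional states X (T x n) and commands U (T x m). In the paper the
# reduced state is a small quantity (e.g., pelvis velocities), not the full
# robot state; here we simply simulate a stand-in linear system.
rng = np.random.default_rng(0)
T, n, m = 500, 2, 2
A_true = np.array([[0.90, 0.00], [0.00, 0.85]])
B_true = np.array([[0.10, 0.00], [0.00, 0.15]])
X = np.zeros((T, n))
U = rng.uniform(-1.0, 1.0, size=(T, m))
for k in range(T - 1):
    X[k + 1] = A_true @ X[k] + B_true @ U[k] + 0.01 * rng.standard_normal(n)

# Least-squares system identification of x_{k+1} ~= A x_k + B u_k:
# stack regressors Z = [x_k, u_k] and solve Z @ Theta ~= x_{k+1}.
Z = np.hstack([X[:-1], U[:-1]])                      # (T-1) x (n+m)
Theta, *_ = np.linalg.lstsq(Z, X[1:], rcond=None)
A_hat, B_hat = Theta[:n].T, Theta[n:].T

# Asymptotic stability of the identified model: spectral radius of A < 1.
rho = max(abs(np.linalg.eigvals(A_hat)))
print("identified A:\n", A_hat)
print("identified B:\n", B_hat)
print("spectral radius:", rho, "(stable)" if rho < 1 else "(unstable)")
```

If the decoupling observed in the abstract holds, the identified A_hat and B_hat should additionally come out close to diagonal.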
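
The safety side can be illustrated with a drastically simplified stand-in for MPC-CBF: a one-step discrete-time control barrier function filter acting on the identified linear model. This is a hedged sketch, not the paper's receding-horizon formulation; the circular obstacle, the decay rate gamma, the placeholder model values, and the SciPy-based solver are all assumptions.

```python
import numpy as np
from scipy.optimize import minimize

# Placeholder identified low-dimensional linear model (illustrative values).
A = np.array([[0.90, 0.00], [0.00, 0.85]])
B = np.array([[0.10, 0.00], [0.00, 0.15]])

# Circular obstacle for a planar navigation example; h(x) >= 0 means safe.
x_obs, r_obs = np.array([1.0, 0.5]), 0.4
def h(x):
    return float(np.dot(x - x_obs, x - x_obs) - r_obs ** 2)

gamma = 0.2  # discrete-time CBF decay rate, 0 < gamma <= 1

def safe_command(x, u_nom):
    """Minimally modify the nominal command so the one-step discrete CBF
    condition h(A x + B u) - h(x) >= -gamma * h(x) holds."""
    cbf = {"type": "ineq",
           "fun": lambda u: h(A @ x + B @ u) - (1.0 - gamma) * h(x)}
    res = minimize(lambda u: np.sum((u - u_nom) ** 2), u_nom,
                   method="SLSQP", constraints=[cbf])
    return res.x

x = np.array([0.4, 0.5])          # current reduced state (safe: h(x) > 0)
u_nom = np.array([1.0, 0.0])      # nominal command pushing toward the obstacle
u = safe_command(x, u_nom)
print("nominal:", u_nom, "filtered:", u, "h at next state:", h(A @ x + B @ u))
```

A full MPC-CBF controller would enforce this constraint over a prediction horizon and optimize a tracking cost, but the single-step filter already shows how the linear model makes the safety constraint tractable.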
Related papers
- Learning Exactly Linearizable Deep Dynamics Models [0.07366405857677226]
We propose a learning method for exactly linearizable dynamical models, to which various control-theoretic techniques for ensuring stability, reliability, etc. can be readily applied.
The proposed model is employed for the real-time control of an automotive engine, and the results demonstrate good predictive performance and stable control under constraints.
arXiv Detail & Related papers (2023-11-30T05:40:55Z)
- In-Distribution Barrier Functions: Self-Supervised Policy Filters that Avoid Out-of-Distribution States [84.24300005271185]
We propose a control filter that wraps any reference policy and effectively encourages the system to stay in-distribution with respect to offline-collected safe demonstrations.
Our method is effective for two different visuomotor control tasks in simulation environments, including both top-down and egocentric view settings.
arXiv Detail & Related papers (2023-01-27T22:28:19Z)
- Neural Abstractions [72.42530499990028]
We present a novel method for the safety verification of nonlinear dynamical models that uses neural networks to represent abstractions of their dynamics.
We demonstrate that our approach performs comparably to the mature tool Flow* on existing benchmark nonlinear models.
arXiv Detail & Related papers (2023-01-27T12:38:09Z)
- Efficient Learning of Voltage Control Strategies via Model-based Deep Reinforcement Learning [9.936452412191326]
This article proposes a model-based deep reinforcement learning (DRL) method to design emergency control strategies for short-term voltage stability problems in power systems.
Recent advances show promising results in model-free DRL-based methods for power systems, but model-free methods suffer from poor sample efficiency and long training times.
We propose a novel model-based-DRL framework where a deep neural network (DNN)-based dynamic surrogate model is utilized with the policy learning framework.
arXiv Detail & Related papers (2022-12-06T02:50:53Z)
- Model-Based Reinforcement Learning with SINDy [0.0]
We propose a novel method for discovering the governing non-linear dynamics of physical systems in reinforcement learning (RL).
We establish that this method is capable of discovering the underlying dynamics using significantly fewer trajectories than state-of-the-art model learning algorithms (a generic sparse-regression sketch of this idea appears after this list).
arXiv Detail & Related papers (2022-08-30T19:03:48Z)
- Bridging the Model-Reality Gap with Lipschitz Network Adaptation [22.499090318313662]
As robots venture into the real world, they are subject to unmodeled dynamics and disturbances.
Traditional model-based control approaches have proven successful in relatively static and known operating environments.
We propose a method that bridges the model-reality gap and enables the application of model-based approaches even if dynamic uncertainties are present.
arXiv Detail & Related papers (2021-12-07T15:12:49Z)
- Reinforcement Learning as One Big Sequence Modeling Problem [84.84564880157149]
Reinforcement learning (RL) is typically concerned with estimating single-step policies or single-step models.
We view RL as a sequence modeling problem, with the goal being to predict a sequence of actions that leads to a sequence of high rewards.
arXiv Detail & Related papers (2021-06-03T17:58:51Z)
- Two-step reinforcement learning for model-free redesign of nonlinear optimal regulator [1.5624421399300306]
Reinforcement learning (RL) is one of the promising approaches that enable model-free redesign of optimal controllers for nonlinear dynamical systems.
We propose a model-free two-step design approach that improves the transient learning performance of RL in an optimal regulator redesign problem for unknown nonlinear systems.
arXiv Detail & Related papers (2021-03-05T17:12:33Z)
- Reinforcement Learning for Safety-Critical Control under Model Uncertainty, using Control Lyapunov Functions and Control Barrier Functions [96.63967125746747]
A reinforcement learning framework learns the model uncertainty present in the CBF and CLF constraints.
The resulting RL-CBF-CLF-QP addresses the problem of model uncertainty in the safety constraints.
arXiv Detail & Related papers (2020-04-16T10:51:33Z)
- Adaptive Control and Regret Minimization in Linear Quadratic Gaussian (LQG) Setting [91.43582419264763]
We propose LqgOpt, a novel reinforcement learning algorithm based on the principle of optimism in the face of uncertainty.
LqgOpt efficiently explores the system dynamics, estimates the model parameters up to their confidence interval, and deploys the controller of the most optimistic model.
arXiv Detail & Related papers (2020-03-12T19:56:38Z)
- Information Theoretic Model Predictive Q-Learning [64.74041985237105]
We present a novel theoretical connection between information theoretic MPC and entropy regularized RL.
We develop a Q-learning algorithm that can leverage biased models.
arXiv Detail & Related papers (2019-12-31T00:29:22Z)
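
For the SINDy entry above, a generic sketch may help clarify the idea: sequential thresholded least squares over a library of candidate functions. This is a minimal, hypothetical example on a toy damped pendulum, not the cited paper's code; the candidate library, threshold, and toy system are all assumptions.

```python
import numpy as np

# SINDy-style sparse regression: given state samples X and their time
# derivatives dX, find sparse coefficients Xi with dX ~= Theta(X) @ Xi,
# where Theta(X) is a library of candidate terms.
def library(X):
    x, y = X[:, 0], X[:, 1]
    return np.column_stack([np.ones_like(x), x, y, x * y, x**2, y**2, np.sin(x)])

def sindy(X, dX, threshold=0.05, iters=10):
    Theta = library(X)
    Xi, *_ = np.linalg.lstsq(Theta, dX, rcond=None)
    for _ in range(iters):                        # sequential thresholded least squares
        Xi[np.abs(Xi) < threshold] = 0.0
        for j in range(dX.shape[1]):              # refit each state dimension on
            keep = np.abs(Xi[:, j]) >= threshold  # the surviving library terms
            if keep.any():
                Xi[keep, j], *_ = np.linalg.lstsq(Theta[:, keep], dX[:, j], rcond=None)
    return Xi

# Toy data from a damped pendulum: dx = y, dy = -0.2*y - sin(x).
rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, size=(400, 2))
dX = np.column_stack([X[:, 1], -0.2 * X[:, 1] - np.sin(X[:, 0])])
print(sindy(X, dX))  # nonzero rows should pick out y, -0.2*y, and -sin(x)
```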
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.