Bridging Model-based Safety and Model-free Reinforcement Learning through System Identification of Low Dimensional Linear Models
- URL: http://arxiv.org/abs/2205.05787v1
- Date: Wed, 11 May 2022 22:03:18 GMT
- Title: Bridging Model-based Safety and Model-free Reinforcement Learning through System Identification of Low Dimensional Linear Models
- Authors: Zhongyu Li, Jun Zeng, Akshay Thirugnanam, Koushil Sreenath
- Abstract summary: We propose a new method to combine model-based safety with model-free reinforcement learning.
We show that a low-dimensional dynamical model is sufficient to capture the dynamics of the closed-loop system.
We illustrate that the identified linear model can provide safety guarantees within a safety-critical optimal control framework.
- Score: 16.511440197186918
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bridging model-based safety and model-free reinforcement learning (RL) for
dynamic robots is appealing, since model-based methods can provide formal
safety guarantees while RL-based methods can exploit the robot's agility by
learning from the full-order system dynamics. However, current approaches to
this problem are mostly restricted to simple systems. In this paper, we
propose a new method to combine model-based safety with model-free
reinforcement learning by explicitly finding a low-dimensional model of the
system controlled by an RL policy and applying stability and safety
guarantees on that simple model. We use the bipedal robot Cassie, a
high-dimensional nonlinear system with hybrid dynamics and underactuation,
together with its RL-based walking controller, as an example. We show that a
low-dimensional dynamical model is sufficient to capture the dynamics of the
closed-loop system. We demonstrate that this model is linear, asymptotically
stable, and decoupled across control inputs in all dimensions. We further
show that such linearity persists even when using different RL control
policies. These results point to an interesting question about the
relationship between RL and optimal control: whether RL tends to linearize
the nonlinear system during training in some cases. Furthermore, we
illustrate that the identified linear model can provide safety guarantees
through a safety-critical optimal control framework, e.g., Model Predictive
Control with Control Barrier Functions, on an autonomous navigation example
with Cassie that takes advantage of the agility provided by the RL-based
controller.
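
The identification step described in the abstract lends itself to a short illustration. Below is a minimal, hypothetical sketch (not the authors' code): it fits a discrete-time linear model x_{k+1} = A x_k + B u_k to logged closed-loop data by least squares and checks asymptotic stability via the spectral radius of the identified A. The state/command dimensions, the synthetic data, and all variable names are assumptions for illustration only.

```python
import numpy as np

# Hypothetical logged rollout of the RL-controlled closed-loop system:
# low-dimensional states X (T x n) and commands U (T x m). In the paper the
# reduced state is a small quantity (e.g., pelvis velocities), not the full
# robot state; here we simply simulate a stand-in linear system.
rng = np.random.default_rng(0)
T, n, m = 500, 2, 2
A_true = np.array([[0.90, 0.00], [0.00, 0.85]])
B_true = np.array([[0.10, 0.00], [0.00, 0.15]])
X = np.zeros((T, n))
U = rng.uniform(-1.0, 1.0, size=(T, m))
for k in range(T - 1):
    X[k + 1] = A_true @ X[k] + B_true @ U[k] + 0.01 * rng.standard_normal(n)

# Least-squares system identification of x_{k+1} ~= A x_k + B u_k:
# stack regressors Z = [x_k, u_k] and solve Z @ Theta ~= x_{k+1}.
Z = np.hstack([X[:-1], U[:-1]])                      # (T-1) x (n+m)
Theta, *_ = np.linalg.lstsq(Z, X[1:], rcond=None)
A_hat, B_hat = Theta[:n].T, Theta[n:].T

# Asymptotic stability of the identified model: spectral radius of A < 1.
rho = max(abs(np.linalg.eigvals(A_hat)))
print("identified A:\n", A_hat)
print("identified B:\n", B_hat)
print("spectral radius:", rho, "(stable)" if rho < 1 else "(unstable)")
```

If the decoupling observed in the abstract holds, the identified A_hat and B_hat should additionally come out close to diagonal.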
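
The safety side can be illustrated with a drastically simplified stand-in for MPC-CBF: a one-step discrete-time control barrier function filter acting on the identified linear model. This is a hedged sketch, not the paper's receding-horizon formulation; the circular obstacle, the decay rate gamma, the placeholder model values, and the SciPy-based solver are all assumptions.

```python
import numpy as np
from scipy.optimize import minimize

# Placeholder identified low-dimensional linear model (illustrative values).
A = np.array([[0.90, 0.00], [0.00, 0.85]])
B = np.array([[0.10, 0.00], [0.00, 0.15]])

# Circular obstacle for a planar navigation example; h(x) >= 0 means safe.
x_obs, r_obs = np.array([1.0, 0.5]), 0.4
def h(x):
    return float(np.dot(x - x_obs, x - x_obs) - r_obs ** 2)

gamma = 0.2  # discrete-time CBF decay rate, 0 < gamma <= 1

def safe_command(x, u_nom):
    """Minimally modify the nominal command so the one-step discrete CBF
    condition h(A x + B u) - h(x) >= -gamma * h(x) holds."""
    cbf = {"type": "ineq",
           "fun": lambda u: h(A @ x + B @ u) - (1.0 - gamma) * h(x)}
    res = minimize(lambda u: np.sum((u - u_nom) ** 2), u_nom,
                   method="SLSQP", constraints=[cbf])
    return res.x

x = np.array([0.4, 0.5])          # current reduced state (safe: h(x) > 0)
u_nom = np.array([1.0, 0.0])      # nominal command pushing toward the obstacle
u = safe_command(x, u_nom)
print("nominal:", u_nom, "filtered:", u, "h at next state:", h(A @ x + B @ u))
```

A full MPC-CBF controller would enforce this constraint over a prediction horizon and optimize a tracking cost, but the single-step filter already shows how the linear model makes the safety constraint tractable.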
Related papers
- Learning Exactly Linearizable Deep Dynamics Models [0.07366405857677226]
We propose a learning method for exactly linearizable dynamical models, to which various control-theoretic techniques for ensuring stability, reliability, etc. can be readily applied.
The proposed model is employed for the real-time control of an automotive engine, and the results demonstrate good predictive performance and stable control under constraints.
arXiv Detail & Related papers (2023-11-30T05:40:55Z)
- In-Distribution Barrier Functions: Self-Supervised Policy Filters that Avoid Out-of-Distribution States [84.24300005271185]
We propose a control filter that wraps any reference policy and effectively encourages the system to stay in-distribution with respect to offline-collected safe demonstrations.
Our method is effective for two different visuomotor control tasks in simulation environments, including both top-down and egocentric view settings.
arXiv Detail & Related papers (2023-01-27T22:28:19Z)
- Neural Abstractions [72.42530499990028]
We present a novel method for the safety verification of nonlinear dynamical models that uses neural networks to represent abstractions of their dynamics.
We demonstrate that our approach performs comparably to the mature tool Flow* on existing benchmark nonlinear models.
arXiv Detail & Related papers (2023-01-27T12:38:09Z)
- Efficient Learning of Voltage Control Strategies via Model-based Deep Reinforcement Learning [9.936452412191326]
This article proposes a model-based deep reinforcement learning (DRL) method to design emergency control strategies for short-term voltage stability problems in power systems.
Recent advances show promising results in model-free DRL-based methods for power systems, but model-free methods suffer from poor sample efficiency and long training times.
We propose a novel model-based-DRL framework where a deep neural network (DNN)-based dynamic surrogate model is utilized with the policy learning framework.
arXiv Detail & Related papers (2022-12-06T02:50:53Z)
- Model-Based Reinforcement Learning with SINDy [0.0]
We propose a novel method for discovering the governing non-linear dynamics of physical systems in reinforcement learning (RL).
We establish that this method is capable of discovering the underlying dynamics using significantly fewer trajectories than state-of-the-art model learning algorithms (a generic sparse-regression sketch of this idea appears after this list).
arXiv Detail & Related papers (2022-08-30T19:03:48Z)
- Bridging the Model-Reality Gap with Lipschitz Network Adaptation [22.499090318313662]
As robots venture into the real world, they are subject to unmodeled dynamics and disturbances.
Traditional model-based control approaches have proven successful in relatively static and known operating environments.
We propose a method that bridges the model-reality gap and enables the application of model-based approaches even if dynamic uncertainties are present.
arXiv Detail & Related papers (2021-12-07T15:12:49Z)
- Reinforcement Learning as One Big Sequence Modeling Problem [84.84564880157149]
Reinforcement learning (RL) is typically concerned with estimating single-step policies or single-step models.
We view RL as a sequence modeling problem, with the goal being to predict a sequence of actions that leads to a sequence of high rewards.
arXiv Detail & Related papers (2021-06-03T17:58:51Z)
- Two-step reinforcement learning for model-free redesign of nonlinear optimal regulator [1.5624421399300306]
Reinforcement learning (RL) is one of the promising approaches that enable model-free redesign of optimal controllers for nonlinear dynamical systems.
We propose a model-free two-step design approach that improves the transient learning performance of RL in an optimal regulator redesign problem for unknown nonlinear systems.
arXiv Detail & Related papers (2021-03-05T17:12:33Z)
- Reinforcement Learning for Safety-Critical Control under Model Uncertainty, using Control Lyapunov Functions and Control Barrier Functions [96.63967125746747]
A reinforcement learning framework learns the model uncertainty present in the CBF and CLF constraints.
The resulting RL-CBF-CLF-QP addresses the problem of model uncertainty in the safety constraints.
arXiv Detail & Related papers (2020-04-16T10:51:33Z)
- Adaptive Control and Regret Minimization in Linear Quadratic Gaussian (LQG) Setting [91.43582419264763]
We propose LqgOpt, a novel reinforcement learning algorithm based on the principle of optimism in the face of uncertainty.
LqgOpt efficiently explores the system dynamics, estimates the model parameters up to their confidence interval, and deploys the controller of the most optimistic model.
arXiv Detail & Related papers (2020-03-12T19:56:38Z)
- Information Theoretic Model Predictive Q-Learning [64.74041985237105]
We present a novel theoretical connection between information theoretic MPC and entropy regularized RL.
We develop a Q-learning algorithm that can leverage biased models.
arXiv Detail & Related papers (2019-12-31T00:29:22Z)
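
For the SINDy entry above, a generic sketch may help clarify the idea: sequential thresholded least squares over a library of candidate functions. This is a minimal, hypothetical example on a toy damped pendulum, not the cited paper's code; the candidate library, threshold, and toy system are all assumptions.

```python
import numpy as np

# SINDy-style sparse regression: given state samples X and their time
# derivatives dX, find sparse coefficients Xi with dX ~= Theta(X) @ Xi,
# where Theta(X) is a library of candidate terms.
def library(X):
    x, y = X[:, 0], X[:, 1]
    return np.column_stack([np.ones_like(x), x, y, x * y, x**2, y**2, np.sin(x)])

def sindy(X, dX, threshold=0.05, iters=10):
    Theta = library(X)
    Xi, *_ = np.linalg.lstsq(Theta, dX, rcond=None)
    for _ in range(iters):                        # sequential thresholded least squares
        Xi[np.abs(Xi) < threshold] = 0.0
        for j in range(dX.shape[1]):              # refit each state dimension on
            keep = np.abs(Xi[:, j]) >= threshold  # the surviving library terms
            if keep.any():
                Xi[keep, j], *_ = np.linalg.lstsq(Theta[:, keep], dX[:, j], rcond=None)
    return Xi

# Toy data from a damped pendulum: dx = y, dy = -0.2*y - sin(x).
rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, size=(400, 2))
dX = np.column_stack([X[:, 1], -0.2 * X[:, 1] - np.sin(X[:, 0])])
print(sindy(X, dX))  # nonzero rows should pick out y, -0.2*y, and -sin(x)
```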
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.