Deep Reinforcement Learning with Linear Quadratic Regulator Regions
- URL: http://arxiv.org/abs/2002.09820v2
- Date: Wed, 26 Feb 2020 03:46:10 GMT
- Title: Deep Reinforcement Learning with Linear Quadratic Regulator Regions
- Authors: Gabriel I. Fernandez, Colin Togashi, Dennis W. Hong, Lin F. Yang
- Abstract summary: We propose a novel method that guarantees a stable region of attraction for the output of a policy trained in simulation.
We have tested our new method by transferring simulated policies for a swing-up inverted pendulum to real systems and demonstrated its efficacy.
- Score: 26.555266610250797
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Practitioners often rely on compute-intensive domain randomization to ensure
reinforcement learning policies trained in simulation can robustly transfer to
the real world. Due to unmodeled nonlinearities in the real system, however,
even such simulated policies can still fail to perform stably enough to acquire
experience in real environments. In this paper we propose a novel method that
guarantees a stable region of attraction for the output of a policy trained in
simulation, even for highly nonlinear systems. Our core technique is to use
"bias-shifted" neural networks for constructing the controller and training the
network in the simulator. The modified neural networks not only capture the
nonlinearities of the system but also provably preserve linearity in a certain
region of the state space and thus can be tuned to resemble a linear quadratic
regulator that is known to be stable for the real system. We have tested our
new method by transferring simulated policies for a swing-up inverted pendulum
to real systems and demonstrated its efficacy.
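As a rough sketch of the mechanism (our reading of the abstract, not the paper's exact architecture: the pendulum constants, network width, and shift value below are hypothetical): compute an LQR gain for the linearized system, then shift the hidden-layer biases so every ReLU unit stays on its linear piece near the equilibrium, which makes the network exactly linear there and lets it be tuned to match the LQR.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# LQR for a linearized pendulum about the upright equilibrium,
# x = [theta, theta_dot], x_dot = A x + B u (constants are hypothetical).
g, l, m = 9.81, 0.5, 0.15
A = np.array([[0.0, 1.0], [g / l, 0.0]])
B = np.array([[0.0], [1.0 / (m * l**2)]])
Q, R = np.eye(2), np.eye(1)
P = solve_continuous_are(A, B, Q, R)      # Riccati solution
K = np.linalg.solve(R, B.T @ P)           # LQR gain, u = -K x

# "Bias-shifted" ReLU layer (our reading of the idea): if the shift c keeps
# W1 @ x + c > 0 for all x near the equilibrium, every unit sits on its
# linear piece there, so the network is exactly linear in that region.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(32, 2))
c = 10.0 * np.ones(32)                    # shift keeps pre-activations positive
W2 = -K @ np.linalg.pinv(W1)              # tune output layer so W2 @ W1 = -K
b2 = -W2 @ c                              # cancel the shift at the output

def policy(x):
    h = np.maximum(W1 @ x + c, 0.0)       # near equilibrium: h = W1 @ x + c
    return W2 @ h + b2                    # there u = W2 @ W1 @ x = -K @ x

x = np.array([0.05, -0.10])               # small deviation from upright
assert np.allclose(policy(x), -K @ x)     # matches the LQR inside the region
```

Outside the shifted region the units can change pieces, which presumably is where training in the simulator shapes the nonlinear part of the policy.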
Related papers
- Reconstructing dynamics from sparse observations with no training on target system [0.0]
The power of the proposed hybrid machine-learning framework is demonstrated using a large number of prototypical nonlinear dynamical systems.
The framework provides a paradigm for reconstructing complex, nonlinear dynamics in the extreme situation where no training data from the target system exists and the observations are random and sparse.
arXiv Detail & Related papers (2024-10-28T17:05:04Z)
- Gaussian Splatting to Real World Flight Navigation Transfer with Liquid Networks [93.38375271826202]
We present a method to improve generalization and robustness to distribution shifts in sim-to-real visual quadrotor navigation tasks.
We first build a simulator by integrating Gaussian splatting with quadrotor flight dynamics, and then, train robust navigation policies using Liquid neural networks.
In this way, we obtain a full-stack imitation learning protocol that combines advances in 3D Gaussian splatting radiance field rendering, programming of expert demonstration training data, and the task understanding capabilities of Liquid networks.
arXiv Detail & Related papers (2024-06-21T13:48:37Z)
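Since "Liquid neural networks" carries most of the weight in that summary: their building block is the liquid time-constant (LTC) cell, roughly the update below (a sketch after Hasani et al.'s LTC formulation; the sizes, constants, and fused Euler step here are illustrative, not from this paper).

```python
import numpy as np

# One LTC neuron layer: dx/dt = -(1/tau + f(x, I)) * x + f(x, I) * A,
# stepped with the fused semi-implicit Euler update.
rng = np.random.default_rng(0)
n_hidden, n_in = 8, 4
W = rng.normal(size=(n_hidden, n_hidden))
U = rng.normal(size=(n_hidden, n_in))
b = np.zeros(n_hidden)
tau = np.ones(n_hidden)                   # base time constants
A = rng.normal(size=n_hidden)             # learned bias vector of the ODE

def ltc_step(x, I, dt=0.05):
    f = 1.0 / (1.0 + np.exp(-(W @ x + U @ I + b)))   # sigmoid gate
    return (x + dt * f * A) / (1.0 + dt * (1.0 / tau + f))

x = np.zeros(n_hidden)
for I in rng.normal(size=(10, n_in)):     # roll the cell over an input stream
    x = ltc_step(x, I)
```

The input-dependent effective time constant is what is credited with the robustness to distribution shift.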
- Lyapunov-stable Neural Control for State and Output Feedback: A Novel Formulation [67.63756749551924]
Learning-based neural network (NN) control policies have shown impressive empirical performance in a wide range of tasks in robotics and control.
Lyapunov stability guarantees over the region-of-attraction (ROA) for NN controllers with nonlinear dynamical systems are challenging to obtain.
We demonstrate a new framework for learning NN controllers together with Lyapunov certificates using fast empirical falsification and strategic regularizations.
arXiv Detail & Related papers (2024-04-11T17:49:15Z)
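For context, the certificate such frameworks search for is a function V and a sublevel set S satisfying the standard discrete-time Lyapunov/ROA conditions (textbook form, not necessarily this paper's exact formulation):

```latex
% Closed loop x_{t+1} = f(x_t, \pi(x_t)) with equilibrium x^*,
% certified region of attraction S = \{ x : V(x) \le \rho \}:
V(x^*) = 0, \qquad V(x) > 0 \quad \forall x \in \mathcal{S}\setminus\{x^*\},
\qquad V\!\big(f(x,\pi(x))\big) - V(x) < 0 \quad \forall x \in \mathcal{S}\setminus\{x^*\}.
```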
- Differentially Flat Learning-based Model Predictive Control Using a Stability, State, and Input Constraining Safety Filter [10.52705437098686]
Learning-based optimal control algorithms control unknown systems using past trajectory data and a learned model of the system dynamics.
We present a novel nonlinear controller that exploits differential flatness to achieve similar performance to state-of-the-art learning-based controllers.
arXiv Detail & Related papers (2023-07-20T02:42:23Z)
- In-Distribution Barrier Functions: Self-Supervised Policy Filters that Avoid Out-of-Distribution States [84.24300005271185]
We propose a control filter that wraps any reference policy and effectively encourages the system to stay in-distribution with respect to offline-collected safe demonstrations.
Our method is effective for two different visuomotor control tasks in simulation environments, including both top-down and egocentric view settings.
arXiv Detail & Related papers (2023-01-27T22:28:19Z)
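Stripped to a single barrier constraint, the generic shape of such a wrapping control filter is a minimal projection of the reference action (a sketch only: in the paper the barrier is learned self-supervised from safe demonstrations, while h, f, g, and alpha here are stand-ins).

```python
import numpy as np

def filter_action(x, u_ref, h, grad_h, f, g, alpha=1.0):
    """Minimally perturb u_ref so the barrier condition
    grad_h(x) . (f(x) + g(x) u) + alpha * h(x) >= 0 holds
    for control-affine dynamics x_dot = f(x) + g(x) u."""
    a = g(x).T @ grad_h(x)                 # constraint normal in action space
    c = grad_h(x) @ f(x) + alpha * h(x)
    slack = a @ u_ref + c
    if slack >= 0.0:                       # reference action already safe
        return u_ref
    return u_ref - (slack / (a @ a)) * a   # closed-form projection onto the half-space
```

The projection only shows the wrapping mechanism; the paper's contribution is making h encode "in-distribution with respect to the demonstrations" rather than a hand-designed safety set.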
- Adaptive Robust Model Predictive Control via Uncertainty Cancellation [25.736296938185074]
We propose a learning-based robust predictive control algorithm that compensates for significant uncertainty in the dynamics.
We optimize over a class of nonlinear feedback policies inspired by certainty equivalent "estimate-and-cancel" control laws.
arXiv Detail & Related papers (2022-12-02T18:54:23Z)
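In its simplest certainty-equivalent form, "estimate-and-cancel" looks like the following (a toy sketch; the paper embeds the idea in a robust MPC with uncertainty sets, and the gain, dynamics, and estimator below are made up).

```python
import numpy as np

# Matched uncertainty: x+ = A x + B (u + d(x)). Applying
# u = -K x - d_hat(x) leaves only the residual d - d_hat in the loop.
def estimate_and_cancel(x, K, d_hat):
    return -K @ x - d_hat(x)

A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
K = np.array([[2.0, 3.0]])                         # a stabilizing gain
d = lambda x: np.array([0.5 * np.sin(x[0])])       # true matched uncertainty
d_hat = lambda x: np.array([0.45 * np.sin(x[0])])  # learned approximation

x = np.array([1.0, 0.0])
for _ in range(50):
    u = estimate_and_cancel(x, K, d_hat)
    x = A @ x + B @ (u + d(x))                     # residual d - d_hat remains
```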
- Learning over All Stabilizing Nonlinear Controllers for a Partially-Observed Linear System [4.3012765978447565]
We propose a parameterization of nonlinear output feedback controllers for linear dynamical systems.
Our approach guarantees the closed-loop stability of partially observable linear dynamical systems without requiring any constraints to be satisfied.
arXiv Detail & Related papers (2021-12-08T10:43:47Z)
- Adaptive Robust Model Predictive Control with Matched and Unmatched Uncertainty [28.10549712956161]
We propose a learning-based robust predictive control algorithm that can handle large uncertainty in the dynamics for a class of discrete-time systems.
Motivated by the inability of existing learning-based predictive control algorithms to provide safety guarantees in the presence of large-magnitude uncertainty, we achieve significant performance improvements.
arXiv Detail & Related papers (2021-04-16T17:47:02Z)
- Controlling nonlinear dynamical systems into arbitrary states using machine learning [77.34726150561087]
We propose a novel, fully data-driven control scheme that relies on machine learning (ML).
Exploiting recently developed ML-based prediction capabilities of complex systems, we demonstrate that nonlinear systems can be forced to stay in arbitrary dynamical target states coming from any initial state.
Because this highly flexible control scheme places few demands on the amount of required data, we briefly discuss possible applications ranging from engineering to medicine.
arXiv Detail & Related papers (2021-02-23T16:58:26Z)
- Active Learning for Nonlinear System Identification with Guarantees [102.43355665393067]
We study a class of nonlinear dynamical systems whose state transitions depend linearly on a known feature embedding of state-action pairs.
We propose an active learning approach that achieves this by repeating three steps: trajectory planning, trajectory tracking, and re-estimation of the system from all available data.
We show that our method estimates nonlinear dynamical systems at a parametric rate, similar to the statistical rate of standard linear regression.
arXiv Detail & Related papers (2020-06-18T04:54:11Z)
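For the model class in that summary, x_{t+1} = A* phi(x_t, u_t) with a known feature map phi and unknown matrix A*, the re-estimation step reduces to linear regression in feature space (a sketch; phi below is a made-up embedding, not the paper's).

```python
import numpy as np

def phi(x, u):
    """Hypothetical feature embedding of a state-action pair."""
    return np.concatenate([x, u, np.sin(x), x * np.abs(x)])

def estimate(transitions):
    """Least-squares fit of A* from logged (x, u, x_next) triples:
    stack rows so Xn (T, n) ~= Phi (T, d) @ A*.T (d, n)."""
    Phi = np.stack([phi(x, u) for x, u, _ in transitions])
    Xn = np.stack([xn for _, _, xn in transitions])
    A_hat_T, *_ = np.linalg.lstsq(Phi, Xn, rcond=None)
    return A_hat_T.T                      # (n, d) estimate of A*
```

The active-learning loop then plans trajectories to excite poorly estimated feature directions before refitting.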
- Guided Uncertainty-Aware Policy Optimization: Combining Learning and Model-Based Strategies for Sample-Efficient Policy Learning [75.56839075060819]
Traditional robotic approaches rely on an accurate model of the environment, a detailed description of how to perform the task, and a robust perception system to keep track of the current state.
Reinforcement learning approaches, by contrast, can operate directly from raw sensory inputs with only a reward signal to describe the task, but they are extremely sample-inefficient and brittle.
In this work, we combine the strengths of model-based methods with the flexibility of learning-based methods to obtain a general method that is able to overcome inaccuracies in the robotics perception/actuation pipeline.
arXiv Detail & Related papers (2020-05-21T19:47:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.