Safely Learning Dynamical Systems
- URL: http://arxiv.org/abs/2305.12284v2
- Date: Sat, 8 Jun 2024 17:22:24 GMT
- Title: Safely Learning Dynamical Systems
- Authors: Amir Ali Ahmadi, Abraar Chaudhry, Vikas Sindhwani, Stephen Tu
- Abstract summary: A fundamental challenge in learning an unknown dynamical system is to reduce model uncertainty by making measurements while maintaining safety.
We formulate a mathematical definition of what it means to safely learn a dynamical system by sequentially deciding where to initialize trajectories.
We show how one can safely collect trajectories and fit a model of the nonlinear dynamics that is consistent with the initial uncertainty set.
- Score: 13.598953823625472
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A fundamental challenge in learning an unknown dynamical system is to reduce model uncertainty by making measurements while maintaining safety. We formulate a mathematical definition of what it means to safely learn a dynamical system by sequentially deciding where to initialize trajectories. The state of the system must stay within a safety region for a horizon of $T$ time steps under the action of all dynamical systems that (i) belong to a given initial uncertainty set, and (ii) are consistent with information gathered so far. First, we consider safely learning a linear dynamical system involving $n$ states. For the case $T=1$, we present an LP-based algorithm that either safely recovers the true dynamics from at most $n$ trajectories, or certifies that safe learning is impossible. For $T=2$, we give an SDP representation of the set of safe initial conditions and show that $\lceil n/2 \rceil$ trajectories generically suffice for safe learning. For $T = \infty$, we provide SDP-representable inner approximations of the set of safe initial conditions and show that one trajectory generically suffices for safe learning. We extend a number of our results to the cases where the initial uncertainty set contains sparse, low-rank, or permutation matrices, or when the system has a control input. Second, we consider safely learning a general class of nonlinear dynamical systems. For the case $T=1$, we give an SOCP-based representation of the set of safe initial conditions. For $T=\infty$, we provide semidefinite representable inner approximations to the set of safe initial conditions. We show how one can safely collect trajectories and fit a polynomial model of the nonlinear dynamics that is consistent with the initial uncertainty set and best agrees with the observations. We also present some extensions to cases where the measurements are noisy or the dynamical system involves disturbances.
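For the $T=1$ linear case, the following sketch finds a one-step-safe initial condition under two simplifying assumptions that are not from the paper: the uncertainty set is a hypothetical interval matrix $[A_{\mathrm{lo}}, A_{\mathrm{hi}}]$ and the safety region is the unit box. Under these assumptions the worst case of $Ax_0$ over the uncertainty set is expressible with linear constraints, so finding a safe $x_0$ is an LP, in the spirit of (but not identical to) the paper's algorithm.
```python
# Minimal sketch, assuming x_{t+1} = A x_t with A in an interval set
# [A_lo, A_hi] (elementwise) and safety region S = {x : ||x||_inf <= 1}.
# We search for x0 such that A @ x0 stays in S for every A in the set.
import numpy as np
import cvxpy as cp

n = 3
rng = np.random.default_rng(0)
A_nom = 0.4 * rng.standard_normal((n, n))   # hypothetical nominal dynamics
A_lo, A_hi = A_nom - 0.05, A_nom + 0.05     # hypothetical uncertainty set

x0 = cp.Variable(n)
constraints = [x0 <= 1, x0 >= -1]           # x0 itself must lie in S
for i in range(n):
    # Worst-case bounds on (A x0)_i: each entry A_ij independently sits at
    # whichever endpoint of [A_lo_ij, A_hi_ij] is extremal for A_ij * x0_j.
    upper = cp.sum(cp.maximum(cp.multiply(A_lo[i], x0), cp.multiply(A_hi[i], x0)))
    lower = cp.sum(cp.minimum(cp.multiply(A_lo[i], x0), cp.multiply(A_hi[i], x0)))
    constraints += [upper <= 1, lower >= -1]

# Any linear objective works; this one just picks out a nontrivial safe point.
cp.Problem(cp.Maximize(cp.sum(x0)), constraints).solve()
print("one-step-safe initial condition:", x0.value)
```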
Related papers
- Near-Optimal Dynamic Regret for Adversarial Linear Mixture MDPs [63.47351876442425]
We study episodic linear mixture MDPs with unknown transitions and adversarial rewards under full-information feedback.
We propose a novel algorithm that combines the benefits of two popular methods: occupancy-measure-based and policy-based.
Our algorithm enjoys an $\widetilde{\mathcal{O}}(d\sqrt{H^3 K} + \sqrt{HK(H + \bar{P}_K)})$ dynamic regret, where $d$ is the feature dimension.
arXiv Detail & Related papers (2024-11-05T13:55:52Z)
- Log Barriers for Safe Black-box Optimization with Application to Safe Reinforcement Learning [72.97229770329214]
We introduce a general approach for solving high-dimensional non-linear optimization problems in which maintaining safety during learning is crucial.
Our approach, called LBSGD, is based on applying a logarithmic barrier approximation with a carefully chosen step size.
We demonstrate the effectiveness of our approach on minimizing constraint violations in policy optimization tasks in safe reinforcement learning.
arXiv Detail & Related papers (2022-07-21T11:14:47Z)
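As a toy illustration of the log-barrier idea (not the paper's algorithm, whose step-size rule is derived with care), the sketch below descends on $f(x) - \eta \sum_i \log(-g_i(x))$ for a generic problem $\min f(x)$ subject to $g_i(x) \le 0$; the step-size cap and all function names are hypothetical simplifications.
```python
# Toy log-barrier gradient step: descend on f(x) - eta * sum_i log(-g_i(x))
# while keeping the iterate strictly feasible. The step-size cap is a crude
# placeholder for the carefully chosen rule analyzed in the paper.
import numpy as np

def log_barrier_step(x, grad_f, g, jac_g, eta=0.1, lr=1e-2):
    gx = g(x)                                   # constraint values, need g_i(x) < 0
    assert np.all(gx < 0), "iterate must stay strictly feasible"
    # Gradient of the barrier objective: grad f + eta * sum_i grad g_i / (-g_i)
    d = grad_f(x) + eta * jac_g(x).T @ (1.0 / -gx)
    # Cap the step so the next iterate does not overshoot the boundary.
    step = min(lr, 0.5 * float(np.min(-gx)) / (np.linalg.norm(d) + 1e-12))
    return x - step * d

# Example: minimize ||x - c||^2 inside the unit ball {x : ||x||^2 - 1 <= 0}.
c = np.array([2.0, 0.0])
x = np.zeros(2)
for _ in range(500):
    x = log_barrier_step(
        x,
        grad_f=lambda z: 2 * (z - c),
        g=lambda z: np.array([z @ z - 1.0]),
        jac_g=lambda z: (2 * z)[None, :],
    )
print(x)  # settles near the constrained minimizer, slightly inside the ball
```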
- Learning the Dynamics of Autonomous Linear Systems From Multiple Trajectories [2.2268031040603447]
Existing results on the learning rate and consistency of autonomous linear system identification rely on observations of steady-state behaviors from a single long trajectory.
We consider the scenario of learning system dynamics from multiple short trajectories, where there are no easily observed steady-state behaviors.
We show that one can adjust the length of the trajectories to achieve a learning rate of $\mathcal{O}(\sqrt{\frac{\log N}{N}})$ for strictly stable systems and a learning rate of $\mathcal{O}(\frac{(\log N)^d}{\sqrt{N}})$ for marginally stable systems, where $N$ is the number of trajectories.
arXiv Detail & Related papers (2022-03-24T01:29:53Z)
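The estimator behind results of this kind is least squares pooled across the short trajectories; the sketch below is a minimal illustration with hypothetical dimensions and noise levels, not the paper's analysis.
```python
# Minimal sketch: estimate A in x_{t+1} = A x_t + w_t by pooling least
# squares across N short trajectories of length L (all values hypothetical).
import numpy as np

rng = np.random.default_rng(1)
n, N, L = 3, 200, 5
A_true = 0.3 * rng.standard_normal((n, n))      # a (likely) strictly stable example

X, Y = [], []                                    # regressors and one-step targets
for _ in range(N):
    x = rng.standard_normal(n)                   # fresh initial condition per trajectory
    for _ in range(L):
        x_next = A_true @ x + 0.01 * rng.standard_normal(n)
        X.append(x); Y.append(x_next)
        x = x_next

X, Y = np.array(X), np.array(Y)
A_hat = np.linalg.lstsq(X, Y, rcond=None)[0].T   # solves Y ≈ X @ A_hat.T
print("estimation error:", np.linalg.norm(A_hat - A_true))
```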
- Safe Adaptive Learning-based Control for Constrained Linear Quadratic Regulators with Regret Guarantees [11.627320138064684]
We study the adaptive control of an unknown linear system with a quadratic cost function subject to safety constraints on both the states and actions.
Our algorithm is implemented on a single trajectory and does not require system restarts.
arXiv Detail & Related papers (2021-10-31T05:52:42Z)
- Safe Reinforcement Learning with Linear Function Approximation [48.75026009895308]
We introduce safety as an unknown linear cost function of states and actions, which must always fall below a certain threshold.
We then present algorithms, termed SLUCB-QVI and RSLUCB-QVI, for episodic Markov decision processes (MDPs) with linear function approximation.
We show that SLUCB-QVI and RSLUCB-QVI, while incurring no safety violation, achieve a $\tilde{\mathcal{O}}\left(\kappa\sqrt{d^3H^3T}\right)$ regret, nearly matching that of state-of-the-art unsafe algorithms.
arXiv Detail & Related papers (2021-06-11T08:46:57Z)
- Safely Learning Dynamical Systems from Short Trajectories [12.184674552836414]
A fundamental challenge in learning to control an unknown dynamical system is to reduce model uncertainty by making measurements while maintaining safety.
We formulate a mathematical definition of what it means to safely learn a dynamical system by sequentially deciding where to initialize the next trajectory.
We present a linear programming-based algorithm that either safely recovers the true dynamics from trajectories of length one, or certifies that safe learning is impossible.
arXiv Detail & Related papers (2020-11-24T18:06:10Z)
- Reinforcement Learning with Fast Stabilization in Linear Dynamical Systems [91.43582419264763]
We study model-based reinforcement learning (RL) in unknown stabilizable linear dynamical systems.
We propose an algorithm that certifies fast stabilization of the underlying system by effectively exploring the environment.
We show that the proposed algorithm attains $\tilde{\mathcal{O}}(\sqrt{T})$ regret after $T$ time steps of agent-environment interaction.
arXiv Detail & Related papers (2020-07-23T23:06:40Z)
- Sparse Identification of Nonlinear Dynamical Systems via Reweighted $\ell_1$-regularized Least Squares [62.997667081978825]
This work proposes an iterative sparse-regularized regression method to recover governing equations of nonlinear systems from noisy state measurements.
The aim of this work is to improve the accuracy and robustness of the method in the presence of state measurement noise.
arXiv Detail & Related papers (2020-05-27T08:30:15Z)
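A standard way to realize such a scheme is iteratively reweighted $\ell_1$ minimization over a library of candidate features: solve a weighted lasso, reweight each penalty by the inverse magnitude of the current coefficient, and repeat. The sketch below uses a hypothetical random feature library and is a stand-in for, not a reproduction of, the paper's method.
```python
# Minimal sketch of reweighted l1-regularized least squares: a weighted
# lasso whose weights shrink toward the support of the previous solution.
# Library, data, and regularization weight are all hypothetical.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(2)
m, p = 100, 10
Phi = rng.standard_normal((m, p))                   # candidate feature library
c_true = np.zeros(p); c_true[[1, 4]] = [1.5, -2.0]  # sparse ground-truth model
y = Phi @ c_true + 0.01 * rng.standard_normal(m)    # noisy state measurements

w = np.ones(p)
for _ in range(5):                                   # reweighting iterations
    c = cp.Variable(p)
    objective = cp.sum_squares(Phi @ c - y) + 0.1 * cp.norm1(cp.multiply(w, c))
    cp.Problem(cp.Minimize(objective)).solve()
    w = 1.0 / (np.abs(c.value) + 1e-3)               # penalize small coefficients more
print(np.round(c.value, 3))                          # nonzeros should sit at indices 1 and 4
```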
- Chance-Constrained Trajectory Optimization for Safe Exploration and Learning of Nonlinear Systems [81.7983463275447]
Learning-based control algorithms require data collection with abundant supervision for training.
We present a new approach for optimal motion planning with safe exploration that integrates chance-constrained optimal control with dynamics learning and feedback control.
arXiv Detail & Related papers (2020-05-09T05:57:43Z)
- Non-asymptotic and Accurate Learning of Nonlinear Dynamical Systems [34.394552166070746]
We study gradient-based algorithms to learn the system dynamics $\theta$ from samples obtained from a single finite trajectory.
Unlike existing work, our bounds are noise sensitive which allows for learning ground-truth dynamics with high accuracy and small sample complexity.
arXiv Detail & Related papers (2020-02-20T02:36:44Z)
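To make the single-trajectory setting concrete, the sketch below runs stochastic gradient descent on the one-step prediction loss; a linear model is used as a hypothetical stand-in for the nonlinear systems the paper actually treats.
```python
# Minimal sketch: SGD on the one-step loss ||x_{t+1} - Theta x_t||^2 along
# one finite trajectory. Linear dynamics stand in for the paper's setting.
import numpy as np

rng = np.random.default_rng(3)
n, T = 3, 500
Theta_true = 0.3 * rng.standard_normal((n, n))   # hypothetical (likely stable) dynamics

xs = [rng.standard_normal(n)]
for _ in range(T):                                # simulate the single trajectory
    xs.append(Theta_true @ xs[-1] + 0.1 * rng.standard_normal(n))

Theta = np.zeros((n, n))
for _ in range(50):                               # passes over the same trajectory
    for x_t, x_next in zip(xs[:-1], xs[1:]):
        err = Theta @ x_t - x_next                # one-step prediction error
        Theta -= 0.05 * np.outer(err, x_t)        # SGD update on the squared loss
print("estimation error:", np.linalg.norm(Theta - Theta_true))
```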