Related papers: Chance-Constrained Trajectory Optimization for Safe Exploration and Learning of Nonlinear Systems

URL: http://arxiv.org/abs/2005.04374v3
Date: Tue, 27 Oct 2020 19:46:23 GMT
Title: Chance-Constrained Trajectory Optimization for Safe Exploration and Learning of Nonlinear Systems
Authors: Yashwanth Kumar Nakka, Anqi Liu, Guanya Shi, Anima Anandkumar, Yisong Yue, and Soon-Jo Chung
Abstract summary: Learning-based control algorithms require data collection with abundant supervision for training. We present a new approach for optimal motion planning with safe exploration that integrates chance-constrained optimal control with dynamics learning and feedback control.
Score: 81.7983463275447
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Learning-based control algorithms require data collection with abundant supervision for training. Safe exploration algorithms ensure the safety of this data collection process even when only partial knowledge is available. We present a new approach for optimal motion planning with safe exploration that integrates chance-constrained stochastic optimal control with dynamics learning and feedback control. We derive an iterative convex optimization algorithm that solves an Information-cost Stochastic Nonlinear Optimal Control problem (Info-SNOC). The optimization objective encodes control cost for performance and exploration cost for learning, and safety is incorporated as distributionally robust chance constraints. The dynamics are predicted from a robust regression model that is learned from data. The Info-SNOC algorithm is used to compute a sub-optimal pool of safe motion plans that aid in exploration for learning unknown residual dynamics under safety constraints. A stable feedback controller is used to execute the motion plan and collect data for model learning. We prove the safety of rollout from our exploration method and reduction in uncertainty over epochs, thereby guaranteeing the consistency of our learning method. We validate the effectiveness of Info-SNOC by designing and implementing a pool of safe trajectories for a planar robot. We demonstrate that our approach has a higher success rate in ensuring safety than a deterministic trajectory optimization approach.
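Illustrative note: the abstract does not say how the distributionally robust chance constraints are evaluated. As a rough sketch only, the snippet below uses a standard textbook result for a single linear constraint a·x + b <= 0 when only the mean and covariance of x are known: the constraint holds with probability at least 1 - eps under every such distribution exactly when a·mean + b + sqrt((1 - eps)/eps) * sqrt(aᵀ·cov·a) <= 0. The function names, the linear constraint form, and the numbers are assumptions made for illustration and are not taken from the paper.

```python
import math


def dr_tightened_value(a, b, mean, cov, eps):
    """Left-hand side of the deterministic surrogate for the distributionally
    robust chance constraint P(a.x + b <= 0) >= 1 - eps, where only the mean
    and covariance of x are assumed known (hypothetical helper)."""
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    n = len(a)
    # Nominal (deterministic) constraint value at the mean state.
    nominal = sum(a[i] * mean[i] for i in range(n)) + b
    # Variance of the scalar a.x, i.e. a^T cov a.
    variance = sum(a[i] * cov[i][j] * a[j] for i in range(n) for j in range(n))
    # Safety margin multiplier from the one-sided Chebyshev (Cantelli) bound.
    kappa = math.sqrt((1.0 - eps) / eps)
    return nominal + kappa * math.sqrt(max(variance, 0.0))


def is_dr_safe(a, b, mean, cov, eps):
    """True when the tightened deterministic constraint is satisfied."""
    return dr_tightened_value(a, b, mean, cov, eps) <= 0.0


# Hypothetical example: keep the first coordinate below 2 (x1 - 2 <= 0)
# with at most 5% violation probability.
a, b = [1.0, 0.0], -2.0
cov = [[0.04, 0.0], [0.0, 0.04]]
print(is_dr_safe(a, b, [0.0, 0.0], cov, 0.05))  # mean far from the limit
print(is_dr_safe(a, b, [1.5, 0.0], cov, 0.05))  # mean inside, but margin too small
```

In the second call the mean satisfies the constraint deterministically, yet the tightened check rejects it, which is the kind of gap between deterministic and chance-constrained planning that the abstract alludes to.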

Related papers

Offline Safe Reinforcement Learning Using Trajectory Classification [21.956407710821416]
We learn a policy that generates desirable trajectories and avoids undesirable trajectories. We extensively evaluate our method using the DSRL benchmark for offline safe RL.
arXiv Detail & Related papers (2024-12-19T22:29:03Z)
Safe and Stable Closed-Loop Learning for Neural-Network-Supported Model Predictive Control [0.0]
We consider safe learning of parametrized predictive controllers that operate with incomplete information about the underlying process. Our method focuses on the system's overall long-term performance in closed-loop while keeping it safe and stable. We explicitly incorporated stability information in the Bayesian-optimization-based learning procedure, thereby achieving rigorous probabilistic safety guarantees.
arXiv Detail & Related papers (2024-09-16T11:03:58Z)
Evaluating Model-free Reinforcement Learning toward Safety-critical Tasks [70.76757529955577]
This paper revisits prior work in this scope from the perspective of state-wise safe RL. We propose Unrolling Safety Layer (USL), a joint method that combines safety optimization and safety projection. To facilitate further research in this area, we reproduce related algorithms in a unified pipeline and incorporate them into SafeRL-Kit.
arXiv Detail & Related papers (2022-12-12T06:30:17Z)
Model-based Safe Deep Reinforcement Learning via a Constrained Proximal Policy Optimization Algorithm [4.128216503196621]
We propose an On-policy Model-based Safe Deep RL algorithm in which we learn the transition dynamics of the environment in an online manner. We show that our algorithm is more sample efficient and results in lower cumulative hazard violations as compared to constrained model-free approaches.
arXiv Detail & Related papers (2022-10-14T06:53:02Z)
Meta-Learning Priors for Safe Bayesian Optimization [72.8349503901712]
We build on a meta-learning algorithm, F-PACOH, capable of providing reliable uncertainty quantification in settings of data scarcity. As our core contribution, we develop a novel framework for choosing safety-compliant priors in a data-driven manner. On benchmark functions and a high-precision motion system, we demonstrate that our meta-learned priors accelerate the convergence of safe BO approaches.
arXiv Detail & Related papers (2022-10-03T08:38:38Z)
Recursively Feasible Probabilistic Safe Online Learning with Control Barrier Functions [60.26921219698514]
We introduce a model-uncertainty-aware reformulation of CBF-based safety-critical controllers. We then present the pointwise feasibility conditions of the resulting safety controller. We use these conditions to devise an event-triggered online data collection strategy.
arXiv Detail & Related papers (2022-08-23T05:02:09Z)
Sample-efficient Safe Learning for Online Nonlinear Control with Control Barrier Functions [35.9713619595494]
Reinforcement Learning and continuous nonlinear control have been successfully deployed in multiple domains of complicated sequential decision-making tasks. Given the exploratory nature of the learning process and the presence of model uncertainty, it is challenging to apply them to safety-critical control tasks. We propose a provably efficient episodic safe learning framework for online control tasks.
arXiv Detail & Related papers (2022-07-29T00:54:35Z)
Log Barriers for Safe Black-box Optimization with Application to Safe Reinforcement Learning [72.97229770329214]
We introduce a general approach for solving high-dimensional non-linear optimization problems in which maintaining safety during learning is crucial. Our approach, called LBSGD, is based on applying a logarithmic barrier approximation with a carefully chosen step size. We demonstrate the effectiveness of our approach on minimizing violation in policy tasks in safe reinforcement learning.
arXiv Detail & Related papers (2022-07-21T11:14:47Z)
Learning Control Barrier Functions from Expert Demonstrations [69.23675822701357]
We propose a learning-based approach to safe controller synthesis based on control barrier functions (CBFs). We analyze an optimization-based approach to learning a CBF that enjoys provable safety guarantees under suitable Lipschitz assumptions on the underlying dynamical system. To the best of our knowledge, these are the first results that learn provably safe control barrier functions from data.
arXiv Detail & Related papers (2020-04-07T12:29:06Z)
Neural Lyapunov Model Predictive Control: Learning Safe Global Controllers from Sub-optimal Examples [4.777323087050061]
In many real-world and industrial applications, it is typical to have an existing control strategy, for instance, execution from a human operator. The objective of this work is to improve upon this unknown, safe but suboptimal policy by learning a new controller that retains safety and stability. The proposed algorithm alternately learns the terminal cost and updates the MPC parameters according to a stability metric.
arXiv Detail & Related papers (2020-02-21T16:57:38Z)

This list is automatically generated from the titles and abstracts of the papers on this site.