Gaussian Process Constraint Learning for Scalable Chance-Constrained Motion Planning from Demonstrations
- URL: http://arxiv.org/abs/2112.04612v1
- Date: Wed, 8 Dec 2021 22:47:58 GMT
- Title: Gaussian Process Constraint Learning for Scalable Chance-Constrained Motion Planning from Demonstrations
- Authors: Glen Chou, Hao Wang, Dmitry Berenson
- Abstract summary: We propose a method for learning constraints represented as Gaussian processes (GPs) from locally-optimal demonstrations.
We demonstrate our method can learn complex, nonlinear constraints demonstrated on a 5D nonholonomic car, a 12D quadrotor, and a 3-link planar arm.
Our results suggest the learned GP constraint is accurate, outperforming previous constraint learning methods that require more a priori knowledge.
- Score: 7.079021327958753
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a method for learning constraints represented as Gaussian
processes (GPs) from locally-optimal demonstrations. Our approach uses the
Karush-Kuhn-Tucker (KKT) optimality conditions to determine where on the
demonstrations the constraint is tight, and a scaling of the constraint
gradient at those states. We then train a GP representation of the constraint
which is consistent with and which generalizes this information. We further
show that the GP uncertainty can be used within a kinodynamic RRT to plan
probabilistically-safe trajectories, and that we can exploit the GP structure
within the planner to exactly achieve a specified safety probability. We
demonstrate our method can learn complex, nonlinear constraints demonstrated on
a 5D nonholonomic car, a 12D quadrotor, and a 3-link planar arm, all while
requiring minimal prior information on the constraint. Our results suggest the
learned GP constraint is accurate, outperforming previous constraint learning
methods that require more a priori knowledge.
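As a concrete illustration of the chance-constrained planning step described above, the sketch below checks whether a single state satisfies P(g(x) <= 0) >= safety_prob under a Gaussian-process posterior over an unknown constraint function g; a kinodynamic RRT could apply such a check to candidate states or edges. This is a minimal sketch under assumed conventions (g(x) <= 0 means "safe", an RBF kernel, and hypothetical names such as gp_posterior and chance_constraint_satisfied); it is not the paper's exact formulation or its exact-probability machinery.

    import numpy as np
    from scipy.stats import norm

    def gp_posterior(x_query, X_train, y_train,
                     lengthscale=1.0, signal_var=1.0, noise_var=1e-4):
        """GP regression posterior with an RBF kernel (illustrative, not the paper's exact model)."""
        def k(A, B):
            d2 = (np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :]
                  - 2.0 * A @ B.T)
            return signal_var * np.exp(-0.5 * d2 / lengthscale**2)

        K = k(X_train, X_train) + noise_var * np.eye(len(X_train))
        k_s = k(x_query[None, :], X_train)                      # cross-covariance, shape (1, n)
        mean = (k_s @ np.linalg.solve(K, y_train)).item()
        var = (signal_var - k_s @ np.linalg.solve(K, k_s.T)).item()
        return mean, max(var, 1e-12)

    def chance_constraint_satisfied(x, X_train, y_train, safety_prob=0.95):
        """Check P(g(x) <= 0) >= safety_prob under the GP posterior on g.

        Assumed convention: g(x) <= 0 means "safe". For a Gaussian posterior
        N(mu, sigma^2), the check is equivalent to mu + Phi^{-1}(safety_prob) * sigma <= 0.
        """
        mu, var = gp_posterior(x, X_train, y_train)
        return mu + norm.ppf(safety_prob) * np.sqrt(var) <= 0.0

    # Toy usage: hypothetical constraint observations extracted from demonstrations
    # (negative values = safe states, positive = unsafe states).
    X_train = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
    y_train = np.array([-1.0, -0.5, -0.5, 1.0])
    print(chance_constraint_satisfied(np.array([0.2, 0.2]), X_train, y_train))  # near safe data
    print(chance_constraint_satisfied(np.array([2.0, 1.9]), X_train, y_train))  # near unsafe data

The paper further exploits the GP structure within the planner to meet a specified safety probability exactly over the plan; the per-state threshold test above is only the simplest version of that idea.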
Related papers
- Learning Optimal Deterministic Policies with Stochastic Policy Gradients [62.81324245896716]
Policy gradient (PG) methods are successful approaches for dealing with continuous reinforcement learning (RL) problems.
In common practice, convergent (hyper)policies are learned only to deploy their deterministic version.
We show how to tune the exploration level used for learning to optimize the trade-off between the sample complexity and the performance of the deployed deterministic policy.
arXiv Detail & Related papers (2024-05-03T16:45:15Z)
- Learning Robust Output Control Barrier Functions from Safe Expert Demonstrations [50.37808220291108]
This paper addresses learning safe output feedback control laws from partial observations of expert demonstrations.
We first propose robust output control barrier functions (ROCBFs) as a means to guarantee safety.
We then formulate an optimization problem to learn ROCBFs from expert demonstrations that exhibit safe system behavior.
arXiv Detail & Related papers (2021-11-18T23:21:00Z)
- Model-based Safe Reinforcement Learning using Generalized Control Barrier Function [6.556257209888797]
This paper proposes a model-based feasibility enhancement technique for constrained RL.
By using the model information, the policy can be optimized safely without violating actual safety constraints.
The proposed method achieves up to four times fewer constraint violations and converges 3.36 times faster than baseline constrained RL approaches.
arXiv Detail & Related papers (2021-03-02T08:17:38Z)
- Separated Proportional-Integral Lagrangian for Chance Constrained Reinforcement Learning [6.600423613245076]
Safety is essential for reinforcement learning applied in real-world tasks like autonomous driving.
Chance constraints, which guarantee satisfaction of state constraints with high probability, are suitable for representing such requirements.
Existing chance constrained RL methods like the penalty method and the Lagrangian method either exhibit periodic oscillations or cannot satisfy the constraints.
arXiv Detail & Related papers (2021-02-17T02:40:01Z)
- Uncertainty-Aware Constraint Learning for Adaptive Safe Motion Planning from Demonstrations [6.950510860295866]
We present a method for learning to satisfy uncertain constraints from demonstrations.
Our method uses robust optimization to obtain a belief over the potentially infinite set of possible constraints consistent with the demonstrations.
We derive guarantees on the accuracy of our constraint belief and probabilistic guarantees on plan safety.
arXiv Detail & Related papers (2020-11-09T01:59:14Z)
- Reinforcement Learning for Low-Thrust Trajectory Design of Interplanetary Missions [77.34726150561087]
This paper investigates the use of reinforcement learning for the robust design of interplanetary trajectories in the presence of severe disturbances.
An open-source implementation of the state-of-the-art algorithm Proximal Policy Optimization is adopted.
The resulting Guidance and Control Network provides both a robust nominal trajectory and the associated closed-loop guidance law.
arXiv Detail & Related papers (2020-08-19T15:22:15Z)
- Responsive Safety in Reinforcement Learning by PID Lagrangian Methods [74.49173841304474]
Lagrangian methods exhibit oscillations and overshoot which, when applied to safe reinforcement learning, lead to constraint-violating behavior.
We propose a novel Lagrange multiplier update method that utilizes derivatives of the constraint function (a minimal sketch of this style of update appears after this list).
We apply our PID Lagrangian methods in deep RL, setting a new state of the art in Safety Gym, a safe RL benchmark.
arXiv Detail & Related papers (2020-07-08T08:43:14Z)
- Chance-Constrained Trajectory Optimization for Safe Exploration and Learning of Nonlinear Systems [81.7983463275447]
Learning-based control algorithms require data collection with abundant supervision for training.
We present a new approach for optimal motion planning with safe exploration that integrates chance-constrained optimal control with dynamics learning and feedback control.
arXiv Detail & Related papers (2020-05-09T05:57:43Z)
- Learning Control Barrier Functions from Expert Demonstrations [69.23675822701357]
We propose a learning-based approach to safe controller synthesis built on control barrier functions (CBFs).
We analyze an optimization-based approach to learning a CBF that enjoys provable safety guarantees under suitable Lipschitz assumptions on the underlying dynamical system.
To the best of our knowledge, these are the first results that learn provably safe control barrier functions from data.
arXiv Detail & Related papers (2020-04-07T12:29:06Z)
- Teaching the Old Dog New Tricks: Supervised Learning with Constraints [18.88930622054883]
Adding constraint support in Machine Learning has the potential to address outstanding issues in data-driven AI systems.
Existing approaches typically apply constrained optimization techniques to ML training, enforce constraint satisfaction by adjusting the model design, or use constraints to correct the output.
Here, we investigate a different, complementary, strategy based on "teaching" constraint satisfaction to a supervised ML method via the direct use of a state-of-the-art constraint solver.
arXiv Detail & Related papers (2020-02-25T09:47:39Z)
- Learning Constraints from Locally-Optimal Demonstrations under Cost Function Uncertainty [6.950510860295866]
We present an algorithm for learning parametric constraints from locally-optimal demonstrations, where the cost function being optimized is uncertain to the learner.
Our method uses the Karush-Kuhn-Tucker (KKT) optimality conditions of the demonstrations within a mixed integer linear program (MILP) to learn constraints which are consistent with the local optimality of the demonstrations.
We evaluate our method on high-dimensional constraints and systems by learning constraints for 7-DOF arm and quadrotor examples, showing that it outperforms competing constraint-learning approaches and can be effectively used to plan new constraint-satisfying trajectories in the environment.
arXiv Detail & Related papers (2020-01-25T15:57:48Z)
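The last entry above, like the main paper, extracts constraint information from the Karush-Kuhn-Tucker (KKT) conditions of locally-optimal demonstrations. The sketch below illustrates the core idea at a single demonstrated state with a known cost and no dynamics coupling: if the cost gradient does not vanish, stationarity with a nonnegative multiplier implies the unknown constraint is tight there and the negative cost gradient is a scaled constraint gradient. The function and variable names are hypothetical, and the sketch omits the dynamics multipliers, multiple time steps, and the MILP/GP machinery of the actual methods.

    import numpy as np

    def kkt_constraint_info(x_demo, grad_cost, tol=1e-3):
        """Illustrative KKT check at one locally-optimal demonstrated state.

        With a known cost c and an unknown constraint g(x) <= 0, stationarity at a
        local optimum requires grad_c(x) + lam * grad_g(x) = 0 with lam >= 0, and
        complementary slackness requires lam = 0 whenever g(x) < 0. So:
          - if grad_c(x) is (near) zero, no constraint needs to be active at x;
          - otherwise g must be tight at x, and lam * grad_g(x) = -grad_c(x),
            i.e. the negative cost gradient is a *scaled* constraint gradient.
        """
        gc = grad_cost(x_demo)
        if np.linalg.norm(gc) < tol:
            return {"tight": False, "scaled_grad_g": None}
        return {"tight": True, "scaled_grad_g": -gc}

    # Toy usage with a hypothetical quadratic cost whose unconstrained minimum is the origin.
    grad_cost = lambda x: 2.0 * x
    print(kkt_constraint_info(np.array([0.0, 0.0]), grad_cost))  # interior point: not tight
    print(kkt_constraint_info(np.array([1.0, 0.5]), grad_cost))  # pushed off the minimum: tight

The main paper fits a GP to be consistent with exactly this tightness and scaled-gradient information, while the MILP-based method in the last entry instead searches for parametric constraints consistent with it.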
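The PID Lagrangian entry above proposes updating the Lagrange multiplier with proportional, integral, and derivative terms of the constraint violation rather than with a pure integral (gradient-ascent) rule. The class below is a minimal sketch of that general idea; the gain values, the constraint limit, and the exact form of the terms are illustrative assumptions, not the cited paper's rule.

    class PIDLagrangeMultiplier:
        """Sketch of a PID-style Lagrange multiplier update for safe RL (illustrative only)."""

        def __init__(self, kp=0.1, ki=0.01, kd=0.1, limit=25.0):
            self.kp, self.ki, self.kd, self.limit = kp, ki, kd, limit
            self.integral = 0.0
            self.prev_cost = None

        def update(self, constraint_cost):
            """Return the multiplier to use for the next policy update."""
            error = constraint_cost - self.limit                      # current violation (P term)
            self.integral = max(0.0, self.integral + error)           # accumulated violation (I term)
            rise = (0.0 if self.prev_cost is None
                    else max(0.0, constraint_cost - self.prev_cost))  # rising violation (D term)
            self.prev_cost = constraint_cost
            # Combine the three terms and project onto lambda >= 0.
            return max(0.0, self.kp * error + self.ki * self.integral + self.kd * rise)

    # Toy usage: the multiplier grows while the measured constraint cost exceeds the limit.
    pid = PIDLagrangeMultiplier()
    for cost in [30.0, 40.0, 35.0, 20.0]:
        print(pid.update(cost))

Intuition: the proportional and derivative terms react to current and rising violations before the integral term accumulates, which is what damps the oscillation and overshoot noted in that entry.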