Learning Constraints from Locally-Optimal Demonstrations under Cost
Function Uncertainty
- URL: http://arxiv.org/abs/2001.09336v1
- Date: Sat, 25 Jan 2020 15:57:48 GMT
- Title: Learning Constraints from Locally-Optimal Demonstrations under Cost
Function Uncertainty
- Authors: Glen Chou, Necmiye Ozay, Dmitry Berenson
- Abstract summary: We present an algorithm for learning parametric constraints from locally-optimal demonstrations, where the cost function being optimized is uncertain to the learner.
Our method uses the Karush-Kuhn-Tucker (KKT) optimality conditions of the demonstrations within a mixed integer linear program (MILP) to learn constraints which are consistent with the local optimality of the demonstrations.
We evaluate our method on high-dimensional constraints and systems by learning constraints for 7-DOF arm and quadrotor examples, show that it outperforms competing constraint-learning approaches, and show that it can be effectively used to plan new constraint-satisfying trajectories in the environment.
- Score: 6.950510860295866
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present an algorithm for learning parametric constraints from
locally-optimal demonstrations, where the cost function being optimized is
uncertain to the learner. Our method uses the Karush-Kuhn-Tucker (KKT)
optimality conditions of the demonstrations within a mixed integer linear
program (MILP) to learn constraints which are consistent with the local
optimality of the demonstrations, by either using a known constraint
parameterization or by incrementally growing a parameterization that is
consistent with the demonstrations. We provide theoretical guarantees on the
conservativeness of the recovered safe/unsafe sets and analyze the limits of
constraint learnability when using locally-optimal demonstrations. We evaluate
our method on high-dimensional constraints and systems by learning constraints
for 7-DOF arm and quadrotor examples, show that it outperforms competing
constraint-learning approaches, and can be effectively used to plan new
constraint-satisfying trajectories in the environment.
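As a rough illustration of the idea described in the abstract, the sketch below checks whether a candidate linear constraint is consistent with the local optimality of a single demonstrated point by evaluating the KKT conditions. This is not the authors' MILP formulation: the function name `kkt_residual`, the quadratic toy cost, and the two candidate constraints are all assumptions made for the example, and the cost is taken as known here, whereas the paper treats it as uncertain.

```python
import numpy as np

def kkt_residual(x, grad_cost, a, b, tol=1e-8):
    """Check KKT conditions at x for the candidate constraint a @ x <= b.

    Returns (lam, residual): the multiplier and the norm of the
    stationarity residual grad_cost + lam * a. A residual near zero
    means the candidate can explain x as a local optimum.
    Hypothetical helper, for illustration only.
    """
    slack = b - a @ x
    if slack < -tol:
        # Primal feasibility fails: the demonstration violates the candidate.
        return None, float("inf")
    if slack > tol:
        # Constraint inactive: complementary slackness forces lam = 0.
        lam = 0.0
    else:
        # Constraint active: pick lam >= 0 minimizing ||grad_cost + lam * a||.
        lam = max(0.0, float(-(a @ grad_cost) / (a @ a)))
    residual = float(np.linalg.norm(grad_cost + lam * a))
    return lam, residual

# Toy demonstration (assumed): minimize ||x - goal||^2 subject to x[0] <= 1.
goal = np.array([2.0, 0.0])
x_demo = np.array([1.0, 0.0])        # optimum of the toy problem
grad = 2.0 * (x_demo - goal)         # cost gradient at the demonstration

# Candidate matching the true constraint x[0] <= 1.
lam_true, res_true = kkt_residual(x_demo, grad, np.array([1.0, 0.0]), 1.0)

# A wrong candidate x[1] <= 0: active at x_demo but cannot cancel the gradient.
lam_bad, res_bad = kkt_residual(x_demo, grad, np.array([0.0, 1.0]), 0.0)

print(lam_true, res_true)
print(lam_bad, res_bad)
```

In this toy setting the first candidate satisfies stationarity with a nonnegative multiplier, while the second leaves a nonzero residual and would be rejected. The paper instead searches over constraint parameters and multipliers jointly, which is where the mixed integer program comes in.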
Related papers
- Stable Inverse Reinforcement Learning: Policies from Control Lyapunov Landscapes [4.229902091180109]
We propose a novel, stability-certified IRL approach to learning control Lyapunov functions from demonstration data.
By exploiting closed-form expressions for associated control policies, we are able to efficiently search the space of CLFs.
We present a theoretical analysis of the optimality properties provided by the CLF and evaluate our approach using both simulated and real-world data.
arXiv Detail & Related papers (2024-05-14T16:40:45Z) - OTClean: Data Cleaning for Conditional Independence Violations using
Optimal Transport [51.6416022358349]
OTClean is a framework that harnesses optimal transport theory for data repair under Conditional Independence (CI) constraints.
We develop an iterative algorithm inspired by Sinkhorn's matrix scaling algorithm, which efficiently addresses high-dimensional and large-scale data.
arXiv Detail & Related papers (2024-03-04T18:23:55Z) - Online Constraint Tightening in Stochastic Model Predictive Control: A
Regression Approach [49.056933332667114]
No analytical solutions exist for chance-constrained optimal control problems.
We propose a data-driven approach for learning the constraint-tightening parameters online during control.
Our approach yields constraint-tightening parameters that tightly satisfy the chance constraints.
arXiv Detail & Related papers (2023-10-04T16:22:02Z) - Log Barriers for Safe Black-box Optimization with Application to Safe
Reinforcement Learning [72.97229770329214]
We introduce a general approach for solving high-dimensional non-linear optimization problems in which maintaining safety during learning is crucial.
Our approach, called LBSGD, is based on applying a logarithmic barrier approximation with a carefully chosen step size.
We demonstrate the effectiveness of our approach on minimizing violation in policy tasks in safe reinforcement learning.
arXiv Detail & Related papers (2022-07-21T11:14:47Z) - Gaussian Process Constraint Learning for Scalable Chance-Constrained
Motion Planning from Demonstrations [7.079021327958753]
We propose a method for learning constraints represented as Gaussian processes (GPs) from locally-optimal demonstrations.
We demonstrate our method can learn complex, nonlinear constraints demonstrated on a 5D nonholonomic car, a 12D quadrotor, and a 3-link planar arm.
Our results suggest the learned GP constraint is accurate, outperforming previous constraint learning methods that require more a priori knowledge.
arXiv Detail & Related papers (2021-12-08T22:47:58Z) - Learning Robust Output Control Barrier Functions from Safe Expert Demonstrations [50.37808220291108]
This paper addresses learning safe output feedback control laws from partial observations of expert demonstrations.
We first propose robust output control barrier functions (ROCBFs) as a means to guarantee safety.
We then formulate an optimization problem to learn ROCBFs from expert demonstrations that exhibit safe system behavior.
arXiv Detail & Related papers (2021-11-18T23:21:00Z) - Uncertainty-Aware Constraint Learning for Adaptive Safe Motion Planning
from Demonstrations [6.950510860295866]
We present a method for learning to satisfy uncertain constraints from demonstrations.
Our method uses robust optimization to obtain a belief over the potentially infinite set of possible constraints consistent with the demonstrations.
We derive guarantees on the accuracy of our constraint belief and probabilistic guarantees on plan safety.
arXiv Detail & Related papers (2020-11-09T01:59:14Z) - Constrained Model-based Reinforcement Learning with Robust Cross-Entropy
Method [30.407700996710023]
This paper studies the constrained/safe reinforcement learning problem with sparse indicator signals for constraint violations.
We employ the neural network ensemble model to estimate the prediction uncertainty and use model predictive control as the basic control framework.
The results show that our approach learns to complete the tasks with a much smaller number of constraint violations than state-of-the-art baselines.
arXiv Detail & Related papers (2020-10-15T18:19:35Z) - Responsive Safety in Reinforcement Learning by PID Lagrangian Methods [74.49173841304474]
Lagrangian methods exhibit oscillations and overshoot which, when applied to safe reinforcement learning, lead to constraint-violating behavior.
We propose a novel Lagrange multiplier update method that utilizes derivatives of the constraint function.
We apply our PID Lagrangian methods in deep RL, setting a new state of the art in Safety Gym, a safe RL benchmark.
arXiv Detail & Related papers (2020-07-08T08:43:14Z) - Learning Control Barrier Functions from Expert Demonstrations [69.23675822701357]
We propose a learning-based approach to safe controller synthesis based on control barrier functions (CBFs).
We analyze an optimization-based approach to learning a CBF that enjoys provable safety guarantees under suitable Lipschitz assumptions on the underlying dynamical system.
To the best of our knowledge, these are the first results that learn provably safe control barrier functions from data.
arXiv Detail & Related papers (2020-04-07T12:29:06Z) - Teaching the Old Dog New Tricks: Supervised Learning with Constraints [18.88930622054883]
Adding constraint support in Machine Learning has the potential to address outstanding issues in data-driven AI systems.
Existing approaches typically apply constrained optimization techniques to ML training, enforce constraint satisfaction by adjusting the model design, or use constraints to correct the output.
Here, we investigate a different, complementary, strategy based on "teaching" constraint satisfaction to a supervised ML method via the direct use of a state-of-the-art constraint solver.
arXiv Detail & Related papers (2020-02-25T09:47:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.