Robot Learning with Crash Constraints
- URL: http://arxiv.org/abs/2010.08669v3
- Date: Thu, 28 Jan 2021 00:34:40 GMT
- Title: Robot Learning with Crash Constraints
- Authors: Alonso Marco, Dominik Baumann, Majid Khadiv, Philipp Hennig, Ludovic Righetti, Sebastian Trimpe
- Abstract summary: In robot applications where failing is undesired but not catastrophic, many algorithms struggle with leveraging data obtained from failures.
This is usually caused by (i) the failed experiment ending prematurely, or (ii) the acquired data being scarce or corrupted.
We consider failing behaviors as those that violate a constraint and address the problem of learning with crash constraints.
- Score: 37.685515446816105
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the past decade, numerous machine learning algorithms have been shown to
successfully learn optimal policies to control real robotic systems. However,
it is common to encounter failing behaviors as the learning loop progresses.
Specifically, in robot applications where failing is undesired but not
catastrophic, many algorithms struggle with leveraging data obtained from
failures. This is usually caused by (i) the failed experiment ending
prematurely, or (ii) the acquired data being scarce or corrupted. Both
complicate the design of proper reward functions to penalize failures. In this
paper, we propose a framework that addresses those issues. We consider failing
behaviors as those that violate a constraint and address the problem of
learning with crash constraints, where no data is obtained upon constraint
violation. The no-data case is addressed by a novel GP model (GPCR) for the
constraint that combines discrete events (failure/success) with continuous
observations (only obtained upon success). We demonstrate the effectiveness of
our framework on simulated benchmarks and on a real jumping quadruped, where
the constraint threshold is unknown a priori. Experimental data is collected,
by means of constrained Bayesian optimization, directly on the real robot. Our
results outperform manual tuning and GPCR proves useful on estimating the
constraint threshold.
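The crash-constraint setting described in the abstract can be illustrated with a small sketch. This is not the paper's GPCR model: as a crude stand-in, it fits one GP to the cost values observed on successful runs and a second GP to +1/-1 success/crash labels, then weights expected improvement by the estimated success probability. The 1-D toy experiment, the hidden crash threshold of 0.7, the kernel length scale, and all function names are assumptions made for illustration only.

```python
import math
import numpy as np

def rbf(a, b, length=0.2):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(x_train, y_train, x_query, noise=1e-3):
    """Posterior mean and standard deviation of a zero-mean, unit-variance GP."""
    k = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    k_s = rbf(x_train, x_query)
    mean = k_s.T @ np.linalg.solve(k, y_train)
    var = 1.0 - np.sum(k_s * np.linalg.solve(k, k_s), axis=0)
    return mean, np.sqrt(np.clip(var, 1e-12, None))

def norm_cdf(z):
    return 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))

def norm_pdf(z):
    return np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)

def next_query(x_all, crashed, y_ok, x_grid):
    """Pick the next parameter by success-weighted expected improvement."""
    x_ok = x_all[~crashed]
    # Cost model: trained on successful runs only (crashes return no data).
    mu, sd = gp_posterior(x_ok, y_ok - y_ok.mean(), x_grid)
    mu = mu + y_ok.mean()
    best = y_ok.min()  # we minimize the cost
    z = (best - mu) / sd
    ei = (best - mu) * norm_cdf(z) + sd * norm_pdf(z)
    # Success model (stand-in, NOT GPCR): GP regression on +1/-1 labels.
    labels = np.where(crashed, -1.0, 1.0)
    m, s = gp_posterior(x_all, labels, x_grid, noise=1e-2)
    p_ok = norm_cdf(m / s)
    return x_grid[np.argmax(ei * p_ok)], p_ok

# Hypothetical toy experiment: the robot crashes for x > 0.7 (unknown to the
# learner); the cost (x - 0.6)^2 is observed only when the run succeeds.
x_all = np.array([0.1, 0.3, 0.5, 0.8, 0.95])
crashed = x_all > 0.7
y_ok = (x_all[~crashed] - 0.6) ** 2
x_grid = np.linspace(0.0, 1.0, 101)

x_next, p_ok = next_query(x_all, crashed, y_ok, x_grid)
print(f"next parameter to try: {x_next:.2f}")
```

The point of the sketch is the data layout: crashed runs contribute only a binary label, while successful runs contribute both a label and a continuous observation. The paper's GPCR combines both kinds of information in a single constraint model, which this two-model stand-in does not attempt.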
Related papers
- Your Learned Constraint is Secretly a Backward Reachable Tube [27.63547210632307]
We show that ICL recovers the set of states where failure is inevitable, rather than the set of states where failure has already happened.
In contrast to the failure set, the BRT depends on the dynamics of the data collection system.
We discuss the implications of the dynamics-conditionedness of the recovered constraint on both the sample-efficiency of policy search and the transferability of learned constraints.
arXiv Detail & Related papers (2025-01-26T17:54:43Z)
- Positive-Unlabeled Constraint Learning for Inferring Nonlinear Continuous Constraints Functions from Expert Demonstrations [8.361428709513476]
Planning for diverse real-world robotic tasks requires knowing and writing down all constraints.
This paper presents a novel two-step Positive-Unlabeled Constraint Learning (PUCL) algorithm to infer a continuous constraint function from demonstrations.
It successfully infers the continuous nonlinear constraints and outperforms other baseline methods in terms of constraint accuracy and policy safety.
arXiv Detail & Related papers (2024-08-03T01:09:48Z)
- Learning Constraint Network from Demonstrations via Positive-Unlabeled Learning with Memory Replay [8.361428709513476]
This paper presents a positive-unlabeled (PU) learning approach to infer a continuous, arbitrary and possibly nonlinear, constraint from demonstration.
The effectiveness of the proposed method is validated in two Mujoco environments.
arXiv Detail & Related papers (2024-07-23T14:00:18Z)
- Truly No-Regret Learning in Constrained MDPs [61.78619476991494]
We propose a model-based primal-dual algorithm to learn in an unknown CMDP.
We prove that our algorithm achieves sublinear regret without error cancellations.
arXiv Detail & Related papers (2024-02-24T09:47:46Z)
- Enhancing Consistency and Mitigating Bias: A Data Replay Approach for Incremental Learning [100.7407460674153]
Deep learning systems are prone to catastrophic forgetting when learning from a sequence of tasks.
To mitigate the problem, a line of methods proposes to replay the data of experienced tasks when learning new tasks.
However, this is often impractical given memory constraints or data privacy issues.
As a replacement, data-free data replay methods are proposed by inverting samples from the classification model.
arXiv Detail & Related papers (2024-01-12T12:51:12Z)
- Optimal decision making in robotic assembly and other trial-and-error tasks [1.0660480034605238]
We study a class of problems providing (1) low-entropy indicators of terminal success / failure, and (2) unreliable (high-entropy) data to predict the final outcome of an ongoing task.
We derive a closed form solution that predicts makespan based on the confusion matrix of the failure predictor.
This allows the robot to learn failure prediction in a production environment, and only adopt a preemptive policy when it actually saves time.
arXiv Detail & Related papers (2023-01-25T22:07:50Z)
- Fast and Accurate Error Simulation for CNNs against Soft Errors [64.54260986994163]
We present a framework for the reliability analysis of Convolutional Neural Networks (CNNs) via an error simulation engine.
These error models are defined based on the corruption patterns of the output of the CNN operators induced by faults.
We show that our methodology achieves about 99% accuracy of the fault effects w.r.t. SASSIFI, and a speedup ranging from 44x up to 63x w.r.t. FI, that only implements a limited set of error models.
arXiv Detail & Related papers (2022-06-04T19:45:02Z)
- Learning Robust Output Control Barrier Functions from Safe Expert Demonstrations [50.37808220291108]
This paper addresses learning safe output feedback control laws from partial observations of expert demonstrations.
We first propose robust output control barrier functions (ROCBFs) as a means to guarantee safety.
We then formulate an optimization problem to learn ROCBFs from expert demonstrations that exhibit safe system behavior.
arXiv Detail & Related papers (2021-11-18T23:21:00Z)
- Excursion Search for Constrained Bayesian Optimization under a Limited Budget of Failures [62.41541049302712]
We propose a novel decision maker grounded in control theory that controls the amount of risk we allow in the search as a function of a given budget of failures.
Our algorithm uses the failures budget more efficiently in a variety of optimization experiments, and generally achieves lower regret, than state-of-the-art methods.
arXiv Detail & Related papers (2020-05-15T09:54:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.