Information-Theoretic Safe Exploration with Gaussian Processes
- URL: http://arxiv.org/abs/2212.04914v1
- Date: Fri, 9 Dec 2022 15:23:58 GMT
- Title: Information-Theoretic Safe Exploration with Gaussian Processes
- Authors: Alessandro G. Bottero, Carlos E. Luis, Julia Vinogradska, Felix Berkenkamp, Jan Peters
- Abstract summary: We consider a sequential decision making task where we are not allowed to evaluate parameters that violate an unknown (safety) constraint.
Most current methods rely on a discretization of the domain and cannot be directly extended to the continuous case.
We propose an information-theoretic safe exploration criterion that directly exploits the GP posterior to identify the most informative safe parameters to evaluate.
- Score: 89.31922008981735
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider a sequential decision making task where we are not allowed to
evaluate parameters that violate an a priori unknown (safety) constraint. A
common approach is to place a Gaussian process prior on the unknown constraint
and allow evaluations only in regions that are safe with high probability. Most
current methods rely on a discretization of the domain and cannot be directly
extended to the continuous case. Moreover, the way in which they exploit
regularity assumptions about the constraint introduces an additional critical
hyperparameter. In this paper, we propose an information-theoretic safe
exploration criterion that directly exploits the GP posterior to identify the
most informative safe parameters to evaluate. Our approach is naturally
applicable to continuous domains and does not require additional
hyperparameters. We theoretically analyze the method and show that we do not
violate the safety constraint with high probability and that we explore by
learning about the constraint up to arbitrary precision. Empirical evaluations
demonstrate improved data-efficiency and scalability.
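
The "common approach" described in the abstract (a Gaussian process prior on the constraint, with evaluations allowed only where the constraint is safe with high probability) can be sketched as follows. This is a minimal illustration and not the authors' method: the kernel, its hyperparameters, the confidence multiplier `beta`, the toy constraint `1 - x^2`, and the helper names are all assumptions, and the acquisition rule (largest posterior standard deviation inside the safe set) is a simple stand-in for the paper's information-theoretic criterion. The grid is used only to keep the example short; the paper targets continuous domains.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.3, variance=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / lengthscale**2)

def gp_posterior(x_train, y_train, x_test, noise_var=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP at x_test."""
    k_tt = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    k_ts = rbf_kernel(x_train, x_test)
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_ts.T @ alpha
    v = np.linalg.solve(chol, k_ts)
    var = rbf_kernel(x_test, x_test).diagonal() - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

def safe_mask(mean, std, beta=2.0):
    """Points whose lower confidence bound on the constraint is non-negative."""
    return mean - beta * std >= 0.0

def next_query(x_grid, mean, std, beta=2.0):
    """Stand-in acquisition: the most uncertain point inside the safe set."""
    mask = safe_mask(mean, std, beta)
    if not mask.any():
        return None
    idx = np.flatnonzero(mask)
    return x_grid[idx[np.argmax(std[idx])]]

# Hypothetical constraint f(x) = 1 - x^2; a parameter is safe when f(x) >= 0.
def constraint(x):
    return 1.0 - x**2

x_train = np.array([0.0])            # a known safe starting point
y_train = constraint(x_train)
x_grid = np.linspace(-2.0, 2.0, 201)
mean, std = gp_posterior(x_train, y_train, x_grid)
mask = safe_mask(mean, std)
x_next = next_query(x_grid, mean, std)
```

With a single safe observation, the safe set is a small neighborhood of the starting point, and the stand-in rule picks a point on its boundary, where the posterior is least certain. Repeating the loop (evaluate, update the posterior, recompute the safe set) gradually expands the region that can be explored.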
Related papers
- Information-Theoretic Safe Bayesian Optimization [59.758009422067005]
We consider a sequential decision making task, where the goal is to optimize an unknown function without evaluating parameters that violate an unknown (safety) constraint.
Most current methods rely on a discretization of the domain and cannot be directly extended to the continuous case.
We propose an information-theoretic safe exploration criterion that directly exploits the GP posterior to identify the most informative safe parameters to evaluate.
arXiv Detail & Related papers (2024-02-23T14:31:10Z)
- Online Constraint Tightening in Stochastic Model Predictive Control: A Regression Approach [49.056933332667114]
No analytical solutions exist for chance-constrained optimal control problems.
We propose a data-driven approach for learning the constraint-tightening parameters online during control.
Our approach yields constraint-tightening parameters that tightly satisfy the chance constraints.
arXiv Detail & Related papers (2023-10-04T16:22:02Z)
- Kernel Conditional Moment Constraints for Confounding Robust Inference [22.816690686310714]
We study policy evaluation of offline contextual bandits subject to unobserved confounders.
We propose a general estimator that provides a sharp lower bound of the policy value.
arXiv Detail & Related papers (2023-02-26T16:44:13Z)
- Log Barriers for Safe Black-box Optimization with Application to Safe Reinforcement Learning [72.97229770329214]
We introduce a general approach for solving high-dimensional non-linear optimization problems in which maintaining safety during learning is crucial.
Our approach, called LBSGD, is based on applying a logarithmic barrier approximation with a carefully chosen step size.
We demonstrate the effectiveness of our approach on minimizing constraint violation in policy search tasks in safe reinforcement learning.
arXiv Detail & Related papers (2022-07-21T11:14:47Z)
- Robustness Guarantees for Credal Bayesian Networks via Constraint Relaxation over Probabilistic Circuits [16.997060715857987]
We develop a method to quantify the robustness of decision functions with respect to credal Bayesian networks.
We show how to obtain a guaranteed upper bound on MARmax in linear time in the size of the circuit.
arXiv Detail & Related papers (2022-05-11T22:37:07Z)
- Gaussian Process Uniform Error Bounds with Unknown Hyperparameters for Safety-Critical Applications [71.23286211775084]
We introduce robust Gaussian process uniform error bounds in settings with unknown hyperparameters.
Our approach computes a confidence region in the space of hyperparameters, which enables us to obtain a probabilistic upper bound for the model error.
Experiments show that the bound performs significantly better than vanilla and fully Bayesian Gaussian processes.
arXiv Detail & Related papers (2021-09-06T17:10:01Z)
- Learning Probabilistic Ordinal Embeddings for Uncertainty-Aware Regression [91.3373131262391]
Uncertainty is the only certainty there is.
Traditionally, the direct regression formulation is considered and the uncertainty is modeled by modifying the output space to a certain family of probabilistic distributions.
How to model the uncertainty within the present-day technologies for regression remains an open issue.
arXiv Detail & Related papers (2021-03-25T06:56:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.