Related papers: Information-Theoretic Safe Exploration with Gaussian Processes


URL: http://arxiv.org/abs/2212.04914v1
Date: Fri, 9 Dec 2022 15:23:58 GMT
Title: Information-Theoretic Safe Exploration with Gaussian Processes
Authors: Alessandro G. Bottero, Carlos E. Luis, Julia Vinogradska, Felix Berkenkamp, Jan Peters
Abstract summary: We consider a sequential decision making task where we are not allowed to evaluate parameters that violate an unknown (safety) constraint. Most current methods rely on a discretization of the domain and cannot be directly extended to the continuous case. We propose an information-theoretic safe exploration criterion that directly exploits the GP posterior to identify the most informative safe parameters to evaluate.
Score: 89.31922008981735
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We consider a sequential decision making task where we are not allowed to evaluate parameters that violate an a priori unknown (safety) constraint. A common approach is to place a Gaussian process prior on the unknown constraint and allow evaluations only in regions that are safe with high probability. Most current methods rely on a discretization of the domain and cannot be directly extended to the continuous case. Moreover, the way in which they exploit regularity assumptions about the constraint introduces an additional critical hyperparameter. In this paper, we propose an information-theoretic safe exploration criterion that directly exploits the GP posterior to identify the most informative safe parameters to evaluate. Our approach is naturally applicable to continuous domains and does not require additional hyperparameters. We theoretically analyze the method and show that we do not violate the safety constraint with high probability and that we explore by learning about the constraint up to arbitrary precision. Empirical evaluations demonstrate improved data-efficiency and scalability.
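The common approach mentioned in the abstract (a Gaussian process prior on the unknown constraint, with evaluations allowed only where the constraint is satisfied with high probability) can be sketched as follows. This is a minimal illustration, not the paper's information-theoretic acquisition criterion. The RBF kernel, its lengthscale, the confidence multiplier `beta`, the safety threshold of zero, and the toy `constraint` function are all assumptions chosen for the example.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.5, variance=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b (assumed kernel)."""
    diff = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (diff / lengthscale) ** 2)

def gp_posterior(x_train, y_train, x_query, noise_var=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP at x_query."""
    k_tt = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    k_qt = rbf_kernel(x_query, x_train)
    chol = np.linalg.cholesky(k_tt)
    # Solve K^{-1} y via the Cholesky factor for numerical stability.
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_qt @ alpha
    v = np.linalg.solve(chol, k_qt.T)
    var = rbf_kernel(x_query, x_query).diagonal() - np.sum(v ** 2, axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def safe_mask(mean, std, beta=2.0):
    """Mark points whose lower confidence bound is above the threshold 0.

    beta is an assumed confidence multiplier, not a value from the paper.
    """
    return mean - beta * std >= 0.0

def constraint(x):
    """Hypothetical safety constraint: safe where the value is >= 0."""
    return 1.0 - x ** 2

# Start from a single known-safe evaluation at x = 0.
x_train = np.array([0.0])
y_train = constraint(x_train)
x_query = np.linspace(-2.0, 2.0, 81)

mean, std = gp_posterior(x_train, y_train, x_query)
mask = safe_mask(mean, std)
print("points currently classified as safe:", x_query[mask])
```

With one observation, only a small neighbourhood of the known-safe point is classified as safe; a safe exploration method then has to decide which point inside this set to evaluate next in order to learn about the constraint.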

Related papers

Safety in safe Bayesian optimization and its ramifications for control [6.450289319821615]
In control engineering, parameters of a pre-designed controller are often tuned online in feedback with a plant. Machine learning methods, in particular Bayesian optimization (BO), have been deployed for this important problem. We identify two significant obstacles to practical safety. First, SafeOpt-type algorithms rely on quantitative uncertainty bounds, and most implementations replace these by theoretically unsupported heuristics. We propose Lipschitz-only Safe Bayesian Optimization (LoSBO), a safe BO algorithm that relies only on a known Lipschitz bound for its safety.
arXiv Detail & Related papers (2025-01-23T14:24:11Z)
Information-Theoretic Safe Bayesian Optimization [59.758009422067005]
We consider a sequential decision making task, where the goal is to optimize an unknown function without evaluating parameters that violate an unknown (safety) constraint. Most current methods rely on a discretization of the domain and cannot be directly extended to the continuous case. We propose an information-theoretic safe exploration criterion that directly exploits the GP posterior to identify the most informative safe parameters to evaluate.
arXiv Detail & Related papers (2024-02-23T14:31:10Z)
Online Constraint Tightening in Stochastic Model Predictive Control: A Regression Approach [49.056933332667114]
No analytical solutions exist for chance-constrained optimal control problems. We propose a data-driven approach for learning the constraint-tightening parameters online during control. Our approach yields constraint-tightening parameters that tightly satisfy the chance constraints.
arXiv Detail & Related papers (2023-10-04T16:22:02Z)
Kernel Conditional Moment Constraints for Confounding Robust Inference [22.816690686310714]
We study policy evaluation of offline contextual bandits subject to unobserved confounders. We propose a general estimator that provides a sharp lower bound of the policy value.
arXiv Detail & Related papers (2023-02-26T16:44:13Z)
Log Barriers for Safe Black-box Optimization with Application to Safe Reinforcement Learning [72.97229770329214]
We introduce a general approach for solving high-dimensional non-linear optimization problems in which maintaining safety during learning is crucial. Our approach, called LBSGD, is based on applying a logarithmic barrier approximation with a carefully chosen step size. We demonstrate the effectiveness of our approach on minimizing constraint violation in policy tasks in safe reinforcement learning.
arXiv Detail & Related papers (2022-07-21T11:14:47Z)
Robustness Guarantees for Credal Bayesian Networks via Constraint Relaxation over Probabilistic Circuits [16.997060715857987]
We develop a method to quantify the robustness of decision functions with respect to credal Bayesian networks. We show how to obtain a guaranteed upper bound on MARmax in linear time in the size of the circuit.
arXiv Detail & Related papers (2022-05-11T22:37:07Z)
Gaussian Process Uniform Error Bounds with Unknown Hyperparameters for Safety-Critical Applications [71.23286211775084]
We introduce robust Gaussian process uniform error bounds in settings with unknown hyperparameters. Our approach computes a confidence region in the space of hyperparameters, which enables us to obtain a probabilistic upper bound for the model error. Experiments show that the bound performs significantly better than vanilla and fully Bayesian processes.
arXiv Detail & Related papers (2021-09-06T17:10:01Z)
Learning Probabilistic Ordinal Embeddings for Uncertainty-Aware Regression [91.3373131262391]
Uncertainty is the only certainty there is. Traditionally, the direct regression formulation is considered and the uncertainty is modeled by modifying the output space to a certain family of probabilistic distributions. How to model the uncertainty within the present-day technologies for regression remains an open issue.
arXiv Detail & Related papers (2021-03-25T06:56:09Z)

This list is automatically generated from the titles and abstracts of the papers in this site.
