Excursion Search for Constrained Bayesian Optimization under a Limited Budget of Failures
- URL: http://arxiv.org/abs/2005.07443v1
- Date: Fri, 15 May 2020 09:54:09 GMT
- Title: Excursion Search for Constrained Bayesian Optimization under a Limited Budget of Failures
- Authors: Alonso Marco, Alexander von Rohr, Dominik Baumann, José Miguel Hernández-Lobato and Sebastian Trimpe
- Abstract summary: We propose a novel decision maker grounded in control theory that controls the amount of risk we allow in the search as a function of a given budget of failures.
Our algorithm uses the failures budget more efficiently in a variety of optimization experiments, and generally achieves lower regret, than state-of-the-art methods.
- Score: 62.41541049302712
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: When learning to ride a bike, a child falls down a number of times before
achieving the first success. As falling down usually has only mild
consequences, it can be seen as a tolerable failure in exchange for a faster
learning process, as it provides rich information about an undesired behavior.
In the context of Bayesian optimization under unknown constraints (BOC),
typical strategies for safe learning explore conservatively and avoid failures
by all means. On the other side of the spectrum, non-conservative BOC
algorithms that allow failing may fail an unbounded number of times before
reaching the optimum. In this work, we propose a novel decision maker grounded
in control theory that controls the amount of risk we allow in the search as a
function of a given budget of failures. Empirical validation shows that our
algorithm uses the failures budget more efficiently in a variety of
optimization experiments, and generally achieves lower regret, than
state-of-the-art methods. In addition, we propose an original algorithm for
unconstrained Bayesian optimization inspired by the notion of excursion sets in
stochastic processes, upon which the failures-aware algorithm is built.
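To make the setting concrete: in BOC with a failure budget, every evaluation that violates the unknown constraint consumes one unit of budget, so the search must modulate how much risk it takes as the budget is spent. The sketch below illustrates only that problem setting, using a naive linear risk schedule and a penalized lower-confidence-bound acquisition; it is not the paper's control-theoretic decision maker or its excursion-based acquisition, and all constants, kernels, and helper functions are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Toy 1-D problem: minimize f subject to g(x) <= 0; evaluating an
# infeasible point (g > 0) counts as one failure.
def f(x): return np.sin(3 * x) + 0.5 * x
def g(x): return x - 0.8

rng = np.random.default_rng(0)
BUDGET = 5                      # allowed failures (hypothetical value)
X, yf, yg, failures = [[0.1]], [f(0.1)], [g(0.1)], 0

gp_f = GaussianProcessRegressor(kernel=RBF(0.2), alpha=1e-4)
gp_g = GaussianProcessRegressor(kernel=RBF(0.2), alpha=1e-4)

for t in range(30):
    gp_f.fit(np.array(X), np.array(yf))   # surrogate for the objective
    gp_g.fit(np.array(X), np.array(yg))   # surrogate for the constraint
    # Naive risk schedule: the tolerated failure probability shrinks as the
    # budget is consumed (the paper derives this from control theory instead).
    rho = 0.5 * max(1 - failures / BUDGET, 0.0)
    cand = rng.uniform(0, 1, size=(256, 1))
    mu_f, sd_f = gp_f.predict(cand, return_std=True)
    mu_g, sd_g = gp_g.predict(cand, return_std=True)
    p_fail = 1 - norm.cdf(-mu_g / np.maximum(sd_g, 1e-9))   # P[g(x) > 0]
    score = mu_f - sd_f + 1e3 * np.maximum(p_fail - rho, 0)  # penalized LCB
    x = float(cand[np.argmin(score), 0])
    failures += g(x) > 0
    X.append([x]); yf.append(f(x)); yg.append(g(x))
    if failures >= BUDGET:
        break                   # budget exhausted: stop exploring

feasible = [v for v, c in zip(yf, yg) if c <= 0]
print(f"best feasible value: {min(feasible):.3f}, failures used: {failures}")
```

A budgeted run like this stops once the failure budget is spent; the paper's contribution lies precisely in choosing the risk level in a principled, control-theoretic way rather than by the fixed schedule assumed here.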
Related papers
- Truly No-Regret Learning in Constrained MDPs [61.78619476991494]
We propose a model-based primal-dual algorithm to learn in an unknown CMDP.
We prove that our algorithm achieves sublinear regret without error cancellations.
arXiv Detail & Related papers (2024-02-24T09:47:46Z) - Self-adaptive algorithms for quasiconvex programming and applications to
machine learning [0.0]
We provide a self-adaptive step-size strategy that does not include convex line-search techniques and a generic approach under mild assumptions.
The proposed method is verified by preliminary results from some computational examples.
To demonstrate the effectiveness of the proposed technique for large-scale problems, we apply it to some experiments on machine learning.
arXiv Detail & Related papers (2022-12-13T05:30:29Z) - Adapting to Misspecification in Contextual Bandits [82.55565343668246]
We introduce a new family of oracle-efficient algorithms for $\varepsilon$-misspecified contextual bandits.
We obtain the first algorithm that achieves the optimal $O(d\sqrt{T} + \varepsilon\sqrt{d}T)$ regret bound for unknown misspecification level.
arXiv Detail & Related papers (2021-07-12T21:30:41Z) - Efficient First-Order Contextual Bandits: Prediction, Allocation, and
Triangular Discrimination [82.52105963476703]
A recurring theme in statistical learning, online learning, and beyond is that faster convergence rates are possible for problems with low noise.
First-order guarantees are relatively well understood in statistical and online learning.
We show that the logarithmic loss and an information-theoretic quantity called the triangular discrimination play a fundamental role in obtaining first-order guarantees.
arXiv Detail & Related papers (2021-07-05T19:20:34Z) - Lenient Regret and Good-Action Identification in Gaussian Process
Bandits [43.03669155559218]
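The triangular discrimination named in the summary above has a standard closed form, $\Delta(p, q) = \sum_i (p_i - q_i)^2 / (p_i + q_i)$, an f-divergence within constant factors of the squared Hellinger distance. A minimal sketch (the function name and example distributions are ours, not the paper's):

```python
import numpy as np

def triangular_discrimination(p, q):
    """Delta(p, q) = sum_i (p_i - q_i)^2 / (p_i + q_i) for two
    discrete distributions given as arrays of probabilities."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = (p + q) > 0                      # skip terms with zero mass
    return float(np.sum((p[mask] - q[mask]) ** 2 / (p[mask] + q[mask])))

print(triangular_discrimination([0.5, 0.5], [0.9, 0.1]))  # ~0.381
```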
- Lenient Regret and Good-Action Identification in Gaussian Process Bandits [43.03669155559218]
We study the problem of Gaussian process (GP) bandits under relaxed optimization criteria stating that any function value above a certain threshold is "good enough".
On the practical side, we consider the problem of finding a single "good action" according to a known pre-specified threshold, and introduce several good-action identification algorithms that exploit knowledge of the threshold.
arXiv Detail & Related papers (2021-02-11T01:16:58Z) - Experimental Design for Regret Minimization in Linear Bandits [19.8309784360219]
- Experimental Design for Regret Minimization in Linear Bandits [19.8309784360219]
We propose a novel experimental-design-based algorithm to minimize regret in online linear and combinatorial bandits.
We provide state-of-the-art finite-time regret guarantees and show that our algorithm can be applied in both the bandit and semi-bandit feedback regimes.
arXiv Detail & Related papers (2020-11-01T17:59:19Z) - An Asymptotically Optimal Primal-Dual Incremental Algorithm for
Contextual Linear Bandits [129.1029690825929]
We introduce a novel algorithm improving over the state-of-the-art along multiple dimensions.
We establish minimax optimality for any learning horizon in the special case of non-contextual linear bandits.
arXiv Detail & Related papers (2020-10-23T09:12:47Z) - Learning and Planning in Average-Reward Markov Decision Processes [15.586087060535398]
We introduce learning and planning algorithms for average-reward MDPs.
All of our algorithms are based on using the temporal-difference error rather than the conventional error when updating the estimate of the average reward.
arXiv Detail & Related papers (2020-06-29T19:03:24Z) - Corruption-Tolerant Gaussian Process Bandit Optimization [130.60115798580136]
- Corruption-Tolerant Gaussian Process Bandit Optimization [130.60115798580136]
We consider the problem of optimizing an unknown (typically non-convex) function with a bounded norm.
We introduce Fast-Slow GP-UCB, an algorithm that combines a "fast but non-robust" and a "slow but robust" GP-UCB component.
We argue that certain dependencies on the corruption level cannot be improved.
arXiv Detail & Related papers (2020-03-04T09:46:58Z)