Excursion Search for Constrained Bayesian Optimization under a Limited Budget of Failures
- URL: http://arxiv.org/abs/2005.07443v1
- Date: Fri, 15 May 2020 09:54:09 GMT
- Title: Excursion Search for Constrained Bayesian Optimization under a Limited Budget of Failures
- Authors: Alonso Marco, Alexander von Rohr, Dominik Baumann, José Miguel Hernández-Lobato and Sebastian Trimpe
- Abstract summary: We propose a novel decision maker grounded in control theory that controls the amount of risk we allow in the search as a function of a given budget of failures.
Our algorithm uses the failures budget more efficiently in a variety of optimization experiments, and generally achieves lower regret, than state-of-the-art methods.
- Score: 62.41541049302712
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: When learning to ride a bike, a child falls down a number of times before
achieving the first success. As falling down usually has only mild
consequences, it can be seen as a tolerable failure in exchange for a faster
learning process, as it provides rich information about an undesired behavior.
In the context of Bayesian optimization under unknown constraints (BOC),
typical strategies for safe learning explore conservatively and avoid failures
by all means. On the other side of the spectrum, non-conservative BOC
algorithms that allow failing may fail an unbounded number of times before
reaching the optimum. In this work, we propose a novel decision maker grounded
in control theory that controls the amount of risk we allow in the search as a
function of a given budget of failures. Empirical validation shows that our
algorithm uses the failures budget more efficiently in a variety of
optimization experiments, and generally achieves lower regret, than
state-of-the-art methods. In addition, we propose an original algorithm for
unconstrained Bayesian optimization inspired by the notion of excursion sets in
stochastic processes, upon which the failures-aware algorithm is built.
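To make the setting concrete: in BOC with a failure budget, every evaluation that violates the unknown constraint consumes one unit of budget, so the search must modulate how much risk it takes as the budget is spent. The sketch below illustrates only that problem setting, using a naive linear risk schedule and a penalized lower-confidence-bound acquisition; it is not the paper's control-theoretic decision maker or its excursion-based acquisition, and all constants, kernels, and helper functions are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Toy 1-D problem: minimize f subject to g(x) <= 0; evaluating an
# infeasible point (g > 0) counts as one failure.
def f(x): return np.sin(3 * x) + 0.5 * x
def g(x): return x - 0.8

rng = np.random.default_rng(0)
BUDGET = 5                      # allowed failures (hypothetical value)
X, yf, yg, failures = [[0.1]], [f(0.1)], [g(0.1)], 0

gp_f = GaussianProcessRegressor(kernel=RBF(0.2), alpha=1e-4)
gp_g = GaussianProcessRegressor(kernel=RBF(0.2), alpha=1e-4)

for t in range(30):
    gp_f.fit(np.array(X), np.array(yf))   # surrogate for the objective
    gp_g.fit(np.array(X), np.array(yg))   # surrogate for the constraint
    # Naive risk schedule: the tolerated failure probability shrinks as the
    # budget is consumed (the paper derives this from control theory instead).
    rho = 0.5 * max(1 - failures / BUDGET, 0.0)
    cand = rng.uniform(0, 1, size=(256, 1))
    mu_f, sd_f = gp_f.predict(cand, return_std=True)
    mu_g, sd_g = gp_g.predict(cand, return_std=True)
    p_fail = 1 - norm.cdf(-mu_g / np.maximum(sd_g, 1e-9))   # P[g(x) > 0]
    score = mu_f - sd_f + 1e3 * np.maximum(p_fail - rho, 0)  # penalized LCB
    x = float(cand[np.argmin(score), 0])
    failures += g(x) > 0
    X.append([x]); yf.append(f(x)); yg.append(g(x))
    if failures >= BUDGET:
        break                   # budget exhausted: stop exploring

feasible = [v for v, c in zip(yf, yg) if c <= 0]
print(f"best feasible value: {min(feasible):.3f}, failures used: {failures}")
```

A budgeted run like this stops once the failure budget is spent; the paper's contribution lies precisely in choosing the risk level in a principled, control-theoretic way rather than by the fixed schedule assumed here.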
Related papers
- Truly No-Regret Learning in Constrained MDPs [61.78619476991494]
We propose a model-based primal-dual algorithm to learn in an unknown CMDP.
We prove that our algorithm achieves sublinear regret without error cancellations.
arXiv Detail & Related papers (2024-02-24T09:47:46Z) - Self-adaptive algorithms for quasiconvex programming and applications to
machine learning [0.0]
We provide a self-adaptive step-size strategy that does not include convex line-search techniques and a generic approach under mild assumptions.
The proposed method is verified by preliminary results from some computational examples.
To demonstrate the effectiveness of the proposed technique for large-scale problems, we apply it to some experiments on machine learning.
arXiv Detail & Related papers (2022-12-13T05:30:29Z) - Adapting to Misspecification in Contextual Bandits [82.55565343668246]
We introduce a new family of oracle-efficient algorithms for $\varepsilon$-misspecified contextual bandits.
We obtain the first algorithm that achieves the optimal $O(d\sqrt{T} + \varepsilon\sqrt{d}T)$ regret bound for unknown misspecification level.
arXiv Detail & Related papers (2021-07-12T21:30:41Z) - Efficient First-Order Contextual Bandits: Prediction, Allocation, and
Triangular Discrimination [82.52105963476703]
A recurring theme in statistical learning, online learning, and beyond is that faster convergence rates are possible for problems with low noise.
First-order guarantees are relatively well understood in statistical and online learning.
We show that the logarithmic loss and an information-theoretic quantity called the triangular discrimination play a fundamental role in obtaining first-order guarantees.
arXiv Detail & Related papers (2021-07-05T19:20:34Z) - Lenient Regret and Good-Action Identification in Gaussian Process
Bandits [43.03669155559218]
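The triangular discrimination named in the summary above has a standard closed form, $\Delta(p, q) = \sum_i (p_i - q_i)^2 / (p_i + q_i)$, an f-divergence within constant factors of the squared Hellinger distance. A minimal sketch (the function name and example distributions are ours, not the paper's):

```python
import numpy as np

def triangular_discrimination(p, q):
    """Delta(p, q) = sum_i (p_i - q_i)^2 / (p_i + q_i) for two
    discrete distributions given as arrays of probabilities."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = (p + q) > 0                      # skip terms with zero mass
    return float(np.sum((p[mask] - q[mask]) ** 2 / (p[mask] + q[mask])))

print(triangular_discrimination([0.5, 0.5], [0.9, 0.1]))  # ~0.381
```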
- Lenient Regret and Good-Action Identification in Gaussian Process Bandits [43.03669155559218]
We study the problem of Gaussian process (GP) bandits under relaxed optimization criteria stating that any function value above a certain threshold is "good enough".
On the practical side, we consider the problem of finding a single "good action" according to a known pre-specified threshold, and introduce several good-action identification algorithms that exploit knowledge of the threshold.
arXiv Detail & Related papers (2021-02-11T01:16:58Z) - Experimental Design for Regret Minimization in Linear Bandits [19.8309784360219]
- Experimental Design for Regret Minimization in Linear Bandits [19.8309784360219]
We propose a novel experimental-design-based algorithm to minimize regret in online linear and combinatorial bandits.
We provide state-of-the-art finite-time regret guarantees and show that our algorithm can be applied in both the bandit and semi-bandit feedback regimes.
arXiv Detail & Related papers (2020-11-01T17:59:19Z) - An Asymptotically Optimal Primal-Dual Incremental Algorithm for
Contextual Linear Bandits [129.1029690825929]
We introduce a novel algorithm improving over the state-of-the-art along multiple dimensions.
We establish minimax optimality for any learning horizon in the special case of non-contextual linear bandits.
arXiv Detail & Related papers (2020-10-23T09:12:47Z) - Learning and Planning in Average-Reward Markov Decision Processes [15.586087060535398]
We introduce learning and planning algorithms for average-reward MDPs.
All of our algorithms are based on using the temporal-difference error rather than the conventional error when updating the estimate of the average reward.
arXiv Detail & Related papers (2020-06-29T19:03:24Z) - Corruption-Tolerant Gaussian Process Bandit Optimization [130.60115798580136]
- Corruption-Tolerant Gaussian Process Bandit Optimization [130.60115798580136]
We consider the problem of optimizing an unknown (typically non-convex) function with a bounded norm.
We introduce Fast-Slow GP-UCB, an algorithm that combines a "fast but non-robust" and a "slow but robust" GP-UCB component.
We argue that certain dependencies on the corruption level cannot be improved.
arXiv Detail & Related papers (2020-03-04T09:46:58Z)