Time-varying Gaussian Process Bandit Optimization with Non-constant
Evaluation Time
- URL: http://arxiv.org/abs/2003.04691v2
- Date: Wed, 11 Mar 2020 00:38:08 GMT
- Title: Time-varying Gaussian Process Bandit Optimization with Non-constant
Evaluation Time
- Authors: Hideaki Imamura, Nontawat Charoenphakdee, Futoshi Futami, Issei Sato,
Junya Honda, Masashi Sugiyama
- Abstract summary: We propose a novel time-varying Bayesian optimization algorithm that can effectively handle the non-constant evaluation time.
Our bound elucidates that a pattern of the evaluation time sequence can hugely affect the difficulty of the problem.
- Score: 93.6788993843846
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Gaussian process bandit is a problem in which we want to find a maximizer
of a black-box function with the minimum number of function evaluations. If the
black-box function varies with time, then time-varying Bayesian optimization is
a promising framework. However, a drawback with current methods is in the
assumption that the evaluation time for every observation is constant, which
can be unrealistic for many practical applications, e.g., recommender systems
and environmental monitoring. As a result, the performance of current methods
can be degraded when this assumption is violated. To cope with this problem, we
propose a novel time-varying Bayesian optimization algorithm that can
effectively handle the non-constant evaluation time. Furthermore, we
theoretically establish a regret bound of our algorithm. Our bound elucidates
that a pattern of the evaluation time sequence can hugely affect the difficulty
of the problem. We also provide experimental results to validate the practical
effectiveness of the proposed method.
Related papers
- Time-Varying Gaussian Process Bandits with Unknown Prior [18.93478528448966]
PE-GP-UCB is capable of solving time-varying Bayesian optimisation problems.
It relies on the fact that either the observed function values are consistent with some of the priors.
arXiv Detail & Related papers (2024-02-02T18:52:16Z) - Primal-Dual Contextual Bayesian Optimization for Control System Online
Optimization with Time-Average Constraints [21.38692458445459]
This paper studies the problem of online performance optimization of constrained closed-loop control systems.
A primal-dual contextual Bayesian optimization algorithm is proposed that achieves sublinear cumulative regret with respect to the dynamic optimal solution.
arXiv Detail & Related papers (2023-04-12T18:37:52Z) - Maximum-Likelihood Inverse Reinforcement Learning with Finite-Time
Guarantees [56.848265937921354]
Inverse reinforcement learning (IRL) aims to recover the reward function and the associated optimal policy.
Many algorithms for IRL have an inherently nested structure.
We develop a novel single-loop algorithm for IRL that does not compromise reward estimation accuracy.
arXiv Detail & Related papers (2022-10-04T17:13:45Z) - Bayesian Optimization under Stochastic Delayed Feedback [36.16843889404038]
Existing BO methods assume that the function evaluation (feedback) is available to the learner immediately or after a fixed delay.
We propose algorithms with sub-linear regret guarantees that efficiently address the dilemma of selecting new function queries while waiting for delayed feedback.
arXiv Detail & Related papers (2022-06-19T07:34:08Z) - Predictor-corrector algorithms for stochastic optimization under gradual
distribution shift [26.897316325189212]
Time-varying optimization problems frequently arise in machine learning practice.
We exploit this underlying continuity by developing predictor-corrector algorithms for time-varying optimizations.
arXiv Detail & Related papers (2022-05-26T18:33:00Z) - Bayesian Algorithm Execution: Estimating Computable Properties of
Black-box Functions Using Mutual Information [78.78486761923855]
In many real world problems, we want to infer some property of an expensive black-box function f, given a budget of T function evaluations.
We present a procedure, InfoBAX, that sequentially chooses queries that maximize mutual information with respect to the algorithm's output.
On these problems, InfoBAX uses up to 500 times fewer queries to f than required by the original algorithm.
arXiv Detail & Related papers (2021-04-19T17:22:11Z) - A Framework for Sample Efficient Interval Estimation with Control
Variates [94.32811054797148]
We consider the problem of estimating confidence intervals for the mean of a random variable.
Under certain conditions, we show improved efficiency compared to existing estimation algorithms.
arXiv Detail & Related papers (2020-06-18T05:42:30Z) - Exploiting Higher Order Smoothness in Derivative-free Optimization and
Continuous Bandits [99.70167985955352]
We study the problem of zero-order optimization of a strongly convex function.
We consider a randomized approximation of the projected gradient descent algorithm.
Our results imply that the zero-order algorithm is nearly optimal in terms of sample complexity and the problem parameters.
arXiv Detail & Related papers (2020-06-14T10:42:23Z) - Optimizing for the Future in Non-Stationary MDPs [52.373873622008944]
We present a policy gradient algorithm that maximizes a forecast of future performance.
We show that our algorithm, called Prognosticator, is more robust to non-stationarity than two online adaptation techniques.
arXiv Detail & Related papers (2020-05-17T03:41:19Z) - Active Learning for Identification of Linear Dynamical Systems [12.056495277232118]
We show a finite time bound estimation rate our algorithm attains.
We analyze several examples where our algorithm provably improves over rates obtained by playing noise.
arXiv Detail & Related papers (2020-02-02T21:30:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.