Weighted Gaussian Process Bandits for Non-stationary Environments
- URL: http://arxiv.org/abs/2107.02371v1
- Date: Tue, 6 Jul 2021 03:37:33 GMT
- Title: Weighted Gaussian Process Bandits for Non-stationary Environments
- Authors: Yuntian Deng, Xingyu Zhou, Baekjin Kim, Ambuj Tewari, Abhishek Gupta,
Ness Shroff
- Abstract summary: We develop WGP-UCB, a novel UCB-type algorithm based on weighted Gaussian process regression.
A key challenge is how to cope with infinite-dimensional feature maps.
We provide universal upper bounds and weight-dependent upper bounds for weighted maximum information gains.
- Score: 30.49357960656324
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we consider the Gaussian process (GP) bandit optimization
problem in a non-stationary environment. To capture external changes, the
black-box function is allowed to be time-varying within a reproducing kernel
Hilbert space (RKHS). To this end, we develop WGP-UCB, a novel UCB-type
algorithm based on weighted Gaussian process regression. A key challenge is how
to cope with infinite-dimensional feature maps. To that end, we leverage kernel
approximation techniques to prove a sublinear regret bound, which is the first
(frequentist) sublinear regret guarantee on weighted time-varying bandits with
general nonlinear rewards. This result generalizes both non-stationary linear
bandits and standard GP-UCB algorithms. Further, we establish a novel
concentration inequality for weighted Gaussian process regression with general
weights. We also provide universal upper bounds and weight-dependent upper
bounds for weighted maximum information gains. These results are potentially of
independent interest for applications such as news ranking and adaptive
pricing, where weights can be adopted to capture the importance or quality of
data. Finally, we conduct experiments demonstrating that the proposed
algorithm outperforms existing methods in many settings.
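A minimal sketch of the weighted-GP-regression idea behind a WGP-UCB-style loop is given below. The discount factor, the exploration parameter beta, and the posterior-variance proxy are illustrative assumptions, not the paper's exact construction:

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    """Squared-exponential kernel matrix between the rows of A and B."""
    d2 = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2 * A @ B.T
    return np.exp(-0.5 * d2 / lengthscale**2)

def weighted_gp_posterior(X, y, Xq, weights, lam=1.0):
    """Weighted kernel ridge posterior: minimizes
    sum_i w_i (y_i - f(x_i))^2 + lam * ||f||_H^2, whose mean is
    m(x) = k(x)^T (K + lam * W^{-1})^{-1} y. The variance below is a
    GP-UCB-style proxy, not the exact expression from the paper."""
    K = rbf_kernel(X, X)
    Winv = np.diag(1.0 / weights)
    Ainv = np.linalg.inv(K + lam * Winv)
    kq = rbf_kernel(Xq, X)
    mean = kq @ Ainv @ y
    var = 1.0 - np.einsum('ij,jk,ik->i', kq, Ainv, kq)  # k(x,x) = 1 for RBF
    return mean, np.maximum(var, 0.0)

# WGP-UCB-style loop: geometric discounting so recent data dominates
rng = np.random.default_rng(0)
candidates = np.linspace(0.0, 1.0, 200)[:, None]
X_hist, y_hist, gamma, beta = [], [], 0.9, 2.0
for t in range(30):
    if X_hist:
        w = gamma ** np.arange(len(X_hist) - 1, -1, -1)  # newest weight = 1
        mu, var = weighted_gp_posterior(np.vstack(X_hist), np.array(y_hist),
                                        candidates, w)
        x = candidates[np.argmax(mu + beta * np.sqrt(var))]
    else:
        x = candidates[rng.integers(len(candidates))]
    reward = np.sin(6 * x[0] + 0.1 * t) + 0.1 * rng.standard_normal()  # drifting toy objective
    X_hist.append(x)
    y_hist.append(reward)
```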
Related papers
- Calibrated Computation-Aware Gaussian Processes [1.1470070927586018]
We propose CAGP-GS, a new computation-aware Gaussian process (CAGP) framework based on Gauss-Seidel iterations for the underlying probabilistic linear solver.
We assess calibration on a synthetic problem and compare performance to existing approaches on a large-scale global temperature regression problem.
arXiv Detail & Related papers (2024-10-11T13:30:08Z)
- Variance-Dependent Regret Bounds for Non-stationary Linear Bandits [52.872628573907434]
We propose algorithms that utilize the variance of the reward distribution as well as the total variation budget $B_K$, and show that they can achieve tighter regret upper bounds.
We introduce two novel algorithms: Restarted Weighted$\text{OFUL}^+$ and Restarted $\text{SAVE}^+$.
Notably, when the total variance $V_K$ is much smaller than $K$, our algorithms outperform previous state-of-the-art results on non-stationary linear bandits under different settings.
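As a rough illustration of the variance-weighting idea at the core of such algorithms (the confidence sets and the restart schedule are omitted, and the function name is hypothetical):

```python
import numpy as np

def variance_weighted_ridge(X, y, sigma2, lam=1.0):
    """Ridge regression with per-sample weights 1/sigma_i^2: an illustrative
    core of variance-aware weighted estimators; the confidence sets and the
    restart schedule of the actual algorithms are omitted."""
    w = 1.0 / np.asarray(sigma2)
    A = (X * w[:, None]).T @ X + lam * np.eye(X.shape[1])  # weighted Gram matrix
    b = (X * w[:, None]).T @ y
    return np.linalg.solve(A, b)  # theta_hat
```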
arXiv Detail & Related papers (2024-03-15T23:36:55Z)
- On the Sublinear Regret of GP-UCB [58.25014663727544]
We show that the Gaussian Process Upper Confidence Bound (GP-UCB) algorithm enjoys nearly optimal regret rates.
Our improvements rely on a key technical contribution: regularizing kernel ridge estimators in proportion to the smoothness of the underlying kernel.
arXiv Detail & Related papers (2023-07-14T13:56:11Z)
- Misspecified Gaussian Process Bandit Optimization [59.30399661155574]
Kernelized bandit algorithms have shown strong empirical and theoretical performance for Gaussian process bandit optimization.
We introduce a misspecified kernelized bandit setting in which the unknown function can be $\epsilon$-uniformly approximated by a function with a bounded norm in some reproducing kernel Hilbert space (RKHS).
We show that our algorithm achieves optimal dependence on $\epsilon$ with no prior knowledge of the misspecification.
arXiv Detail & Related papers (2021-11-09T09:00:02Z)
- Non-Gaussian Gaussian Processes for Few-Shot Regression [71.33730039795921]
We propose Non-Gaussian Gaussian Processes (NGGPs) with an invertible ODE-based mapping that operates on each component of the random variable vectors and shares its parameters across all of them.
NGGPs outperform the competing state-of-the-art approaches on a diversified set of benchmarks and applications.
arXiv Detail & Related papers (2021-10-26T10:45:25Z)
- Scalable Variational Gaussian Processes via Harmonic Kernel Decomposition [54.07797071198249]
We introduce a new scalable variational Gaussian process approximation that provides high fidelity while retaining general applicability.
We demonstrate that, on a range of regression and classification problems, our approach can exploit input space symmetries such as translations and reflections.
Notably, our approach achieves state-of-the-art results on CIFAR-10 among pure GP models.
arXiv Detail & Related papers (2021-06-10T18:17:57Z)
- No-Regret Algorithms for Time-Varying Bayesian Optimization [0.0]
We adopt the general variation budget model to capture the time-varying environment.
We introduce two GP-UCB-type algorithms, R-GP-UCB and SW-GP-UCB, based on restarting and sliding windows, respectively.
Our results not only recover previous linear bandit results when a linear kernel is used, but also complement the previous regret analysis of time-varying Gaussian process bandits.
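A minimal sketch of the sliding-window variant, assuming a unit-variance RBF kernel and illustrative defaults:

```python
import numpy as np

def rbf(A, B, ls=1.0):
    d2 = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2 * A @ B.T
    return np.exp(-0.5 * d2 / ls**2)

def sw_gp_ucb_step(X, y, candidates, window=50, lam=1.0, beta=2.0):
    """One sliding-window GP-UCB step: fit a standard GP posterior on the
    most recent `window` observations only and maximize mean + beta * std."""
    Xw, yw = X[-window:], y[-window:]
    K = rbf(Xw, Xw) + lam * np.eye(len(yw))
    kq = rbf(candidates, Xw)
    mu = kq @ np.linalg.solve(K, yw)
    var = 1.0 - np.einsum('ij,ij->i', kq, np.linalg.solve(K, kq.T).T)
    return candidates[np.argmax(mu + beta * np.sqrt(np.maximum(var, 0.0)))]
```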
arXiv Detail & Related papers (2021-02-11T22:35:32Z)
- Robust Gaussian Process Regression Based on Iterative Trimming [6.912744078749024]
This paper presents a new robust GP regression algorithm that iteratively trims the most extreme data points.
It can greatly improve the model accuracy for contaminated data even in the presence of extreme or abundant outliers.
As a practical example in the astrophysical study, we show that this method can precisely determine the main-sequence ridge line in the color-magnitude diagram of star clusters.
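A minimal sketch of the iterative-trimming idea, with a fixed keep fraction and iteration count standing in for the paper's actual schedule:

```python
import numpy as np

def rbf(A, B, ls=1.0):
    d2 = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2 * A @ B.T
    return np.exp(-0.5 * d2 / ls**2)

def trimmed_gp_fit(X, y, keep=0.8, n_iter=5, lam=0.1):
    """Iterative trimming: fit a GP on the kept subset, score every point
    by |residual| under that fit, then keep the fraction with the smallest
    residuals. Returns the indices of the points retained at the end."""
    idx = np.arange(len(y))
    n_keep = int(np.ceil(keep * len(y)))
    for _ in range(n_iter):
        K = rbf(X[idx], X[idx]) + lam * np.eye(len(idx))
        mu = rbf(X, X[idx]) @ np.linalg.solve(K, y[idx])  # predict everywhere
        idx = np.argsort(np.abs(y - mu))[:n_keep]         # drop extreme points
    return idx
```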
arXiv Detail & Related papers (2020-11-22T16:43:35Z)
- Early stopping and polynomial smoothing in regression with reproducing kernels [2.132096006921048]
We study the problem of early stopping for iterative learning algorithms in a reproducing kernel Hilbert space (RKHS).
We present a data-driven rule to perform early stopping without a validation set that is based on the so-called minimum discrepancy principle.
The proposed rule is proved to be minimax-optimal over different types of kernel spaces.
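For intuition, a minimum-discrepancy-style stopping rule for kernel gradient descent might look as follows; the paper's data-driven rule is more refined, and the noise level is assumed known in this sketch:

```python
import numpy as np

def rbf(A, B, ls=1.0):
    d2 = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2 * A @ B.T
    return np.exp(-0.5 * d2 / ls**2)

def kernel_gd_early_stop(X, y, sigma2, max_iter=1000):
    """Kernel gradient descent halted by a minimum-discrepancy-style rule:
    stop at the first iteration whose mean squared training residual falls
    below the noise level sigma2 (assumed known here)."""
    K = rbf(X, X)
    step = 1.0 / np.linalg.eigvalsh(K)[-1]  # 1 / largest eigenvalue, stable step
    alpha = np.zeros(len(y))
    for t in range(max_iter):
        resid = y - K @ alpha
        if np.mean(resid**2) <= sigma2:     # minimum discrepancy rule fires
            break
        alpha += step * resid               # functional gradient step
    return alpha, t
```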
arXiv Detail & Related papers (2020-07-14T05:27:18Z)
- Regret and Belief Complexity Trade-off in Gaussian Process Bandits via Information Thresholding [42.669970064867556]
We show how to characterize the trade-off between regret bounds of GP bandit algorithms and complexity of the posterior distributions.
We observe state-of-the-art accuracy and complexity trade-offs for GP bandit algorithms applied to global optimization.
arXiv Detail & Related papers (2020-03-23T21:05:15Z)
- Near-linear Time Gaussian Process Optimization with Adaptive Batching and Resparsification [119.41129787351092]
We introduce BBKB, the first no-regret GP optimization algorithm that provably runs in near-linear time and selects candidates in batches.
We show that the same bound can be used to adaptively delay costly updates to the sparse GP approximation, achieving a near-constant per-step amortized cost.
arXiv Detail & Related papers (2020-02-23T17:43:29Z)