Tighter Confidence Bounds for Sequential Kernel Regression
- URL: http://arxiv.org/abs/2403.12732v1
- Date: Tue, 19 Mar 2024 13:47:35 GMT
- Title: Tighter Confidence Bounds for Sequential Kernel Regression
- Authors: Hamish Flynn, David Reeb
- Abstract summary: Tighter confidence bounds give rise to algorithms with better empirical performance and better performance guarantees.
We use martingale tail bounds and finite-dimensional reformulations of infinite-dimensional convex programs to establish new confidence bounds for sequential kernel regression.
- Score: 3.683202928838613
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Confidence bounds are an essential tool for rigorously quantifying the uncertainty of predictions. In this capacity, they can inform the exploration-exploitation trade-off and form a core component in many sequential learning and decision-making algorithms. Tighter confidence bounds give rise to algorithms with better empirical performance and better performance guarantees. In this work, we use martingale tail bounds and finite-dimensional reformulations of infinite-dimensional convex programs to establish new confidence bounds for sequential kernel regression. We prove that our new confidence bounds are always tighter than existing ones in this setting. We apply our confidence bounds to the kernel bandit problem, where future actions depend on the previous history. When our confidence bounds replace existing ones, the KernelUCB (GP-UCB) algorithm has better empirical performance, a matching worst-case performance guarantee and comparable computational cost. Our new confidence bounds can be used as a generic tool to design improved algorithms for other kernelised learning and decision-making problems.
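The abstract's application is concrete enough to sketch. In KernelUCB (GP-UCB), each round fits kernel ridge regression to the history and plays the action maximizing an upper confidence bound, the posterior mean plus a width term; the paper's contribution is a tighter, data-dependent replacement for that width. Below is a minimal illustrative sketch in Python. The RBF kernel, the fixed `beta`, and the toy reward function are assumptions for illustration, not the paper's construction.

```python
# Minimal KernelUCB (GP-UCB) sketch with a generic confidence width.
# The RBF kernel, fixed beta, and toy reward below are illustrative
# assumptions; the paper's contribution is a tighter confidence bound
# that would replace the `beta * sigma` width used here.
import numpy as np

def rbf(A, B, ls=0.2):
    d = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d / (2 * ls**2))

rng = np.random.default_rng(0)
actions = np.linspace(0, 1, 50)[:, None]          # candidate arms
f = lambda x: np.sin(6 * x).ravel()               # unknown reward (toy)
lam, beta = 0.1, 2.0                              # ridge weight, width scale
X, y = np.empty((0, 1)), np.empty(0)

for t in range(30):
    if len(y) == 0:
        mu, sigma = np.zeros(len(actions)), np.ones(len(actions))
    else:
        K = rbf(X, X) + lam * np.eye(len(y))      # regularized Gram matrix
        k = rbf(actions, X)                       # cross-covariances
        mu = k @ np.linalg.solve(K, y)            # posterior mean
        var = rbf(actions, actions).diagonal() - np.einsum(
            "ij,ji->i", k, np.linalg.solve(K, k.T))
        sigma = np.sqrt(np.maximum(var, 0))       # posterior std deviation
    a = int(np.argmax(mu + beta * sigma))         # optimistic action choice
    x = actions[a : a + 1]
    X = np.vstack([X, x])
    y = np.append(y, f(x) + 0.1 * rng.standard_normal())

print("best arm found:", actions[int(np.argmax(mu))].item())
```

Everything else in the loop stays the same when a different confidence bound is swapped in, which is why tighter bounds translate directly into better exploration.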
Related papers
- Improved Algorithms for Stochastic Linear Bandits Using Tail Bounds for Martingale Mixtures [26.683757807252675]
We present improved algorithms with worst-case regret guarantees for the linear bandit problem.
We show that our confidence sequences are tighter than competitors, both empirically and theoretically.
arXiv Detail & Related papers (2023-09-25T17:13:46Z)
- Huber-Robust Confidence Sequences [37.16361789841549]
Confidence sequences are confidence intervals that can be sequentially tracked, and are valid at arbitrary data-dependent stopping times.
We show that the resulting confidence sequences attain the optimal width achieved in the nonsequential setting.
Since confidence sequences are a common tool used within A/B/n testing and bandits, these results open the door to sequential experimentation that is robust to outliers and adversarial corruptions.
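For reference, the confidence-sequence idea described above can be illustrated with the simplest textbook construction: Hoeffding intervals for a bounded mean, stitched together with a union bound over rounds so the guarantee holds at every time simultaneously. This is a generic sketch, not the Huber-robust construction of the paper; the failure-probability split delta/(t(t+1)) is one standard choice.

```python
# A minimal confidence sequence for the mean of [0,1]-valued data:
# Hoeffding intervals stitched with a union bound over time, so the
# interval is valid simultaneously at every round (and hence at any
# data-dependent stopping time). Constants here are illustrative, not
# the Huber-robust construction from the paper above.
import numpy as np

def conf_seq(xs, delta=0.05):
    """Yield (lower, upper) bounds valid uniformly over all rounds."""
    s = 0.0
    for t, x in enumerate(xs, start=1):
        s += x
        mean = s / t
        # per-round budget delta/(t*(t+1)) sums to at most delta over all t
        eps = np.sqrt(np.log(2 * t * (t + 1) / delta) / (2 * t))
        yield max(0.0, mean - eps), min(1.0, mean + eps)

rng = np.random.default_rng(1)
data = rng.uniform(0, 1, size=1000)        # true mean 0.5
for t, (lo, hi) in enumerate(conf_seq(data), start=1):
    if t in (10, 100, 1000):
        print(f"t={t}: [{lo:.3f}, {hi:.3f}]")
```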
arXiv Detail & Related papers (2023-01-23T17:29:26Z)
- Fast Entropy-Based Methods of Word-Level Confidence Estimation for End-To-End Automatic Speech Recognition [86.21889574126878]
We show how per-frame entropy values can be normalized and aggregated to obtain a confidence measure per unit and per word.
We evaluate the proposed confidence measures on LibriSpeech test sets, and show that they are up to 2 and 4 times better than confidence estimation based on the maximum per-frame probability.
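The mechanism summarized here is simple to sketch: compute the entropy of each frame's output distribution, normalize by the maximum entropy log(V) so scores lie in [0, 1], and aggregate over the frames of a word. The normalization and min-pooling aggregation below are illustrative assumptions; the paper evaluates several variants.

```python
# Sketch of entropy-based confidence for ASR outputs: per-frame entropy
# of the softmax distribution, normalized so 1 = fully confident, then
# aggregated over each word's frames. The exact normalization and
# aggregation in the paper differ; min-pooling is one plausible choice.
import numpy as np

def frame_confidence(probs):
    """probs: (frames, vocab) softmax outputs -> (frames,) confidences."""
    V = probs.shape[1]
    ent = -(probs * np.log(np.clip(probs, 1e-12, 1))).sum(axis=1)
    return 1.0 - ent / np.log(V)   # normalize by max entropy log(V)

def word_confidence(probs, word_spans, agg=np.min):
    """Aggregate frame confidences over each word's frame span."""
    c = frame_confidence(probs)
    return [float(agg(c[a:b])) for a, b in word_spans]

rng = np.random.default_rng(2)
logits = rng.normal(size=(20, 30))                       # toy frame logits
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
print(word_confidence(probs, [(0, 8), (8, 20)]))         # two toy words
```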
arXiv Detail & Related papers (2022-12-16T20:27:40Z)
- Meta-Learning Hypothesis Spaces for Sequential Decision-making [79.73213540203389]
We propose to meta-learn a kernel from offline data (Meta-KeL).
Under mild conditions, we guarantee that our estimated RKHS yields valid confidence sets.
We also empirically evaluate the effectiveness of our approach on a Bayesian optimization task.
arXiv Detail & Related papers (2022-02-01T17:46:51Z)
- Open Problem: Tight Online Confidence Intervals for RKHS Elements [57.363123214464764]
We formalize the question of online confidence intervals in the RKHS setting and overview the existing results.
It is unclear whether the suboptimal regret bound is a fundamental shortcoming of these algorithms or an artifact of the proof.
arXiv Detail & Related papers (2021-10-28T22:36:20Z)
- Double Coverage with Machine-Learned Advice [100.23487145400833]
We study the fundamental online $k$-server problem in a learning-augmented setting.
We show that, for any $k$, our algorithm achieves an almost optimal consistency-robustness tradeoff.
arXiv Detail & Related papers (2021-03-02T11:04:33Z)
- Off-policy Confidence Sequences [33.749904615295485]
We develop confidence bounds that hold uniformly over time for off-policy evaluation in the contextual bandit setting.
We provide algorithms for computing these confidence sequences that strike a good balance between computational and statistical efficiency.
arXiv Detail & Related papers (2021-02-18T18:40:30Z)
- An evaluation of word-level confidence estimation for end-to-end automatic speech recognition [70.61280174637913]
We investigate confidence estimation for end-to-end automatic speech recognition (ASR).
We provide an extensive benchmark of popular confidence methods on four well-known speech datasets.
Our results suggest a strong baseline can be obtained by scaling the logits by a learnt temperature.
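The temperature-scaling baseline mentioned here is easy to state concretely: divide the logits by a single scalar T fitted on held-out data by minimizing negative log-likelihood, then read confidence off the rescaled softmax. A minimal sketch, with a grid search standing in for a proper 1-D optimizer and synthetic data standing in for real ASR logits:

```python
# Minimal temperature-scaling baseline: fit one scalar T on held-out
# (logits, labels) pairs by minimizing NLL, then use the max rescaled
# softmax probability as the confidence score. The grid search and the
# synthetic data are illustrative stand-ins.
import numpy as np

def nll(logits, labels, T):
    z = logits / T
    z = z - z.max(axis=1, keepdims=True)          # stable log-softmax
    logp = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -logp[np.arange(len(labels)), labels].mean()

def fit_temperature(logits, labels):
    grid = np.linspace(0.5, 5.0, 200)
    return float(grid[np.argmin([nll(logits, labels, T) for T in grid])])

def confidence(logits, T):
    z = logits / T
    p = np.exp(z - z.max(axis=1, keepdims=True))
    return (p / p.sum(axis=1, keepdims=True)).max(axis=1)

rng = np.random.default_rng(3)
val_logits = 3 * rng.normal(size=(500, 10))       # deliberately overconfident
preds = val_logits.argmax(axis=1)
flip = rng.random(500) < 0.2                      # 20% label noise
val_labels = np.where(flip, rng.integers(0, 10, 500), preds)
T = fit_temperature(val_logits, val_labels)
print(f"fitted T = {T:.2f}")
print("confidence of first 3 samples:", confidence(val_logits, T)[:3].round(3))
```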
arXiv Detail & Related papers (2021-01-14T09:51:59Z)
- CoinDICE: Off-Policy Confidence Interval Estimation [107.86876722777535]
We study high-confidence behavior-agnostic off-policy evaluation in reinforcement learning.
We show in a variety of benchmarks that the confidence interval estimates are tighter and more accurate than existing methods.
arXiv Detail & Related papers (2020-10-22T12:39:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.