BOF-UCB: A Bayesian-Optimistic Frequentist Algorithm for Non-Stationary Contextual Bandits
- URL: http://arxiv.org/abs/2307.03587v2
- Date: Wed, 19 Jul 2023 13:23:29 GMT
- Title: BOF-UCB: A Bayesian-Optimistic Frequentist Algorithm for Non-Stationary Contextual Bandits
- Authors: Nicklas Werge, Abdullah Akgül, Melih Kandemir
- Abstract summary: We propose a novel Bayesian-Optimistic Frequentist Upper Confidence Bound (BOF-UCB) algorithm for contextual linear bandits in non-stationary environments.
This unique combination of Bayesian and frequentist principles enhances adaptability and performance in dynamic settings.
- Score: 16.59103967569845
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a novel Bayesian-Optimistic Frequentist Upper Confidence Bound
(BOF-UCB) algorithm for stochastic contextual linear bandits in non-stationary
environments. This unique combination of Bayesian and frequentist principles
enhances adaptability and performance in dynamic settings. The BOF-UCB
algorithm utilizes sequential Bayesian updates to infer the posterior
distribution of the unknown regression parameter, and subsequently employs a
frequentist approach to compute the Upper Confidence Bound (UCB) by maximizing
the expected reward over the posterior distribution. We provide theoretical
guarantees of BOF-UCB's performance and demonstrate its effectiveness in
balancing exploration and exploitation on synthetic datasets and classical
control tasks in a reinforcement learning setting. Our results show that
BOF-UCB outperforms existing methods, making it a promising solution for
sequential decision-making in non-stationary environments.
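The abstract's recipe (a conjugate Bayesian posterior over the regression parameter, followed by a frequentist optimistic index computed from that posterior) can be sketched as follows. This is a minimal illustration under assumed choices, not the authors' reference implementation: the Gaussian prior and noise model, the fixed confidence radius `beta`, and the exponential forgetting used to handle non-stationarity are all simplifying assumptions.

```python
import numpy as np

class BOFUCBSketch:
    """Illustrative Bayesian-linear bandit with a frequentist UCB index.

    A minimal sketch of the idea in the abstract, not the paper's
    reference implementation: the conjugate Gaussian model, the constant
    confidence radius `beta`, and the forgetting factor are assumptions.
    """

    def __init__(self, dim, noise_var=1.0, prior_var=1.0, beta=2.0, discount=0.99):
        self.noise_var = noise_var
        self.beta = beta          # confidence radius (assumed constant here)
        self.discount = discount  # forgetting factor for non-stationarity
        self.precision = np.eye(dim) / prior_var  # posterior precision matrix
        self.b = np.zeros(dim)                    # accumulated x_t * r_t / noise_var

    def posterior(self):
        # Conjugate Gaussian posterior over the regression parameter.
        cov = np.linalg.inv(self.precision)
        return cov @ self.b, cov  # (mean, covariance)

    def select(self, contexts):
        # contexts: (n_arms, dim). Optimistic index = posterior-mean reward
        # plus an exploration bonus from the posterior covariance.
        mean, cov = self.posterior()
        exploit = contexts @ mean
        explore = self.beta * np.sqrt(np.einsum("ad,dk,ak->a", contexts, cov, contexts))
        return int(np.argmax(exploit + explore))

    def update(self, x, reward):
        # Discounted sequential Bayesian update of the Gaussian posterior.
        self.precision = self.discount * self.precision + np.outer(x, x) / self.noise_var
        self.b = self.discount * self.b + x * reward / self.noise_var
```

Calling `select` and then `update` with the observed reward reproduces the explore/exploit loop described above; the actual algorithm derives its confidence bound from a frequentist analysis rather than fixing `beta` by hand.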
Related papers
- Statistical Inference for Temporal Difference Learning with Linear Function Approximation [62.69448336714418]
Temporal Difference (TD) learning, arguably the most widely used algorithm for policy evaluation, serves as a natural framework for statistical inference.
In this paper, we study the consistency properties of TD learning with Polyak-Ruppert averaging and linear function approximation, and obtain three significant improvements over existing results.
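For orientation, the estimator analyzed there is standard TD(0) with linear function approximation, reporting the Polyak-Ruppert average of the iterates. A minimal sketch, with an assumed `env_step` interface for sampling transitions under the evaluated policy:

```python
import numpy as np

def td0_polyak(env_step, phi, theta0, gamma=0.99, alpha=0.05, n_steps=10_000, s0=0):
    """TD(0) with linear function approximation, V(s) = phi(s) @ theta,
    returning the Polyak-Ruppert average of the iterates.

    `env_step(s) -> (reward, next_state)` and the initial state `s0` are
    assumed interfaces for this sketch.
    """
    theta = np.array(theta0, dtype=float)
    theta_bar = theta.copy()  # running average of the iterates
    s = s0
    for t in range(1, n_steps + 1):
        r, s_next = env_step(s)
        # TD error: r + gamma * V(s') - V(s)
        delta = r + gamma * phi(s_next) @ theta - phi(s) @ theta
        theta += alpha * delta * phi(s)       # semi-gradient TD(0) step
        theta_bar += (theta - theta_bar) / t  # Polyak-Ruppert averaging
        s = s_next
    return theta_bar
```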
arXiv Detail & Related papers (2024-10-21T15:34:44Z)
- In-Context Freeze-Thaw Bayesian Optimization for Hyperparameter Optimization [35.74766507227412]
We propose FT-PFN, a novel surrogate for Freeze-thaw style BO.
FT-PFN is a prior-data fitted network (PFN) that leverages the transformers' in-context learning ability.
We show that, when combined with our novel acquisition mechanism (I-random), the resulting in-context freeze-thaw BO method (ifBO) yields new state-of-the-art performance.
arXiv Detail & Related papers (2024-04-25T17:40:52Z)
- LABCAT: Locally adaptive Bayesian optimization using principal-component-aligned trust regions [0.0]
We propose the LABCAT algorithm, which extends trust-region-based BO.
We show that the algorithm outperforms several state-of-the-art BO and other black-box optimization algorithms.
arXiv Detail & Related papers (2023-11-19T13:56:24Z)
- Model-based Causal Bayesian Optimization [74.78486244786083]
We introduce the first algorithm for Causal Bayesian Optimization with Multiplicative Weights (CBO-MW).
We derive regret bounds for CBO-MW that naturally depend on graph-related quantities.
Our experiments include a realistic demonstration of how CBO-MW can be used to learn users' demand patterns in a shared mobility system.
arXiv Detail & Related papers (2023-07-31T13:02:36Z)
- Provably Efficient UCB-type Algorithms For Learning Predictive State Representations [55.00359893021461]
The sequential decision-making problem is statistically learnable if it admits a low-rank structure modeled by predictive state representations (PSRs).
This paper proposes the first known UCB-type approach for PSRs, featuring a novel bonus term that upper bounds the total variation distance between the estimated and true models.
In contrast to existing approaches for PSRs, our UCB-type algorithms enjoy computational tractability, last-iterate guaranteed near-optimal policy, and guaranteed model accuracy.
arXiv Detail & Related papers (2023-07-01T18:35:21Z)
- Model-based Causal Bayesian Optimization [78.120734120667]
We propose model-based causal Bayesian optimization (MCBO).
MCBO learns a full system model instead of only modeling intervention-reward pairs.
Unlike in standard Bayesian optimization, our acquisition function cannot be evaluated in closed form.
arXiv Detail & Related papers (2022-11-18T14:28:21Z)
- Neighbor Regularized Bayesian Optimization for Hyperparameter Optimization [12.544312247050236]
We propose a novel BO algorithm called Neighbor Regularized Bayesian Optimization (NRBO) to address noisy observations.
We first propose a neighbor-based regularization to smooth each sample observation, which reduces observation noise efficiently without any extra training cost (see the sketch below).
We conduct experiments on the Bayesmark benchmark and important computer vision benchmarks such as ImageNet and COCO.
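The neighbor-based regularization can be pictured as replacing each raw observation with a distance-weighted average over nearby points before fitting the surrogate. The Gaussian kernel and bandwidth below are illustrative assumptions, not NRBO's exact scheme:

```python
import numpy as np

def neighbor_smooth(X, y, bandwidth=0.5):
    """Distance-weighted smoothing of noisy observations (illustrative,
    not NRBO's exact formulation). Each y[i] is replaced by a
    kernel-weighted average over its neighbors, reducing observation
    noise with no extra model training.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)  # pairwise sq. distances
    w = np.exp(-d2 / (2.0 * bandwidth**2))                      # Gaussian kernel weights
    return (w @ y) / w.sum(axis=1)                              # smoothed observations
```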
arXiv Detail & Related papers (2022-10-07T12:08:01Z)
- Batch Bayesian optimisation via density-ratio estimation with guarantees [26.052368583196426]
We present a theoretical analysis of BORE's regret and an extension of the algorithm with improved uncertainty estimates.
We also show that BORE can be naturally extended to a batch optimisation setting by recasting the problem as approximate Bayesian inference.
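For context, the core of BORE (prior to this paper's batch extension) reduces acquisition maximization to density-ratio estimation: label the best gamma-fraction of observed points as positives, fit a probabilistic classifier, and use its predicted probability as the acquisition value. A sketch, with the classifier choice as an assumption:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def bore_acquisition(X, y, candidates, gamma=0.25):
    """BORE-style acquisition via density-ratio estimation (sketch).

    Points in the best gamma-quantile of observed losses are labeled 1,
    the rest 0; the classifier's predicted class-1 probability then serves
    as the acquisition value. The random forest is an assumed choice, and
    the sketch assumes both classes are present after thresholding.
    """
    tau = np.quantile(y, gamma)     # threshold: best gamma fraction (minimization)
    z = (y <= tau).astype(int)      # binary labels for density-ratio estimation
    clf = RandomForestClassifier(n_estimators=100).fit(X, z)
    scores = clf.predict_proba(candidates)[:, 1]
    return candidates[int(np.argmax(scores))]  # next point to evaluate
```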
arXiv Detail & Related papers (2022-09-22T00:42:18Z)
- Bayesian Bellman Operators [55.959376449737405]
We introduce a novel perspective on Bayesian reinforcement learning (RL).
Our framework is motivated by the insight that when bootstrapping is introduced, model-free approaches actually infer a posterior over Bellman operators, not value functions.
arXiv Detail & Related papers (2021-06-09T12:20:46Z)
- Improved Branch and Bound for Neural Network Verification via Lagrangian Decomposition [161.09660864941603]
We improve the scalability of Branch and Bound (BaB) algorithms for formally proving input-output properties of neural networks.
We present a novel activation-based branching strategy and a BaB framework, named Branch and Dual Network Bound (BaDNB).
BaDNB outperforms previous complete verification systems by a large margin, cutting average verification times by factors up to 50 on adversarial properties.
arXiv Detail & Related papers (2021-04-14T09:22:42Z)
- Inferential Induction: A Novel Framework for Bayesian Reinforcement Learning [6.16852156844376]
We describe a novel framework, Inferential Induction, for correctly inferring value function distributions from data.
We experimentally demonstrate that the proposed algorithm is competitive with respect to the state of the art.
arXiv Detail & Related papers (2020-02-08T06:19:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.