LinearAPT: An Adaptive Algorithm for the Fixed-Budget Thresholding
Linear Bandit Problem
- URL: http://arxiv.org/abs/2403.06230v1
- Date: Sun, 10 Mar 2024 15:01:50 GMT
- Title: LinearAPT: An Adaptive Algorithm for the Fixed-Budget Thresholding
Linear Bandit Problem
- Authors: Yun-Ang Wu, Yun-Da Tsai, Shou-De Lin
- Abstract summary: We present LinearAPT, a novel algorithm designed for the fixed budget setting of the Thresholding Linear Bandit (TLB) problem.
Our contributions highlight the adaptability, simplicity, and computational efficiency of LinearAPT, making it a valuable addition to the toolkit for addressing complex sequential decision-making challenges.
- Score: 4.666048091337632
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this study, we delve into the Thresholding Linear Bandit (TLB) problem, a
nuanced domain within stochastic Multi-Armed Bandit (MAB) problems, focusing on
maximizing decision accuracy against a linearly defined threshold under
resource constraints. We present LinearAPT, a novel algorithm designed for the
fixed budget setting of TLB, providing an efficient solution to optimize
sequential decision-making. This algorithm not only offers a theoretical upper
bound for estimated loss but also showcases robust performance on both
synthetic and real-world datasets. Our contributions highlight the
adaptability, simplicity, and computational efficiency of LinearAPT, making it
a valuable addition to the toolkit for addressing complex sequential
decision-making challenges.
Related papers
- A Double Tracking Method for Optimization with Decentralized Generalized Orthogonality Constraints [4.6796315389639815]
Decentralized optimization problems are unsolvable in the presence of distributed constraints.
We introduce a novel algorithm that tracks the gradient of the objective function and the Jacobian of the constraint mapping simultaneously.
We present numerical results on both synthetic and real-world datasets.
arXiv Detail & Related papers (2024-09-08T06:57:35Z) - Indexed Minimum Empirical Divergence-Based Algorithms for Linear Bandits [55.938644481736446]
Indexed Minimum Empirical Divergence (IMED) is a highly effective approach to the multi-armed bandit problem.
It has been observed to empirically outperform UCB-based algorithms and Thompson Sampling.
We present novel linear versions of the IMED algorithm, which we call the family of LinIMED algorithms.
arXiv Detail & Related papers (2024-05-24T04:11:58Z) - Constrained Bi-Level Optimization: Proximal Lagrangian Value function
Approach and Hessian-free Algorithm [8.479947546216131]
We develop a Hessian-free gradient-based algorithm-termed proximal Lagrangian Value function-based Hessian-free Bi-level Algorithm (LV-HBA)
LV-HBA is especially well-suited for machine learning applications.
arXiv Detail & Related papers (2024-01-29T13:50:56Z) - High-dimensional Contextual Bandit Problem without Sparsity [8.782204980889077]
We propose an explore-then-commit (EtC) algorithm to address this problem and examine its performance.
We derive the optimal rate of the ETC algorithm in terms of $T$ and show that this rate can be achieved by balancing exploration and exploitation.
We introduce an adaptive explore-then-commit (AEtC) algorithm that adaptively finds the optimal balance.
arXiv Detail & Related papers (2023-06-19T15:29:32Z) - Fully Stochastic Trust-Region Sequential Quadratic Programming for
Equality-Constrained Optimization Problems [62.83783246648714]
We propose a sequential quadratic programming algorithm (TR-StoSQP) to solve nonlinear optimization problems with objectives and deterministic equality constraints.
The algorithm adaptively selects the trust-region radius and, compared to the existing line-search StoSQP schemes, allows us to utilize indefinite Hessian matrices.
arXiv Detail & Related papers (2022-11-29T05:52:17Z) - Learning to Optimize with Stochastic Dominance Constraints [103.26714928625582]
In this paper, we develop a simple yet efficient approach for the problem of comparing uncertain quantities.
We recast inner optimization in the Lagrangian as a learning problem for surrogate approximation, which bypasses apparent intractability.
The proposed light-SD demonstrates superior performance on several representative problems ranging from finance to supply chain management.
arXiv Detail & Related papers (2022-11-14T21:54:31Z) - Safe Online Bid Optimization with Return-On-Investment and Budget
Constraints subject to Uncertainty [87.81197574939355]
We study the nature of both the optimization and learning problems.
We provide an algorithm, namely GCB, guaranteeing sublinear regret at the cost of a potentially linear number of constraints violations.
More interestingly, we provide an algorithm, namely GCB_safe(psi,phi), guaranteeing both sublinear pseudo-regret and safety w.h.p. at the cost of accepting tolerances psi and phi.
arXiv Detail & Related papers (2022-01-18T17:24:20Z) - Outlier-Robust Sparse Estimation via Non-Convex Optimization [73.18654719887205]
We explore the connection between high-dimensional statistics and non-robust optimization in the presence of sparsity constraints.
We develop novel and simple optimization formulations for these problems.
As a corollary, we obtain that any first-order method that efficiently converges to station yields an efficient algorithm for these tasks.
arXiv Detail & Related papers (2021-09-23T17:38:24Z) - An Asymptotically Optimal Primal-Dual Incremental Algorithm for
Contextual Linear Bandits [129.1029690825929]
We introduce a novel algorithm improving over the state-of-the-art along multiple dimensions.
We establish minimax optimality for any learning horizon in the special case of non-contextual linear bandits.
arXiv Detail & Related papers (2020-10-23T09:12:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.