Related papers: LinearAPT: An Adaptive Algorithm for the Fixed-Budget Thresholding Linear Bandit Problem

LinearAPT: An Adaptive Algorithm for the Fixed-Budget Thresholding Linear Bandit Problem

URL: http://arxiv.org/abs/2403.06230v1
Date: Sun, 10 Mar 2024 15:01:50 GMT
Title: LinearAPT: An Adaptive Algorithm for the Fixed-Budget Thresholding Linear Bandit Problem
Authors: Yun-Ang Wu, Yun-Da Tsai, Shou-De Lin
Abstract summary: We present LinearAPT, a novel algorithm designed for the fixed budget setting of the Thresholding Linear Bandit (TLB) problem. Our contributions highlight the adaptability, simplicity, and computational efficiency of LinearAPT, making it a valuable addition to the toolkit for addressing complex sequential decision-making challenges.
Score: 4.666048091337632
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In this study, we delve into the Thresholding Linear Bandit (TLB) problem, a nuanced domain within stochastic Multi-Armed Bandit (MAB) problems, focusing on maximizing decision accuracy against a linearly defined threshold under resource constraints. We present LinearAPT, a novel algorithm designed for the fixed budget setting of TLB, providing an efficient solution to optimize sequential decision-making. This algorithm not only offers a theoretical upper bound for estimated loss but also showcases robust performance on both synthetic and real-world datasets. Our contributions highlight the adaptability, simplicity, and computational efficiency of LinearAPT, making it a valuable addition to the toolkit for addressing complex sequential decision-making challenges.

Related papers

Learning to optimize with guarantees: a complete characterization of linearly convergent algorithms [1.4747234049753448]
In high-stakes engineering applications, optimization algorithms must come with provable provablecase guarantees over a mathematically defined class of problems.<n>We describe the class of algorithms that achieve linear convergence for classes of nonsmooth composite optimization problems.
arXiv Detail & Related papers (2025-08-01T16:56:42Z)
Scalable First-order Method for Certifying Optimal k-Sparse GLMs [9.613635592922174]
We propose a first-order proximal gradient algorithm to solve the perspective relaxation of the problem within a BnB framework. We show that our approach significantly accelerates dual bound computations and is highly effective in providing optimality certificates for large-scale problems.
arXiv Detail & Related papers (2025-02-13T17:14:18Z)
Safe and Efficient Online Convex Optimization with Linear Budget Constraints and Partial Feedback [3.5554907645160605]
This paper studies online convex optimization with unknown linear budget constraints. We propose a safe and efficient Lyapunov-optimization algorithm (SELO) that can achieve an $O(sqrtT)$ regret and zero cumulative constraint violation.
arXiv Detail & Related papers (2024-12-05T08:58:41Z)
A Double Tracking Method for Optimization with Decentralized Generalized Orthogonality Constraints [4.6796315389639815]
Decentralized optimization problems are unsolvable in the presence of distributed constraints. We introduce a novel algorithm that tracks the gradient of the objective function and the Jacobian of the constraint mapping simultaneously. We present numerical results on both synthetic and real-world datasets.
arXiv Detail & Related papers (2024-09-08T06:57:35Z)
Indexed Minimum Empirical Divergence-Based Algorithms for Linear Bandits [55.938644481736446]
Indexed Minimum Empirical Divergence (IMED) is a highly effective approach to the multi-armed bandit problem. It has been observed to empirically outperform UCB-based algorithms and Thompson Sampling. We present novel linear versions of the IMED algorithm, which we call the family of LinIMED algorithms.
arXiv Detail & Related papers (2024-05-24T04:11:58Z)
Constrained Bi-Level Optimization: Proximal Lagrangian Value function Approach and Hessian-free Algorithm [8.479947546216131]
We develop a Hessian-free gradient-based algorithm-termed proximal Lagrangian Value function-based Hessian-free Bi-level Algorithm (LV-HBA) LV-HBA is especially well-suited for machine learning applications.
arXiv Detail & Related papers (2024-01-29T13:50:56Z)
High-dimensional Contextual Bandit Problem without Sparsity [8.782204980889077]
We propose an explore-then-commit (EtC) algorithm to address this problem and examine its performance. We derive the optimal rate of the ETC algorithm in terms of $T$ and show that this rate can be achieved by balancing exploration and exploitation. We introduce an adaptive explore-then-commit (AEtC) algorithm that adaptively finds the optimal balance.
arXiv Detail & Related papers (2023-06-19T15:29:32Z)
Fully Stochastic Trust-Region Sequential Quadratic Programming for Equality-Constrained Optimization Problems [62.83783246648714]
We propose a sequential quadratic programming algorithm (TR-StoSQP) to solve nonlinear optimization problems with objectives and deterministic equality constraints. The algorithm adaptively selects the trust-region radius and, compared to the existing line-search StoSQP schemes, allows us to utilize indefinite Hessian matrices.
arXiv Detail & Related papers (2022-11-29T05:52:17Z)
Learning to Optimize with Stochastic Dominance Constraints [103.26714928625582]
In this paper, we develop a simple yet efficient approach for the problem of comparing uncertain quantities. We recast inner optimization in the Lagrangian as a learning problem for surrogate approximation, which bypasses apparent intractability. The proposed light-SD demonstrates superior performance on several representative problems ranging from finance to supply chain management.
arXiv Detail & Related papers (2022-11-14T21:54:31Z)
Safe Online Bid Optimization with Return-On-Investment and Budget Constraints subject to Uncertainty [87.81197574939355]
We study the nature of both the optimization and learning problems. We provide an algorithm, namely GCB, guaranteeing sublinear regret at the cost of a potentially linear number of constraints violations. More interestingly, we provide an algorithm, namely GCB_safe(psi,phi), guaranteeing both sublinear pseudo-regret and safety w.h.p. at the cost of accepting tolerances psi and phi.
arXiv Detail & Related papers (2022-01-18T17:24:20Z)
Outlier-Robust Sparse Estimation via Non-Convex Optimization [73.18654719887205]
We explore the connection between high-dimensional statistics and non-robust optimization in the presence of sparsity constraints. We develop novel and simple optimization formulations for these problems. As a corollary, we obtain that any first-order method that efficiently converges to station yields an efficient algorithm for these tasks.
arXiv Detail & Related papers (2021-09-23T17:38:24Z)
An Asymptotically Optimal Primal-Dual Incremental Algorithm for Contextual Linear Bandits [129.1029690825929]
We introduce a novel algorithm improving over the state-of-the-art along multiple dimensions. We establish minimax optimality for any learning horizon in the special case of non-contextual linear bandits.
arXiv Detail & Related papers (2020-10-23T09:12:47Z)

This list is automatically generated from the titles and abstracts of the papers in this site.