Testing the Feasibility of Linear Programs with Bandit Feedback
- URL: http://arxiv.org/abs/2406.15648v1
- Date: Fri, 21 Jun 2024 20:56:35 GMT
- Title: Testing the Feasibility of Linear Programs with Bandit Feedback
- Authors: Aditya Gangrade, Aditya Gopalan, Venkatesh Saligrama, Clayton Scott
- Abstract summary: We develop a test based on low-regret algorithms and a nonasymptotic law of iterated logarithms.
We prove that this test is reliable, and adapts to the `signal level,' $\Gamma,$ of any instance.
We complement this by a minimax lower bound of $\Omega(d/\Gamma^2)$ for sample costs of reliable tests.
- Score: 53.40256244941895
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While the recent literature has seen a surge in the study of constrained bandit problems, all existing methods for these begin by assuming the feasibility of the underlying problem. We initiate the study of testing such feasibility assumptions, and in particular address the problem in the linear bandit setting, thus characterising the costs of feasibility testing for an unknown linear program using bandit feedback. Concretely, we test if $\exists x: Ax \ge 0$ for an unknown $A \in \mathbb{R}^{m \times d}$, by playing a sequence of actions $x_t\in \mathbb{R}^d$, and observing $Ax_t + \mathrm{noise}$ in response. By identifying the hypothesis as determining the sign of the value of a minimax game, we construct a novel test based on low-regret algorithms and a nonasymptotic law of iterated logarithms. We prove that this test is reliable, and adapts to the `signal level,' $\Gamma,$ of any instance, with mean sample costs scaling as $\widetilde{O}(d^2/\Gamma^2)$. We complement this by a minimax lower bound of $\Omega(d/\Gamma^2)$ for sample costs of reliable tests, dominating prior asymptotic lower bounds by capturing the dependence on $d$, and thus elucidating a basic insight missing in the extant literature on such problems.
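To make the interaction protocol concrete, here is a minimal toy sketch of the feedback model: play actions $x_t \in \mathbb{R}^d$, observe $Ax_t + \mathrm{noise}$, and declare feasibility from the sign of a plug-in estimate of a game value of the form $\max_{\|x\|\le 1} \min_i \langle a_i, x \rangle$. Everything below is an illustrative assumption rather than the paper's method: the names (`bandit_oracle`, `estimate_A`, `game_value`), the Gaussian noise, the fixed basis-vector design, and the random-search game-value estimate are all stand-ins; the paper's actual test instead couples low-regret play with a nonasymptotic law-of-iterated-logarithms stopping rule.

```python
import numpy as np

# Illustrative assumptions (not from the paper): Gaussian noise, a fixed
# basis-vector design, and a random-search plug-in estimate of the game value.

def bandit_oracle(A, x, noise_std=1.0, rng=None):
    """One round of bandit feedback: observe A @ x plus Gaussian noise on each row."""
    rng = rng or np.random.default_rng()
    return A @ x + noise_std * rng.standard_normal(A.shape[0])

def estimate_A(oracle, d, m, n_rounds):
    """Estimate A by cycling through the basis vectors e_1, ..., e_d and averaging
    the noisy observations column by column (a naive, non-adaptive design)."""
    sums, counts = np.zeros((m, d)), np.zeros(d)
    for t in range(n_rounds):
        i = t % d
        e = np.zeros(d)
        e[i] = 1.0
        sums[:, i] += oracle(e)
        counts[i] += 1
    return sums / counts

def game_value(A_hat, n_samples=5000, rng=None):
    """Crude plug-in estimate of max_{||x||<=1} min_i <a_i, x> by random search
    over the unit sphere; its sign serves as a proxy for (strict) feasibility."""
    rng = rng or np.random.default_rng()
    xs = rng.standard_normal((n_samples, A_hat.shape[1]))
    xs /= np.linalg.norm(xs, axis=1, keepdims=True)
    return float(np.max(np.min(xs @ A_hat.T, axis=1)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = np.array([[1.0, 0.5], [0.2, 1.0]])   # {x : A x >= 0} has strictly feasible points
    oracle = lambda x: bandit_oracle(A, x, rng=rng)
    A_hat = estimate_A(oracle, d=2, m=2, n_rounds=4000)
    val = game_value(A_hat, rng=rng)
    print(f"plug-in game value: {val:.3f} ->", "feasible" if val > 0 else "infeasible")
```

Unlike this fixed-budget sketch, the test described in the abstract is sequential: it stops once its confidence sequence separates the game value from zero, which is how the mean sample cost adapts to the signal level $\Gamma$.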
Related papers
- Detection of Dense Subhypergraphs by Low-Degree Polynomials [72.4451045270967]
Detection of a planted dense subgraph in a random graph is a fundamental statistical and computational problem.
We consider detecting the presence of a planted $G^r(n^{\gamma}, n^{-\alpha})$ subhypergraph in a $G^r(n, n^{-\beta})$ hypergraph.
Our results are already new in the graph case $r=2$, as we consider the subtle log-density regime where hardness based on average-case reductions is not known.
arXiv Detail & Related papers (2023-04-17T10:38:08Z) - Estimating Optimal Policy Value in General Linear Contextual Bandits [50.008542459050155]
In many bandit problems, the maximal reward achievable by a policy is often unknown in advance.
We consider the problem of estimating the optimal policy value in the sublinear data regime before the optimal policy is even learnable.
We present a more practical, computationally efficient algorithm that estimates a problem-dependent upper bound on $V^*$.
arXiv Detail & Related papers (2023-02-19T01:09:24Z) - Nearly Optimal Algorithms for Level Set Estimation [21.83736847203543]
We provide a new approach to the level set estimation problem by relating it to recent adaptive experimental design methods for linear bandits.
We show that our bounds are nearly optimal, namely, our upper bounds match existing lower bounds for threshold linear bandits.
arXiv Detail & Related papers (2021-11-02T17:45:02Z) - Inferring Hidden Structures in Random Graphs [13.031167737538881]
We study the two inference problems of detecting and recovering an isolated community of general structure planted in a random graph.
We derive lower bounds for detecting/recovering the structure $\Gamma_k$ in terms of the parameters $(n,k,q)$, as well as certain properties of $\Gamma_k$, and exhibit computationally optimal algorithms that achieve these lower bounds.
arXiv Detail & Related papers (2021-10-05T09:39:51Z) - Statistical Query Lower Bounds for List-Decodable Linear Regression [55.06171096484622]
We study the problem of list-decodable linear regression, where an adversary can corrupt a majority of the examples.
Our main result is a Statistical Query (SQ) lower bound of $d^{\mathrm{poly}(1/\alpha)}$ for this problem.
arXiv Detail & Related papers (2021-06-17T17:45:21Z) - Thresholded Lasso Bandit [70.17389393497125]
Thresholded Lasso bandit is an algorithm that estimates the vector defining the reward function as well as its sparse support.
We establish non-asymptotic regret upper bounds scaling as $\mathcal{O}(\log d + \sqrt{T})$ in general, and as $\mathcal{O}(\log d + \log T)$ under the so-called margin condition.
arXiv Detail & Related papers (2020-10-22T19:14:37Z) - Stochastic Bandits with Linear Constraints [69.757694218456]
We study a constrained contextual linear bandit setting, where the goal of the agent is to produce a sequence of policies.
We propose an upper-confidence bound algorithm for this problem, called optimistic pessimistic linear bandit (OPLB).
arXiv Detail & Related papers (2020-06-17T22:32:19Z) - Nearly Optimal Regret for Stochastic Linear Bandits with Heavy-Tailed Payoffs [35.988644745703645]
We analyze linear bandits with heavy-tailed payoffs, where the payoffs admit finite $1+\epsilon$ moments.
We propose two novel algorithms which enjoy a sublinear regret bound of $\widetilde{O}(d^{\frac{1}{2}} T^{\frac{1}{1+\epsilon}})$.
arXiv Detail & Related papers (2020-04-28T13:01:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.