Blackwell's Approachability with Approximation Algorithms
- URL: http://arxiv.org/abs/2502.03919v1
- Date: Thu, 06 Feb 2025 09:54:00 GMT
- Title: Blackwell's Approachability with Approximation Algorithms
- Authors: Dan Garber, Mhna Massalha
- Abstract summary: We consider a repeated vector-valued game between a player and an adversary.
In case only the player's or adversary's set is equipped with an approximation algorithm, we give simpler and more efficient algorithms.
- Score: 15.266876140036052
- License:
- Abstract: We revisit Blackwell's celebrated approachability problem, which considers a repeated vector-valued game between a player and an adversary. Motivated by settings in which the action set of the player or adversary (or both) is difficult to optimize over, for instance when it corresponds to the set of all possible solutions to some NP-Hard optimization problem, we ask what the player can guarantee \textit{efficiently} when only having access to these sets via approximation algorithms with ratios $\alpha_{\mathcal{X}} \geq 1$ and $1 \geq \alpha_{\mathcal{Y}} > 0$, respectively. Assuming the player has monotone preferences, in the sense that he does not prefer a vector-valued loss $\ell_1$ over $\ell_2$ if $\ell_2 \leq \ell_1$, we establish that given a Blackwell instance with an approachable target set $S$, the downward closure of the appropriately-scaled set $\alpha_{\mathcal{X}}\alpha_{\mathcal{Y}}^{-1}S$ is \textit{efficiently} approachable with optimal rate. In case only the player's or adversary's set is equipped with an approximation algorithm, we give simpler and more efficient algorithms.
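To make the statement above concrete, here is a short formal sketch of the guarantee as described in the abstract. The loss notation $\ell(x_t, y_t)$, the dimension $d$, and the distance-based form of the rate are illustrative assumptions (the standard way approachability rates are stated), not necessarily the paper's exact notation.

```latex
% Setting (as in the abstract): in round t the player picks x_t from X via an
% alpha_X-approximation algorithm, the adversary picks y_t from Y via an
% alpha_Y-approximation algorithm, and a vector-valued loss \ell(x_t, y_t) \in R^d
% is incurred.

% Downward closure of the target set, matching the monotone-preference assumption
% (any loss vector coordinate-wise below a point of S is at least as acceptable):
\[
  \mathcal{D}(S) \;=\; \bigl\{ u \in \mathbb{R}^d \;:\; \exists\, s \in S,\ u \leq s \bigr\}.
\]

% Main guarantee, stated informally: using only the approximation oracles, the
% player's average loss approaches the downward closure of the scaled target set
% at the optimal approachability rate, O(1/\sqrt{T}) in Euclidean distance:
\[
  \mathrm{dist}\!\left( \frac{1}{T}\sum_{t=1}^{T} \ell(x_t, y_t),\;
    \mathcal{D}\bigl(\alpha_{\mathcal{X}}\,\alpha_{\mathcal{Y}}^{-1}\,S\bigr) \right)
  \;=\; O\!\left(1/\sqrt{T}\right).
\]
```

Since $\alpha_{\mathcal{X}} \geq 1$ and $0 < \alpha_{\mathcal{Y}} \leq 1$, the scaling factor $\alpha_{\mathcal{X}}\alpha_{\mathcal{Y}}^{-1}$ is at least $1$, so the efficiently-approachable target is a scaled and downward-closed version of the original set $S$.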
Related papers
- Corrupted Learning Dynamics in Games [62.73758165845971]
An equilibrium can be computed at a fast rate of $O(\log T)$ when all players follow the optimistic follow-the-regularized-leader (OFTRL) algorithm.
We present corrupted learning dynamics that adaptively find an equilibrium at a rate that depends on the extent to which each player deviates from the strategy suggested by the prescribed algorithm.
arXiv Detail & Related papers (2024-12-10T02:23:44Z) - Dueling Optimization with a Monotone Adversary [35.850072415395395]
We study the problem of dueling optimization with a monotone adversary, which is a generalization of (noiseless) dueling convex optimization.
The goal is to design an online algorithm to find a minimizer $\mathbf{x}^*$ for a function $f\colon X \to \mathbb{R}^d$.
arXiv Detail & Related papers (2023-11-18T23:55:59Z) - Chasing Convex Bodies and Functions with Black-Box Advice [7.895232155155041]
We consider the problem of convex function chasing with black-box advice.
An online decision-maker seeks cost comparable to the advice when it performs well, known as \textit{consistency}.
We propose two novel algorithms that bypass this limitation by exploiting the problem's convexity.
arXiv Detail & Related papers (2022-06-23T15:30:55Z) - Corralling a Larger Band of Bandits: A Case Study on Switching Regret
for Linear Bandits [99.86860277006318]
We consider the problem of combining and learning over a set of adversarial algorithms with the goal of adaptively tracking the best one on the fly.
The CORRAL algorithm of Agarwal et al. achieves this goal with a regret overhead of order $\widetilde{O}(\sqrt{MT})$, where $M$ is the number of base algorithms and $T$ is the time horizon.
Motivated by this issue, we propose a new recipe to corral a larger band of bandit algorithms whose regret overhead has only \emph{logarithmic} dependence on $M$ as long
arXiv Detail & Related papers (2022-02-12T21:55:44Z) - Near-Optimal Learning of Extensive-Form Games with Imperfect Information [54.55092907312749]
We present the first line of algorithms that require only $\widetilde{\mathcal{O}}((XA+YB)/\varepsilon^2)$ episodes of play to find an $\varepsilon$-approximate Nash equilibrium in two-player zero-sum games.
This improves upon the best known sample complexity of $\widetilde{\mathcal{O}}((X^2A+Y^2B)/\varepsilon^2)$ by a factor of $\widetilde{\mathcal{O}}(\max\{X, Y\})$ (see the short calculation after this list).
arXiv Detail & Related papers (2022-02-03T18:18:28Z) - Streaming Complexity of SVMs [110.63976030971106]
We study the space complexity of solving the bias-regularized SVM problem in the streaming model.
We show that, for both problems, for dimensions of $\frac{1}{\lambda\epsilon}$, one can obtain streaming algorithms with space polynomially smaller than $\frac{1}{\lambda\epsilon}$.
arXiv Detail & Related papers (2020-07-07T17:10:00Z) - Near-Optimal Reinforcement Learning with Self-Play [50.29853537456737]
We focus on self-play algorithms which learn the optimal policy by playing against itself without any direct supervision.
We propose an optimistic variant of the \emph{Nash Q-learning} algorithm with sample complexity $\tilde{\mathcal{O}}(SAB)$, and a new \emph{Nash V-learning} algorithm with sample complexity $\tilde{\mathcal{O}}(S(A+B))$.
arXiv Detail & Related papers (2020-06-22T05:00:13Z) - Explicit Best Arm Identification in Linear Bandits Using No-Regret
Learners [17.224805430291177]
We study the problem of best arm identification in linearly parameterised multi-armed bandits.
We propose an explicitly implementable and provably order-optimal sample-complexity algorithm to solve this problem.
arXiv Detail & Related papers (2020-06-13T05:00:01Z) - Problem-Complexity Adaptive Model Selection for Stochastic Linear
Bandits [20.207989166682832]
We consider the problem of model selection for two popular linear bandit settings.
In the first setting, we consider the $K$-armed mixture bandits, where the mean reward of arm $i \in [K]$ is $\mu_i + \langle \alpha_{i,t}, \theta^* \rangle$.
We show that ALB achieves regret scaling of $O(\|\theta^*\|\sqrt{T})$, where $\|\theta^*\|$ is a priori unknown.
arXiv Detail & Related papers (2020-06-04T02:19:00Z) - Maximizing Determinants under Matroid Constraints [69.25768526213689]
We study the problem of finding a basis $S$ of a matroid $M$ such that $\det\left(\sum_{i \in S} v_i v_i^\top\right)$ is maximized.
This problem appears in a diverse set of areas such as experimental design, fair allocation of goods, network design, and machine learning.
arXiv Detail & Related papers (2020-04-16T19:16:38Z)
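For the extensive-form games entry above, the claimed $\widetilde{\mathcal{O}}(\max\{X, Y\})$ improvement factor follows from a one-line comparison of the two sample-complexity bounds; this sketch uses only the quantities named in that summary.

```latex
% Ratio of the previous bound to the new one, per episode-count numerators:
\[
  \frac{X^2 A + Y^2 B}{X A + Y B}
  \;\le\; \frac{\max\{X, Y\}\,(X A + Y B)}{X A + Y B}
  \;=\; \max\{X, Y\}.
\]
```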