Combinatorial Bandits under Strategic Manipulations
- URL: http://arxiv.org/abs/2102.12722v1
- Date: Thu, 25 Feb 2021 07:57:27 GMT
- Title: Combinatorial Bandits under Strategic Manipulations
- Authors: Jing Dong, Ke Li, Shuai Li, Baoxiang Wang
- Abstract summary: We study the problem of combinatorial multi-armed bandits (CMAB) under strategic manipulations of rewards, where each arm can modify the emitted reward signals for its own interest.
Our setting captures a more realistic model of adaptive arms under relaxed assumptions compared to adversarial corruptions and adversarial attacks.
- Score: 25.882801456026584
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study the problem of combinatorial multi-armed bandits (CMAB) under
strategic manipulations of rewards, where each arm can modify the emitted
reward signals for its own interest. Our setting elaborates a more realistic
model of adaptive arms that imposes relaxed assumptions compared to adversarial
corruptions and adversarial attacks. Algorithms designed under strategic arms
gain robustness in real applications while avoiding being overcautious and
hampering the performance. We bridge the gap between strategic manipulations
and adversarial attacks by investigating the optimal colluding strategy among
arms under the MAB problem. We then propose a strategic variant of the
combinatorial UCB algorithm, which has a regret of at most $O(m\log T + m
B_{max})$ under strategic manipulations, where $T$ is the time horizon, $m$ is
the number of arms, and $B_{max}$ is the maximum budget. We further provide
lower bounds on the strategic budget required for attackers to incur a certain regret of
the bandit algorithm. Extensive experiments corroborate our theoretical
findings on robustness and regret bounds, in a variety of regimes of
manipulation budgets.
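To make the flavor of the algorithm concrete, the sketch below shows a budget-aware combinatorial UCB loop in the spirit of the strategic variant described above. It is a minimal sketch, not the paper's exact method: the `budget_bound / pulls` padding term, the UCB constants, and the `oracle`/`pull_super_arm` interfaces are illustrative assumptions.

```python
import math
import numpy as np

def strategic_cucb(oracle, pull_super_arm, n_arms, horizon, budget_bound):
    """Sketch: combinatorial UCB whose index is inflated to absorb a
    bounded amount of strategic reward inflation per arm.

    oracle(indices) -> list of arm ids (a super arm maximizing the indices)
    pull_super_arm(super_arm) -> list of observed, possibly manipulated,
        rewards, one per arm in the super arm.
    """
    pulls = np.zeros(n_arms)       # number of times each arm was played
    reward_sum = np.zeros(n_arms)  # sum of observed rewards per arm

    for t in range(1, horizon + 1):
        safe_pulls = np.maximum(pulls, 1)
        means = reward_sum / safe_pulls
        # Standard UCB exploration bonus ...
        bonus = np.sqrt(3 * math.log(t) / safe_pulls)
        # ... plus a padding term that dominates the average per-arm
        # manipulation when each arm's total inflation is at most
        # budget_bound (illustrative choice, not the paper's exact index).
        padding = budget_bound / safe_pulls
        index = np.where(pulls == 0, np.inf, means + bonus + padding)

        super_arm = oracle(index)             # offline maximization oracle
        rewards = pull_super_arm(super_arm)   # observe (manipulated) rewards
        for arm, r in zip(super_arm, rewards):
            pulls[arm] += 1
            reward_sum[arm] += r
```

The intuition is that the padding shrinks at rate $1/N_i(t)$, so a bounded budget can distort an arm's index only for a bounded number of pulls, which is consistent with the additive $m B_{max}$ term in the regret bound above.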
Related papers
- Combinatorial Stochastic-Greedy Bandit [79.1700188160944]
We propose a novel Stochastic-Greedy Bandit (SGB) algorithm for combinatorial multi-armed bandit problems where no feedback other than the joint reward of the selected set of $n$ arms at each time $t \in [T]$ is observed.
SGB adopts an optimized explore-then-commit approach and is specifically designed for scenarios with a large set of base arms.
arXiv Detail & Related papers (2023-12-13T11:08:25Z)
- Bandits Meet Mechanism Design to Combat Clickbait in Online Recommendation [50.469872635246176]
We study a strategic variant of the multi-armed bandit problem, which we coin the strategic click-bandit.
This model is motivated by applications in online recommendation where the choice of recommended items depends on both the click-through rates and the post-click rewards.
arXiv Detail & Related papers (2023-11-27T09:19:01Z)
- Robust Lipschitz Bandits to Adversarial Corruptions [61.85150061213987]
The Lipschitz bandit is a variant of the bandit problem that deals with a continuous arm set defined on a metric space.
In this paper, we introduce a new problem of Lipschitz bandits in the presence of adversarial corruptions.
Our work presents the first line of robust Lipschitz bandit algorithms that can achieve sub-linear regret under both types of adversary.
arXiv Detail & Related papers (2023-05-29T18:16:59Z)
- Contextual Combinatorial Bandits with Probabilistically Triggered Arms [55.9237004478033]
We study contextual combinatorial bandits with probabilistically triggered arms (C$^2$MAB-T) under a variety of smoothness conditions.
Under the triggering probability modulated (TPM) condition, we devise the C$^2$-UCB-T algorithm and derive a regret bound of $\tilde{O}(d\sqrt{T})$.
arXiv Detail & Related papers (2023-03-30T02:51:00Z)
- Multi-armed Bandit Algorithm against Strategic Replication [5.235979896921492]
We consider a multi-armed bandit problem in which each agent registers a set of arms and receives a reward when one of its arms is selected.
An agent might strategically submit replicated arms, which can bring more reward by abusing the bandit algorithm's exploration-exploitation balance.
We propose a bandit algorithm that disincentivizes replication while still achieving a small cumulative regret.
arXiv Detail & Related papers (2021-10-23T07:38:44Z)
- Upper Confidence Bounds for Combining Stochastic Bandits [52.10197476419621]
We provide a simple method to combine bandit algorithms.
Our approach is based on a "meta-UCB" procedure that treats each of $N$ individual bandit algorithms as arms in a higher-level $N$-armed bandit problem (a minimal sketch of this idea appears after this list).
arXiv Detail & Related papers (2020-12-24T05:36:29Z)
- The Unreasonable Effectiveness of Greedy Algorithms in Multi-Armed Bandit with Many Arms [10.662105162882526]
We investigate a $k$-armed bandit problem in the many-armed regime.
Our findings suggest a new form of free exploration beneficial to greedy algorithms in the many-armed context.
arXiv Detail & Related papers (2020-02-24T08:59:34Z)
- Action-Manipulation Attacks Against Stochastic Bandits: Attacks and Defense [45.408568528354216]
We introduce a new class of attacks named action-manipulation attacks, in which an adversary can change the action signal selected by the user.
To defend against this class of attacks, we propose a novel algorithm that is robust to action-manipulation attacks.
arXiv Detail & Related papers (2020-02-19T04:09:15Z)
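As referenced in the entry on "Upper Confidence Bounds for Combining Stochastic Bandits" above, here is a minimal sketch of the meta-UCB idea: run $N$ base bandit algorithms as the arms of a top-level UCB1. The `select`/`update` interface of the base algorithms is a hypothetical simplification, not the paper's actual construction.

```python
import math
import numpy as np

class MetaUCB:
    """Sketch: treat N base bandit algorithms as arms of a top-level UCB1.

    Each base algorithm is assumed to expose select() -> action and
    update(action, reward); this interface is an illustrative assumption.
    """

    def __init__(self, base_algorithms):
        self.bases = base_algorithms
        self.pulls = np.zeros(len(base_algorithms))
        self.reward_sum = np.zeros(len(base_algorithms))
        self.t = 0

    def step(self, environment):
        self.t += 1
        # UCB1 index over base algorithms; unplayed bases are tried first.
        safe_pulls = np.maximum(self.pulls, 1)
        means = self.reward_sum / safe_pulls
        bonus = np.sqrt(2 * math.log(self.t) / safe_pulls)
        index = np.where(self.pulls == 0, np.inf, means + bonus)
        i = int(np.argmax(index))

        # Delegate the actual arm choice to the selected base algorithm,
        # then feed the observed reward back to it and to the meta level.
        action = self.bases[i].select()
        reward = environment(action)
        self.bases[i].update(action, reward)
        self.pulls[i] += 1
        self.reward_sum[i] += reward
        return action, reward
```

The design point is that the meta level needs no knowledge of the base algorithms' internals; it only tracks each one's realized reward, which is what lets the combined procedure track whichever base algorithm performs best.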