Collaborative Min-Max Regret in Grouped Multi-Armed Bandits
- URL: http://arxiv.org/abs/2506.10313v1
- Date: Thu, 12 Jun 2025 02:56:35 GMT
- Title: Collaborative Min-Max Regret in Grouped Multi-Armed Bandits
- Authors: Moïse Blanchard, Vineet Goyal,
- Abstract summary: We study the impact of sharing exploration in multi-armed bandits in a grouped setting.<n>We introduce an algorithm that dynamically coordinates exploration across groups.
- Score: 6.675805308519987
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the impact of sharing exploration in multi-armed bandits in a grouped setting where a set of groups have overlapping feasible action sets [Baek and Farias '24]. In this grouped bandit setting, groups share reward observations, and the objective is to minimize the collaborative regret, defined as the maximum regret across groups. This naturally captures applications in which one aims to balance the exploration burden between groups or populations -- it is known that standard algorithms can lead to significantly imbalanced exploration cost between groups. We address this problem by introducing an algorithm Col-UCB that dynamically coordinates exploration across groups. We show that Col-UCB achieves both optimal minimax and instance-dependent collaborative regret up to logarithmic factors. These bounds are adaptive to the structure of shared action sets between groups, providing insights into when collaboration yields significant benefits over each group learning their best action independently.
Related papers
- Reinforcement learning with combinatorial actions for coupled restless bandits [62.89013331120493]
We propose SEQUOIA, an RL algorithm that directly optimize for long-term reward over the feasible action space.<n>We empirically validate SEQUOIA on four novel restless bandit problems with constraints: multiple interventions, path constraints, bipartite matching, and capacity constraints.
arXiv Detail & Related papers (2025-03-01T21:25:21Z) - Collaborative Multi-Agent Heterogeneous Multi-Armed Bandits [24.590517939890788]
We study a new collaborative setting, consisting of $N$ agents such that each agent is learning one of $M$ multi-armed bandits.
We develop algorithms which facilitate collaboration between the agents under two scenarios.
arXiv Detail & Related papers (2023-05-30T06:35:49Z) - Beyond Submodularity: A Unified Framework of Randomized Set Selection
with Group Fairness Constraints [19.29174615532181]
We introduce a unified framework for randomized subset selection that incorporates group fairness constraints.
Our problem involves a global utility function and a set of group utility functions for each group.
Our aim is to generate a distribution across feasible subsets, specifying the selection probability of each feasible set.
arXiv Detail & Related papers (2023-04-13T15:02:37Z) - AGRO: Adversarial Discovery of Error-prone groups for Robust
Optimization [109.91265884632239]
Group distributionally robust optimization (G-DRO) can minimize the worst-case loss over a set of pre-defined groups over training data.
We propose AGRO -- Adversarial Group discovery for Distributionally Robust Optimization.
AGRO results in 8% higher model performance on average on known worst-groups, compared to prior group discovery approaches.
arXiv Detail & Related papers (2022-12-02T00:57:03Z) - Towards Group Robustness in the presence of Partial Group Labels [61.33713547766866]
spurious correlations between input samples and the target labels wrongly direct the neural network predictions.
We propose an algorithm that optimize for the worst-off group assignments from a constraint set.
We show improvements in the minority group's performance while preserving overall aggregate accuracy across groups.
arXiv Detail & Related papers (2022-01-10T22:04:48Z) - Max-Min Grouped Bandits [48.62520520818357]
We introduce a multi-armed bandit problem termed max-min grouped bandits.
The goal is to find a group whose worst arm has the highest mean reward.
This problem is of interest to applications such as recommendation systems.
arXiv Detail & Related papers (2021-11-17T01:59:15Z) - Focus on the Common Good: Group Distributional Robustness Follows [47.62596240492509]
This paper proposes a new and simple algorithm that explicitly encourages learning of features that are shared across various groups.
While Group-DRO focuses on groups with worst regularized loss, focusing instead, on groups that enable better performance even on other groups, could lead to learning of shared/common features.
arXiv Detail & Related papers (2021-10-06T09:47:41Z) - Group Collaborative Learning for Co-Salient Object Detection [152.67721740487937]
We present a novel group collaborative learning framework (GCoNet) capable of detecting co-salient objects in real time (16ms)
Extensive experiments on three challenging benchmarks, i.e., CoCA, CoSOD3k, and Cosal2015, demonstrate that our simple GCoNet outperforms 10 cutting-edge models and achieves the new state-of-the-art.
arXiv Detail & Related papers (2021-03-15T13:16:03Z) - GroupIM: A Mutual Information Maximization Framework for Neural Group
Recommendation [24.677145454396822]
We study the problem of making item recommendations to ephemeral groups, which comprise users with limited or no historical activities together.
Existing studies target persistent groups with substantial activity history, while ephemeral groups lack historical interactions.
We propose data-driven regularization strategies to exploit both the preference covariance amongst users who are in the same group, as well as the contextual relevance of users' individual preferences to each group.
arXiv Detail & Related papers (2020-06-05T23:18:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.