Related papers: Exploration-free Algorithms for Multi-group Mean Estimation

Exploration-free Algorithms for Multi-group Mean Estimation

URL: http://arxiv.org/abs/2510.10374v1
Date: Sun, 12 Oct 2025 00:20:30 GMT
Title: Exploration-free Algorithms for Multi-group Mean Estimation
Authors: Ziyi Wei, Huaiyang Zhong, Xiaocheng Li,
Abstract summary: We address the problem of multi-group mean estimation, which seeks to allocate a finite sampling budget across multiple groups to obtain uniformly accurate estimates of their means.<n>Unlike classical multi-armed bandits, whose objective is to minimize regret by identifying and exploiting the best arm, the optimal allocation in this setting requires sampling every group on the order of $Theta(T)$ times.
Score: 7.480522058240762
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We address the problem of multi-group mean estimation, which seeks to allocate a finite sampling budget across multiple groups to obtain uniformly accurate estimates of their means. Unlike classical multi-armed bandits, whose objective is to minimize regret by identifying and exploiting the best arm, the optimal allocation in this setting requires sampling every group on the order of $\Theta(T)$ times. This fundamental distinction makes exploration-free algorithms both natural and effective. Our work makes three contributions. First, we strengthen the existing results on subgaussian variance concentration using the Hanson-Wright inequality and identify a class of strictly subgaussian distributions that yield sharper guarantees. Second, we design exploration-free non-adaptive and adaptive algorithms, and we establish tighter regret bounds than the existing results. Third, we extend the framework to contextual bandit settings, an underexplored direction, and propose algorithms that leverage side information with provable guarantees. Overall, these results position exploration-free allocation as a principled and efficient approach to multi-group mean estimation, with potential applications in experimental design, personalization, and other domains requiring accurate multi-group inference.

Related papers

Almost Asymptotically Optimal Active Clustering Through Pairwise Observations [59.20614082241528]
We propose a new analysis framework for clustering $M$ items into an unknown number of $K$ distinct groups using noisy and actively collected responses.<n>We establish a fundamental lower bound on the expected number of queries needed to achieve a desired confidence in the accuracy of the clustering.<n>We develop a computationally feasible variant of the Generalized Likelihood Ratio statistic and show that its performance gap to the lower bound can be accurately empirically estimated.
arXiv Detail & Related papers (2026-02-05T14:16:47Z)
Constrained Pareto Set Identification with Bandit Feedback [10.967572582187014]
Given a $K$-armed bandit with unknown means, the goal is to identify the set of arms whose mean is not uniformly worse than that of another arm.<n>Our focus lies in fixed-confidence identification, for which we introduce an algorithm that significantly outperforms racing-like algorithms.
arXiv Detail & Related papers (2025-06-09T18:29:28Z)
Asymptotically Optimal Linear Best Feasible Arm Identification with Fixed Budget [55.938644481736446]
We introduce a novel algorithm for best feasible arm identification that guarantees an exponential decay in the error probability.<n>We validate our algorithm through comprehensive empirical evaluations across various problem instances with different levels of complexity.
arXiv Detail & Related papers (2025-06-03T02:56:26Z)
Best Arm Identification with Fixed Budget: A Large Deviation Perspective [54.305323903582845]
We present sred, a truly adaptive algorithm that can reject arms in it any round based on the observed empirical gaps between the rewards of various arms. In particular, we present sred, a truly adaptive algorithm that can reject arms in it any round based on the observed empirical gaps between the rewards of various arms.
arXiv Detail & Related papers (2023-12-19T13:17:43Z)
Bandit Pareto Set Identification: the Fixed Budget Setting [10.967572582187014]
We study a pure exploration problem in a multi-armed bandit model.<n>The goal is to identify the distributions whose mean is not uniformly worse than that of another distribution.
arXiv Detail & Related papers (2023-11-07T13:43:18Z)
Beyond Submodularity: A Unified Framework of Randomized Set Selection with Group Fairness Constraints [19.29174615532181]
We introduce a unified framework for randomized subset selection that incorporates group fairness constraints. Our problem involves a global utility function and a set of group utility functions for each group. Our aim is to generate a distribution across feasible subsets, specifying the selection probability of each feasible set.
arXiv Detail & Related papers (2023-04-13T15:02:37Z)
Group conditional validity via multi-group learning [5.797821810358083]
We consider the problem of distribution-free conformal prediction and the criterion of group conditional validity. Existing methods achieve such guarantees under either restrictive grouping structure or distributional assumptions. We propose a simple reduction to the problem of achieving validity guarantees for individual populations by leveraging algorithms for a problem called multi-group learning.
arXiv Detail & Related papers (2023-03-07T15:51:03Z)
Robust Consensus Clustering and its Applications for Advertising Forecasting [18.242055675730253]
We propose a novel algorithm -- robust consensus clustering that can find common ground truth among experts' opinions. We apply the proposed method to the real-world advertising campaign segmentation and forecasting tasks.
arXiv Detail & Related papers (2022-12-27T21:49:04Z)
Max-Min Grouped Bandits [48.62520520818357]
We introduce a multi-armed bandit problem termed max-min grouped bandits. The goal is to find a group whose worst arm has the highest mean reward. This problem is of interest to applications such as recommendation systems.
arXiv Detail & Related papers (2021-11-17T01:59:15Z)
Mean-based Best Arm Identification in Stochastic Bandits under Reward Contamination [80.53485617514707]
This paper proposes two algorithms, a gap-based algorithm and one based on the successive elimination, for best arm identification in sub-Gaussian bandits. Specifically, for the gap-based algorithm, the sample complexity is optimal up to constant factors, while for the successive elimination, it is optimal up to logarithmic factors.
arXiv Detail & Related papers (2021-11-14T21:49:58Z)
A General Method for Robust Learning from Batches [56.59844655107251]
We consider a general framework of robust learning from batches, and determine the limits of both classification and distribution estimation over arbitrary, including continuous, domains. We derive the first robust computationally-efficient learning algorithms for piecewise-interval classification, and for piecewise-polynomial, monotone, log-concave, and gaussian-mixture distribution estimation.
arXiv Detail & Related papers (2020-02-25T18:53:25Z)

This list is automatically generated from the titles and abstracts of the papers in this site.