Multi-Metric Adaptive Experimental Design under Fixed Budget with Validation
- URL: http://arxiv.org/abs/2506.03062v1
- Date: Tue, 03 Jun 2025 16:41:11 GMT
- Title: Multi-Metric Adaptive Experimental Design under Fixed Budget with Validation
- Authors: Qining Zhang, Tanner Fiez, Yi Liu, Wenyang Liu
- Abstract summary: Standard A/B tests in online experiments face statistical power challenges when testing multiple candidates simultaneously. This paper proposes a fixed-budget multi-metric AED framework with a two-phase structure: an adaptive exploration phase to identify the best treatment, and a validation phase to verify the treatment's quality and infer statistics.
- Score: 10.5481503979787
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Standard A/B tests in online experiments face statistical power challenges when testing multiple candidates simultaneously, while adaptive experimental designs (AED) alone fall short in inferring experiment statistics such as the average treatment effect, especially with many metrics (e.g., revenue, safety) and heterogeneous variances. This paper proposes a fixed-budget multi-metric AED framework with a two-phase structure: an adaptive exploration phase to identify the best treatment, and a validation phase with an A/B test to verify the treatment's quality and infer statistics. We propose SHRVar, which generalizes sequential halving (SH) (Karnin et al., 2013) with a novel relative-variance-based sampling and an elimination strategy built on reward z-values. It achieves a provable error probability that decreases exponentially, where the exponent generalizes the complexity measure for SH (Karnin et al., 2013) and SHVar (Lalitha et al., 2023) with homogeneous and heterogeneous variances, respectively. Numerical experiments verify our analysis and demonstrate the superior performance of this new framework.
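For orientation, the following is a minimal sketch of plain sequential halving (SH) under a fixed budget, the baseline from Karnin et al. (2013) that the abstract says SHRVar generalizes. It is not the authors' SHRVar: the relative-variance-based sampling and the z-value elimination rule are not specified in this listing, so the sketch uses equal allocation and elimination by empirical mean. The names `sequential_halving`, `pull`, `n_arms`, and `budget` are hypothetical and chosen only for illustration.

```python
import math
import random


def sequential_halving(pull, n_arms, budget):
    """Return the index of the arm that survives all halving rounds.

    pull(arm) must return one noisy reward sample for that arm.
    Assumption: equal per-arm allocation within a round (plain SH),
    not the relative-variance allocation described for SHRVar.
    """
    survivors = list(range(n_arms))
    n_rounds = max(1, math.ceil(math.log2(n_arms)))
    for _ in range(n_rounds):
        if len(survivors) == 1:
            break
        # Split this round's share of the budget evenly across survivors.
        # The floor of 1 pull means a very small budget can be exceeded.
        pulls_per_arm = max(1, budget // (len(survivors) * n_rounds))
        means = {
            arm: sum(pull(arm) for _ in range(pulls_per_arm)) / pulls_per_arm
            for arm in survivors
        }
        # Keep the better half (rounded up) by empirical mean.
        survivors.sort(key=lambda arm: means[arm], reverse=True)
        survivors = survivors[: math.ceil(len(survivors) / 2)]
    return survivors[0]


if __name__ == "__main__":
    random.seed(0)
    true_means = [0.1, 0.5, 0.9, 0.3]  # illustrative values, not from the paper
    best = sequential_halving(
        lambda arm: random.gauss(true_means[arm], 1.0), len(true_means), 4000
    )
    print("selected arm:", best)
```

In the two-phase framework the abstract describes, the arm returned by the exploration phase would then be compared against control in a separate A/B validation phase; that phase is not sketched here.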
Related papers
- Practical Improvements of A/B Testing with Off-Policy Estimation [51.25970890274447]
We introduce a family of unbiased off-policy estimators that achieves lower variance than the standard approach. Our theoretical analysis and experimental results validate the effectiveness and practicality of the proposed method.
arXiv Detail & Related papers (2025-06-12T13:11:01Z) - A Sample Efficient Conditional Independence Test in the Presence of Discretization [54.047334792855345]
Applying Conditional Independence (CI) tests directly to discretized data can lead to incorrect conclusions. Recent advancements have sought to infer the correct CI relationship between the latent variables through binarizing observed data. Motivated by this, this paper introduces a sample-efficient CI test that does not rely on the binarization process.
arXiv Detail & Related papers (2025-06-10T12:41:26Z) - Heteroscedastic Double Bayesian Elastic Net [1.1240642213359266]
We propose the Heteroscedastic Double Bayesian Elastic Net (HDBEN), a novel framework that jointly models the mean and log-variance. Our approach simultaneously induces sparsity and grouping in the regression coefficients and variance parameters, capturing complex variance structures in the data.
arXiv Detail & Related papers (2025-02-04T05:44:19Z) - Selective Nonparametric Regression via Testing [54.20569354303575]
We develop an abstention procedure via testing the hypothesis on the value of the conditional variance at a given point.
Unlike existing methods, the proposed one accounts not only for the value of the variance itself but also for the uncertainty of the corresponding variance predictor.
arXiv Detail & Related papers (2023-09-28T13:04:11Z) - Simultaneous inference for generalized linear models with unmeasured confounders [0.0]
We propose a unified statistical estimation and inference framework that harnesses structures and integrates linear projections into three key stages. We show effective Type-I error control of $z$-tests as sample and response sizes approach infinity.
arXiv Detail & Related papers (2023-09-13T18:53:11Z) - A VAE Approach to Sample Multivariate Extremes [6.548734807475054]
This paper describes a variational autoencoder (VAE) approach for sampling heavy-tailed distributions likely to have extremes of particularly large intensities.
We illustrate the relevance of our approach on a synthetic data set and on a real data set of discharge measurements along the Danube river network.
Our approach outperforms the standard VAE on the tested data sets, and we also provide a comparison with a competing EVT-based generative approach.
arXiv Detail & Related papers (2023-06-19T14:53:40Z) - Bootstrapped Edge Count Tests for Nonparametric Two-Sample Inference Under Heterogeneity [5.8010446129208155]
We develop a new nonparametric testing procedure that accurately detects differences between the two samples.
A comprehensive simulation study and an application to detecting user behaviors in online games demonstrate the excellent non-asymptotic performance of the proposed test.
arXiv Detail & Related papers (2023-04-26T22:25:44Z) - Two-stage Hypothesis Tests for Variable Interactions with FDR Control [10.750902543185802]
We propose a two-stage testing procedure with false discovery rate (FDR) control, which is known as a less conservative multiple-testing correction.
We demonstrate via comprehensive simulation studies that our two-stage procedure is more efficient than the classical BH procedure, with a comparable or improved statistical power.
arXiv Detail & Related papers (2022-08-31T19:17:00Z) - A Unified Framework for Multi-distribution Density Ratio Estimation [101.67420298343512]
Binary density ratio estimation (DRE) provides the foundation for many state-of-the-art machine learning algorithms.
We develop a general framework from the perspective of Bregman divergence minimization.
We show that our framework leads to methods that strictly generalize their counterparts in binary DRE.
arXiv Detail & Related papers (2021-12-07T01:23:20Z) - Variance Minimization in the Wasserstein Space for Invariant Causal Prediction [72.13445677280792]
In this work, we show that the approach taken in ICP may be reformulated as a series of nonparametric tests that scales linearly in the number of predictors.
Each of these tests relies on the minimization of a novel loss function that is derived from tools in optimal transport theory.
We prove under mild assumptions that our method is able to recover the set of identifiable direct causes, and we demonstrate in our experiments that it is competitive with other benchmark causal discovery algorithms.
arXiv Detail & Related papers (2021-10-13T22:30:47Z) - GANs with Variational Entropy Regularizers: Applications in Mitigating the Mode-Collapse Issue [95.23775347605923]
Building on the success of deep learning, Generative Adversarial Networks (GANs) provide a modern approach to learn a probability distribution from observed samples.
GANs often suffer from the mode collapse issue where the generator fails to capture all existing modes of the input distribution.
We take an information-theoretic approach and maximize a variational lower bound on the entropy of the generated samples to increase their diversity.
arXiv Detail & Related papers (2020-09-24T19:34:37Z) - Tracking disease outbreaks from sparse data with Bayesian inference [55.82986443159948]
The COVID-19 pandemic provides new motivation for estimating the empirical rate of transmission during an outbreak.
Standard methods struggle to accommodate the partial observability and sparse data common at finer scales.
We propose a Bayesian framework which accommodates partial observability in a principled manner.
arXiv Detail & Related papers (2020-09-12T20:37:33Z) - A One-step Approach to Covariate Shift Adaptation [82.01909503235385]
A default assumption in many machine learning scenarios is that the training and test samples are drawn from the same probability distribution.
We propose a novel one-step approach that jointly learns the predictive model and the associated weights in one optimization.
arXiv Detail & Related papers (2020-07-08T11:35:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.