Adaptive Experimentation When You Can't Experiment
- URL: http://arxiv.org/abs/2406.10738v1
- Date: Sat, 15 Jun 2024 20:54:48 GMT
- Title: Adaptive Experimentation When You Can't Experiment
- Authors: Yao Zhao, Kwang-Sung Jun, Tanner Fiez, Lalit Jain
- Abstract summary: This paper introduces the confounded pure exploration transductive linear bandit (CPET-LB) problem.
Online services can employ a properly randomized encouragement that incentivizes users toward a specific treatment.
- Score: 55.86593195947978
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper introduces the \emph{confounded pure exploration transductive linear bandit} (\texttt{CPET-LB}) problem. As a motivating example, often online services cannot directly assign users to specific control or treatment experiences either for business or practical reasons. In these settings, naively comparing treatment and control groups that may result from self-selection can lead to biased estimates of underlying treatment effects. Instead, online services can employ a properly randomized encouragement that incentivizes users toward a specific treatment. Our methodology provides online services with an adaptive experimental design approach for learning the best-performing treatment for such \textit{encouragement designs}. We consider a more general underlying model captured by a linear structural equation and formulate pure exploration linear bandits in this setting. Though pure exploration has been extensively studied in standard adaptive experimental design settings, we believe this is the first work considering a setting where noise is confounded. Elimination-style algorithms using experimental design methods in combination with a novel finite-time confidence interval on an instrumental variable style estimator are presented with sample complexity upper bounds nearly matching a minimax lower bound. Finally, experiments are conducted that demonstrate the efficacy of our approach.
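The bias the abstract describes can be illustrated with a small simulation. The sketch below is not the paper's CPET-LB algorithm or its estimator. It is a minimal, non-adaptive example that assumes a binary treatment, a constant treatment effect, and a coin-flip encouragement. The function names, coefficients, and sample size are all hypothetical choices made for illustration. It compares a naive treated-versus-untreated difference in means with a simple Wald (instrumental variable) estimator that uses the randomized encouragement as the instrument.

```python
import numpy as np


def simulate(n=200_000, true_effect=1.0, seed=0):
    """Simulate an encouragement design with self-selection (illustrative only)."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)            # unobserved confounder
    z = rng.integers(0, 2, size=n)    # randomized encouragement (the instrument)
    # Users self-select: uptake depends on the encouragement AND the confounder.
    t = (0.8 * z + u + rng.normal(size=n) > 0.5).astype(float)
    # The outcome depends on the treatment and on the same confounder.
    y = true_effect * t + 2.0 * u + rng.normal(size=n)
    return z, t, y


def naive_estimate(t, y):
    """Difference in mean outcome between self-selected treated and untreated users."""
    return y[t == 1].mean() - y[t == 0].mean()


def wald_iv_estimate(z, t, y):
    """Wald estimator: effect of encouragement on outcome over its effect on uptake."""
    reduced_form = y[z == 1].mean() - y[z == 0].mean()
    first_stage = t[z == 1].mean() - t[z == 0].mean()
    return reduced_form / first_stage


z, t, y = simulate()
print("naive difference in means:", naive_estimate(t, y))
print("Wald IV estimate:         ", wald_iv_estimate(z, t, y))
```

Under these assumptions the naive comparison absorbs the confounder's effect on the outcome, while the Wald ratio depends only on variation induced by the randomized encouragement, so it should land much closer to the simulated effect of 1.0.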
Related papers
- Learnable Chernoff Baselines for Inference-Time Alignment [64.81256817158851]
We introduce Learnable Chernoff Baselines as a method for efficiently and approximately sampling from exponentially tilted kernels. We establish total-variation guarantees to the ideal aligned model, and demonstrate in both continuous and discrete diffusion settings that LCB sampling closely matches ideal rejection sampling.
arXiv Detail & Related papers (2026-02-08T00:09:40Z) - Combating Noisy Labels through Fostering Self- and Neighbor-Consistency [120.4394402099635]
Label noise is pervasive in various real-world scenarios, posing challenges in supervised deep learning. We propose a noise-robust method named Jo-SNC (Joint sample selection and model regularization based on Self- and Neighbor-Consistency). We design a self-adaptive, data-driven thresholding scheme to adjust per-class selection thresholds.
arXiv Detail & Related papers (2026-01-19T07:55:29Z) - Algorithm Adaptation Bias in Recommendation System Online Experiments [4.8862630578310435]
An underexplored but critical bias is algorithm adaptation effect. Results often favor the production variant with large traffic while underestimating the performance of the test variant with small traffic. We detail the mechanisms of this bias, present empirical evidence from real-world experiments, and discuss potential methods for a more robust online evaluation.
arXiv Detail & Related papers (2025-08-29T19:23:04Z) - Active Human Feedback Collection via Neural Contextual Dueling Bandits [84.7608942821423]
We propose Neural-ADB, an algorithm for collecting human preference feedback when the underlying latent reward function is non-linear.
We show that when preference feedback follows the Bradley-Terry-Luce model, the worst sub-optimality gap of the policy learned by Neural-ADB decreases at a sub-linear rate as the preference dataset increases.
arXiv Detail & Related papers (2025-04-16T12:16:10Z) - Can We Validate Counterfactual Estimations in the Presence of General Network Interference? [13.49152464081862]
We introduce a framework that facilitates the use of machine learning tools for both estimation and validation in causal inference. New distribution-preserving network bootstrap generates statistically-valid subpopulations from a single experiment's data. Counterfactual cross-validation procedure adapts the principles of model validation to the unique constraints of causal settings.
arXiv Detail & Related papers (2025-02-03T06:51:04Z) - Optimal Adaptive Experimental Design for Estimating Treatment Effect [14.088972921434761]
This paper addresses the fundamental question of determining the optimal accuracy in estimating the treatment effect.
By incorporating the concept of doubly robust method into sequential experimental design, we frame the optimal estimation problem as an online bandit learning problem.
Using tools and ideas from both bandit algorithm design and adaptive statistical estimation, we propose a general low switching adaptive experiment framework.
arXiv Detail & Related papers (2024-10-07T23:22:51Z) - Estimating Treatment Effects under Recommender Interference: A Structured Neural Networks Approach [13.208141830901845]
We show that the standard difference-in-means estimator can lead to biased estimates due to recommender interference.
We propose a "recommender choice model" that describes which item gets exposed from a pool containing both treated and control items.
We show that the proposed estimator yields results comparable to the benchmark, whereas the standard difference-in-means estimator can exhibit significant bias and even produce reversed signs.
arXiv Detail & Related papers (2024-06-20T14:53:26Z) - Machine Learning Assisted Adjustment Boosts Efficiency of Exact Inference in Randomized Controlled Trials [12.682443719767763]
We show the proposed method can robustly control the type I error and can boost the statistical efficiency for a randomized controlled trial (RCT).
Its application may remarkably reduce the required sample size and cost of RCTs, such as phase III clinical trials.
arXiv Detail & Related papers (2024-03-05T15:48:07Z) - Adaptive Instrument Design for Indirect Experiments [48.815194906471405]
Unlike RCTs, indirect experiments estimate treatment effects by leveraging conditional instrumental variables.
In this paper we take the initial steps towards enhancing sample efficiency for indirect experiments by adaptively designing a data collection policy.
Our main contribution is a practical computational procedure that utilizes influence functions to search for an optimal data collection policy.
arXiv Detail & Related papers (2023-12-05T02:38:04Z) - Task-specific experimental design for treatment effect estimation [59.879567967089145]
Large randomised trials (RCTs) are the standard for causal inference.
Recent work has proposed more sample-efficient alternatives to RCTs, but these are not adaptable to the downstream application for which the causal effect is sought.
We develop a task-specific approach to experimental design and derive sampling strategies customised to particular downstream applications.
arXiv Detail & Related papers (2023-06-08T18:10:37Z) - Adaptive Experimentation at Scale: A Computational Framework for Flexible Batches [7.390918770007728]
Motivated by practical instances involving a handful of reallocations in which outcomes are measured in batches, we develop an adaptive-driven experimentation framework.
Our main observation is that normal approximations, which are universal in statistical inference, can also guide the design of adaptive algorithms.
arXiv Detail & Related papers (2023-03-21T04:17:03Z) - Synthetically Controlled Bandits [2.8292841621378844]
This paper presents a new dynamic approach to experiment design in settings where, due to interference or other concerns, experimental units are coarse.
Our new design, dubbed Synthetically Controlled Thompson Sampling (SCTS), minimizes the regret associated with experimentation at no practically meaningful loss to inferential ability.
arXiv Detail & Related papers (2022-02-14T22:58:13Z) - Near-optimal inference in adaptive linear regression [60.08422051718195]
Even simple methods like least squares can exhibit non-normal behavior when data is collected in an adaptive manner.
We propose a family of online debiasing estimators to correct these distributional anomalies in least squares estimation.
We demonstrate the usefulness of our theory via applications to multi-armed bandit, autoregressive time series estimation, and active learning with exploration.
arXiv Detail & Related papers (2021-07-05T21:05:11Z) - Learning the Truth From Only One Side of the Story [58.65439277460011]
We focus on generalized linear models and show that without adjusting for this sampling bias, the model may converge suboptimally or even fail to converge to the optimal solution.
We propose an adaptive approach that comes with theoretical guarantees and show that it outperforms several existing methods empirically.
arXiv Detail & Related papers (2020-06-08T18:20:28Z) - Almost-Matching-Exactly for Treatment Effect Estimation under Network Interference [73.23326654892963]
We propose a matching method that recovers direct treatment effects from randomized experiments where units are connected in an observed network.
Our method matches units almost exactly on counts of unique subgraphs within their neighborhood graphs.
arXiv Detail & Related papers (2020-03-02T15:21:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.