Machine Learning for Variance Reduction in Online Experiments
- URL: http://arxiv.org/abs/2106.07263v2
- Date: Wed, 16 Jun 2021 22:28:27 GMT
- Title: Machine Learning for Variance Reduction in Online Experiments
- Authors: Yongyi Guo, Dominic Coey, Mikael Konutgan, Wenting Li, Chris Schoener,
Matt Goldman
- Abstract summary: We propose a machine learning regression-adjusted treatment effect estimator, which we call MLRATE.
MLRATE uses machine learning predictors of the outcome to reduce estimator variance.
In A/A tests, for a set of 48 outcome metrics commonly monitored in Facebook experiments, the estimator has over 70% lower variance than the simple difference-in-means estimator.
- Score: 1.9181913148426697
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the problem of variance reduction in randomized controlled
trials, through the use of covariates correlated with the outcome but
independent of the treatment. We propose a machine learning regression-adjusted
treatment effect estimator, which we call MLRATE. MLRATE uses machine learning
predictors of the outcome to reduce estimator variance. It employs
cross-fitting to avoid overfitting biases, and we prove consistency and
asymptotic normality under general conditions. MLRATE is robust to poor
predictions from the machine learning step: if the predictions are uncorrelated
with the outcomes, the estimator performs asymptotically no worse than the
standard difference-in-means estimator, while if predictions are highly
correlated with outcomes, the efficiency gains are large. In A/A tests, for a
set of 48 outcome metrics commonly monitored in Facebook experiments, the
estimator has over 70% lower variance than the simple difference-in-means
estimator, and about 19% lower variance than the common univariate procedure
which adjusts only for pre-experiment values of the outcome.
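As a concrete illustration of the construction described in the abstract, here is a minimal Python sketch of cross-fitted regression adjustment: out-of-fold machine learning predictions of the outcome enter an interacted OLS regression whose treatment coefficient is the adjusted effect estimate. The simulated data, the gradient-boosting learner, and the exact regression step are illustrative assumptions in the spirit of MLRATE, not a reimplementation of the paper's estimator.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)

# Illustrative simulated experiment: covariates, random assignment, outcome.
n = 5000
X = rng.normal(size=(n, 5))
T = rng.integers(0, 2, size=n)                    # Bernoulli(0.5) assignment
Y = X[:, 0] ** 2 + np.sin(X[:, 1]) + 0.3 * T + rng.normal(size=n)  # true ATE 0.3

# Step 1: cross-fitted ML predictions of the outcome from covariates.
# Each unit is predicted by a model that never saw it, which is what
# guards against overfitting bias.
g = np.empty(n)
for train, test in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    g[test] = GradientBoostingRegressor().fit(X[train], Y[train]).predict(X[test])

# Step 2: OLS of Y on treatment, the centered prediction, and their
# interaction; the coefficient on T is the regression-adjusted ATE.
g_c = g - g.mean()
design = np.column_stack([np.ones(n), T, g_c, T * g_c])
beta, *_ = np.linalg.lstsq(design, Y, rcond=None)
print(f"difference in means: {Y[T == 1].mean() - Y[T == 0].mean():.3f}")
print(f"regression-adjusted ATE: {beta[1]:.3f}")
```

The interacted design is what makes the adjustment safe: if the predictions g are uncorrelated with Y, the fitted coefficients on the prediction terms shrink toward zero and the estimator falls back to roughly the difference in means.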
Related papers
- STATE: A Robust ATE Estimator of Heavy-Tailed Metrics for Variance Reduction in Online Controlled Experiments [22.32661807469984]
We develop a novel framework that integrates the Student's t-distribution with machine learning tools to fit heavy-tailed metrics.
By adopting a variational EM method to optimize the log-likelihood function, we can infer a robust solution that largely eliminates the negative impact of outliers (a generic sketch of this reweighting idea follows this entry).
Both simulations on synthetic data and long-term empirical results on the Meituan experiment platform demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2024-07-23T09:35:59Z)
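The variational EM procedure of STATE is not reproduced here, but the classical (non-variational) EM for a Student-t location parameter with fixed degrees of freedom is a compact way to see the mechanism: the E-step downweights observations with large residuals, so heavy-tailed outliers stop dominating the fit. A generic textbook sketch, not the paper's algorithm:

```python
import numpy as np

rng = np.random.default_rng(1)

def t_location_em(y, nu=3.0, n_iter=100, tol=1e-8):
    """Classical EM for the location of a Student-t with fixed dof nu.

    The t-distribution is a scale mixture of normals; the E-step computes
    expected latent precisions, which downweight far-out observations.
    """
    mu, sigma2 = np.median(y), np.var(y)                 # robust starting values
    for _ in range(n_iter):
        w = (nu + 1.0) / (nu + (y - mu) ** 2 / sigma2)   # E-step: weights
        mu_new = np.sum(w * y) / np.sum(w)               # M-step: location
        sigma2 = np.sum(w * (y - mu_new) ** 2) / len(y)  # M-step: scale
        if abs(mu_new - mu) < tol:
            return mu_new
        mu = mu_new
    return mu

# Heavy-tailed sample: the plain mean is dragged by outliers, the EM fit is not.
y = np.concatenate([rng.normal(1.0, 1.0, 990), rng.normal(60.0, 1.0, 10)])
print(f"mean: {y.mean():.2f}, t-location EM: {t_location_em(y):.2f}")
```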
- Multi-CATE: Multi-Accurate Conditional Average Treatment Effect Estimation Robust to Unknown Covariate Shifts [12.289361708127876]
We use methodology for learning multi-accurate predictors to post-process CATE T-learners.
We show how this approach can combine (large) confounded observational and (smaller) randomized datasets.
arXiv Detail & Related papers (2024-05-28T14:12:25Z)
- Selective Nonparametric Regression via Testing [54.20569354303575]
We develop an abstention procedure by testing a hypothesis about the value of the conditional variance at a given point.
Unlike existing methods, the proposed procedure accounts not only for the value of the variance itself but also for the uncertainty of the corresponding variance predictor.
arXiv Detail & Related papers (2023-09-28T13:04:11Z)
- Propensity score models are better when post-calibrated [0.32228025627337864]
Post-calibration reduces the error in effect estimation for expressive uncalibrated statistical estimators.
Given the improvement in effect estimation and that post-calibration is computationally cheap, we recommend it be adopted when modelling propensity scores with expressive models (a minimal calibration sketch follows this entry).
arXiv Detail & Related papers (2022-11-02T16:01:03Z)
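A minimal sketch of the post-calibration idea, assuming scikit-learn's CalibratedClassifierCV with isotonic regression as the calibrator and a simple inverse-propensity-weighted estimate downstream; the paper's exact calibration recipe and effect estimators may differ.

```python
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(2)

# Illustrative confounded data: treatment uptake depends on the covariates.
n = 20000
X = rng.normal(size=(n, 4))
T = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))
Y = X[:, 0] + 0.5 * T + rng.normal(size=n)             # true effect 0.5

# Expressive propensity model, post-calibrated with cross-validated
# isotonic regression (one common calibration choice).
model = CalibratedClassifierCV(
    GradientBoostingClassifier(), method="isotonic", cv=5
).fit(X, T)
e = np.clip(model.predict_proba(X)[:, 1], 0.01, 0.99)  # clipped propensities

# Inverse-propensity-weighted effect estimate using the calibrated scores.
ipw = np.mean(T * Y / e - (1 - T) * Y / (1 - e))
print(f"IPW estimate with post-calibrated propensities: {ipw:.3f}")
```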
- Expected Validation Performance and Estimation of a Random Variable's Maximum [48.83713377993604]
We analyze three statistical estimators for expected validation performance.
We find that the unbiased estimator has the highest variance, while the estimator with the smallest variance has the largest bias; the two biased estimators lead to the fewest incorrect conclusions (a sketch of the unbiased construction follows this entry).
arXiv Detail & Related papers (2021-10-01T18:48:47Z)
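For concreteness, here is a sketch of two standard estimators of a random variable's expected maximum over n draws, computed from k observed validation runs: the unbiased order-statistic estimator and a biased empirical-CDF plug-in. Both constructions are common in this literature and are assumptions here, not extracted from the paper.

```python
import numpy as np
from math import comb

rng = np.random.default_rng(3)

def unbiased_expected_max(scores, n):
    """Unbiased estimate of E[max of n i.i.d. draws] from k >= n scores.

    The i-th order statistic is weighted by the probability that it is the
    largest element of a uniformly random size-n subset of the k runs.
    """
    x = np.sort(scores)
    k = len(x)
    w = np.array([comb(i - 1, n - 1) / comb(k, n) for i in range(1, k + 1)])
    return float(w @ x)

def plug_in_expected_max(scores, n):
    """Biased plug-in: expected max of n draws under the empirical CDF."""
    x = np.sort(scores)
    i = np.arange(1, len(x) + 1)
    return float(x @ ((i / len(x)) ** n - ((i - 1) / len(x)) ** n))

# Monte Carlo check: the plug-in systematically underestimates the truth.
k, n = 20, 5
truth = rng.normal(size=(200_000, n)).max(axis=1).mean()
runs = rng.normal(size=(5_000, k))
print(f"truth ~ {truth:.3f}, "
      f"unbiased mean: {np.mean([unbiased_expected_max(r, n) for r in runs]):.3f}, "
      f"plug-in mean: {np.mean([plug_in_expected_max(r, n) for r in runs]):.3f}")
```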
- Near-optimal inference in adaptive linear regression [60.08422051718195]
Even simple methods like least squares can exhibit non-normal behavior when data is collected in an adaptive manner.
We propose a family of online debiasing estimators to correct these distributional anomalies in least squares estimation.
We demonstrate the usefulness of our theory via applications to multi-armed bandit, autoregressive time series estimation, and active learning with exploration.
arXiv Detail & Related papers (2021-07-05T21:05:11Z)
- Two-Stage TMLE to Reduce Bias and Improve Efficiency in Cluster Randomized Trials [0.0]
Cluster randomized trials (CRTs) randomly assign an intervention to groups of individuals, and measure outcomes on individuals in those groups.
Findings are often missing for some individuals within clusters.
CRTs often randomize limited numbers of clusters, resulting in chance imbalances on baseline outcome predictors between arms.
arXiv Detail & Related papers (2021-06-29T21:47:30Z)
- Increasing the efficiency of randomized trial estimates via linear adjustment for a prognostic score [59.75318183140857]
Estimating causal effects from randomized experiments is central to clinical research.
Most methods for historical borrowing achieve reductions in variance by sacrificing strict type-I error rate control.
arXiv Detail & Related papers (2020-12-17T21:10:10Z)
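A minimal sketch of linear adjustment for a prognostic score: a model fit on (simulated) historical control data produces a score that enters the trial analysis as a single linear covariate, so variance is reduced without the historical model having to be correct. The learner, data, and effect size below are illustrative assumptions, not the paper's.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(4)

def simulate(n):
    X = rng.normal(size=(n, 4))
    y0 = X[:, 0] + 0.5 * X[:, 1] ** 2 + rng.normal(size=n)   # control outcome
    return X, y0

# Step 1: fit a prognostic model on historical control data only.
X_hist, y_hist = simulate(10000)
prog = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_hist, y_hist)

# Step 2: in the trial, adjust linearly for the frozen prognostic score.
n = 2000
X, y0 = simulate(n)
T = rng.integers(0, 2, size=n)
Y = y0 + 0.4 * T                          # true effect 0.4
m = prog.predict(X)                       # prognostic score, fixed pre-analysis
design = np.column_stack([np.ones(n), T, m - m.mean()])
beta, *_ = np.linalg.lstsq(design, Y, rcond=None)
print(f"prognostic-score-adjusted effect: {beta[1]:.3f}")
```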
- Performance metrics for intervention-triggering prediction models do not reflect an expected reduction in outcomes from using the model [71.9860741092209]
Clinical researchers often select among and evaluate risk prediction models.
Standard metrics calculated from retrospective data are only related to model utility under certain assumptions.
When predictions are delivered repeatedly throughout time, the relationship between standard metrics and utility is further complicated.
arXiv Detail & Related papers (2020-06-02T16:26:49Z)
- Machine learning for causal inference: on the use of cross-fit estimators [77.34726150561087]
Doubly-robust cross-fit estimators have been proposed to yield better statistical properties.
We conducted a simulation study to assess the performance of several estimators for the average causal effect (ACE).
When used with machine learning, the doubly-robust cross-fit estimators substantially outperformed all of the other estimators in terms of bias, variance, and confidence interval coverage.
arXiv Detail & Related papers (2020-04-21T23:09:55Z)
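A minimal sketch of one member of the estimator class this entry studies: a cross-fit augmented inverse-propensity-weighted (doubly-robust) estimate of the ACE with machine-learning nuisance models. The data and learners are illustrative assumptions, not the paper's simulation design.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor
from sklearn.model_selection import KFold

rng = np.random.default_rng(5)

# Illustrative confounded data with a true ACE of 1.0.
n = 10000
X = rng.normal(size=(n, 4))
T = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))
Y = X[:, 0] + X[:, 1] + 1.0 * T + rng.normal(size=n)

# Cross-fitting: every unit's nuisance estimates come from models trained
# on the complementary fold, decoupling estimation and prediction errors.
e_hat, mu0, mu1 = np.empty(n), np.empty(n), np.empty(n)
for train, test in KFold(n_splits=2, shuffle=True, random_state=0).split(X):
    ps = GradientBoostingClassifier().fit(X[train], T[train])
    e_hat[test] = np.clip(ps.predict_proba(X[test])[:, 1], 0.01, 0.99)
    t0, t1 = train[T[train] == 0], train[T[train] == 1]
    mu0[test] = GradientBoostingRegressor().fit(X[t0], Y[t0]).predict(X[test])
    mu1[test] = GradientBoostingRegressor().fit(X[t1], Y[t1]).predict(X[test])

# AIPW: outcome-model contrast plus inverse-propensity residual corrections.
psi = (mu1 - mu0
       + T * (Y - mu1) / e_hat
       - (1 - T) * (Y - mu0) / (1 - e_hat))
print(f"cross-fit AIPW estimate of the ACE: {psi.mean():.3f}")
```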
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.