Machine Learning for Variance Reduction in Online Experiments
- URL: http://arxiv.org/abs/2106.07263v2
- Date: Wed, 16 Jun 2021 22:28:27 GMT
- Title: Machine Learning for Variance Reduction in Online Experiments
- Authors: Yongyi Guo, Dominic Coey, Mikael Konutgan, Wenting Li, Chris Schoener,
Matt Goldman
- Abstract summary: We propose a machine learning regression-adjusted treatment effect estimator, which we call MLRATE.
MLRATE uses machine learning predictors of the outcome to reduce estimator variance.
In A/A tests, for a set of 48 outcome metrics commonly monitored in Facebook experiments, the estimator has over 70% lower variance than the simple difference-in-means estimator.
- Score: 1.9181913148426697
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the problem of variance reduction in randomized controlled
trials, through the use of covariates correlated with the outcome but
independent of the treatment. We propose a machine learning regression-adjusted
treatment effect estimator, which we call MLRATE. MLRATE uses machine learning
predictors of the outcome to reduce estimator variance. It employs
cross-fitting to avoid overfitting biases, and we prove consistency and
asymptotic normality under general conditions. MLRATE is robust to poor
predictions from the machine learning step: if the predictions are uncorrelated
with the outcomes, the estimator performs asymptotically no worse than the
standard difference-in-means estimator, while if predictions are highly
correlated with outcomes, the efficiency gains are large. In A/A tests, for a
set of 48 outcome metrics commonly monitored in Facebook experiments, the
estimator has over 70% lower variance than the simple difference-in-means
estimator, and about 19% lower variance than the common univariate procedure
which adjusts only for pre-experiment values of the outcome.
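As a concrete illustration of the construction described in the abstract, here is a minimal Python sketch of cross-fitted regression adjustment: out-of-fold machine learning predictions of the outcome enter an interacted OLS regression whose treatment coefficient is the adjusted effect estimate. The simulated data, the gradient-boosting learner, and the exact regression step are illustrative assumptions in the spirit of MLRATE, not a reimplementation of the paper's estimator.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)

# Illustrative simulated experiment: covariates, random assignment, outcome.
n = 5000
X = rng.normal(size=(n, 5))
T = rng.integers(0, 2, size=n)                    # Bernoulli(0.5) assignment
Y = X[:, 0] ** 2 + np.sin(X[:, 1]) + 0.3 * T + rng.normal(size=n)  # true ATE 0.3

# Step 1: cross-fitted ML predictions of the outcome from covariates.
# Each unit is predicted by a model that never saw it, which is what
# guards against overfitting bias.
g = np.empty(n)
for train, test in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    g[test] = GradientBoostingRegressor().fit(X[train], Y[train]).predict(X[test])

# Step 2: OLS of Y on treatment, the centered prediction, and their
# interaction; the coefficient on T is the regression-adjusted ATE.
g_c = g - g.mean()
design = np.column_stack([np.ones(n), T, g_c, T * g_c])
beta, *_ = np.linalg.lstsq(design, Y, rcond=None)
print(f"difference in means: {Y[T == 1].mean() - Y[T == 0].mean():.3f}")
print(f"regression-adjusted ATE: {beta[1]:.3f}")
```

The interacted design is what makes the adjustment safe: if the predictions g are uncorrelated with Y, the fitted coefficients on the prediction terms shrink toward zero and the estimator falls back to roughly the difference in means.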
Related papers
- STATE: A Robust ATE Estimator of Heavy-Tailed Metrics for Variance Reduction in Online Controlled Experiments [22.32661807469984]
We develop a novel framework that integrates the Student's t-distribution with machine learning tools to fit heavy-tailed metrics.
By adopting a variational EM method to optimize the log-likelihood function, we can infer a robust solution that largely eliminates the negative impact of outliers (a generic sketch of this reweighting idea follows this entry).
Both simulations on synthetic data and long-term empirical results on the Meituan experiment platform demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2024-07-23T09:35:59Z)
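The variational EM procedure of STATE is not reproduced here, but the classical (non-variational) EM for a Student-t location parameter with fixed degrees of freedom is a compact way to see the mechanism: the E-step downweights observations with large residuals, so heavy-tailed outliers stop dominating the fit. A generic textbook sketch, not the paper's algorithm:

```python
import numpy as np

rng = np.random.default_rng(1)

def t_location_em(y, nu=3.0, n_iter=100, tol=1e-8):
    """Classical EM for the location of a Student-t with fixed dof nu.

    The t-distribution is a scale mixture of normals; the E-step computes
    expected latent precisions, which downweight far-out observations.
    """
    mu, sigma2 = np.median(y), np.var(y)                 # robust starting values
    for _ in range(n_iter):
        w = (nu + 1.0) / (nu + (y - mu) ** 2 / sigma2)   # E-step: weights
        mu_new = np.sum(w * y) / np.sum(w)               # M-step: location
        sigma2 = np.sum(w * (y - mu_new) ** 2) / len(y)  # M-step: scale
        if abs(mu_new - mu) < tol:
            return mu_new
        mu = mu_new
    return mu

# Heavy-tailed sample: the plain mean is dragged by outliers, the EM fit is not.
y = np.concatenate([rng.normal(1.0, 1.0, 990), rng.normal(60.0, 1.0, 10)])
print(f"mean: {y.mean():.2f}, t-location EM: {t_location_em(y):.2f}")
```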
- Multi-CATE: Multi-Accurate Conditional Average Treatment Effect Estimation Robust to Unknown Covariate Shifts [12.289361708127876]
We use methodology for learning multi-accurate predictors to post-process CATE T-learners.
We show how this approach can combine (large) confounded observational and (smaller) randomized datasets.
arXiv Detail & Related papers (2024-05-28T14:12:25Z)
- Selective Nonparametric Regression via Testing [54.20569354303575]
We develop an abstention procedure by testing a hypothesis about the value of the conditional variance at a given point.
Unlike existing methods, the proposed procedure accounts not only for the value of the variance itself but also for the uncertainty of the corresponding variance predictor.
arXiv Detail & Related papers (2023-09-28T13:04:11Z)
- Propensity score models are better when post-calibrated [0.32228025627337864]
Post-calibration reduces the error in effect estimation for expressive uncalibrated statistical estimators.
Given the improvement in effect estimation and that post-calibration is computationally cheap, we recommend it be adopted when modelling propensity scores with expressive models (a minimal calibration sketch follows this entry).
arXiv Detail & Related papers (2022-11-02T16:01:03Z)
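A minimal sketch of the post-calibration idea, assuming scikit-learn's CalibratedClassifierCV with isotonic regression as the calibrator and a simple inverse-propensity-weighted estimate downstream; the paper's exact calibration recipe and effect estimators may differ.

```python
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(2)

# Illustrative confounded data: treatment uptake depends on the covariates.
n = 20000
X = rng.normal(size=(n, 4))
T = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))
Y = X[:, 0] + 0.5 * T + rng.normal(size=n)             # true effect 0.5

# Expressive propensity model, post-calibrated with cross-validated
# isotonic regression (one common calibration choice).
model = CalibratedClassifierCV(
    GradientBoostingClassifier(), method="isotonic", cv=5
).fit(X, T)
e = np.clip(model.predict_proba(X)[:, 1], 0.01, 0.99)  # clipped propensities

# Inverse-propensity-weighted effect estimate using the calibrated scores.
ipw = np.mean(T * Y / e - (1 - T) * Y / (1 - e))
print(f"IPW estimate with post-calibrated propensities: {ipw:.3f}")
```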
- Expected Validation Performance and Estimation of a Random Variable's Maximum [48.83713377993604]
We analyze three statistical estimators for expected validation performance.
We find that the unbiased estimator has the highest variance, while the estimator with the smallest variance has the largest bias; the two biased estimators lead to the fewest incorrect conclusions (a sketch of the unbiased construction follows this entry).
arXiv Detail & Related papers (2021-10-01T18:48:47Z)
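For concreteness, here is a sketch of two standard estimators of a random variable's expected maximum over n draws, computed from k observed validation runs: the unbiased order-statistic estimator and a biased empirical-CDF plug-in. Both constructions are common in this literature and are assumptions here, not extracted from the paper.

```python
import numpy as np
from math import comb

rng = np.random.default_rng(3)

def unbiased_expected_max(scores, n):
    """Unbiased estimate of E[max of n i.i.d. draws] from k >= n scores.

    The i-th order statistic is weighted by the probability that it is the
    largest element of a uniformly random size-n subset of the k runs.
    """
    x = np.sort(scores)
    k = len(x)
    w = np.array([comb(i - 1, n - 1) / comb(k, n) for i in range(1, k + 1)])
    return float(w @ x)

def plug_in_expected_max(scores, n):
    """Biased plug-in: expected max of n draws under the empirical CDF."""
    x = np.sort(scores)
    i = np.arange(1, len(x) + 1)
    return float(x @ ((i / len(x)) ** n - ((i - 1) / len(x)) ** n))

# Monte Carlo check: the plug-in systematically underestimates the truth.
k, n = 20, 5
truth = rng.normal(size=(200_000, n)).max(axis=1).mean()
runs = rng.normal(size=(5_000, k))
print(f"truth ~ {truth:.3f}, "
      f"unbiased mean: {np.mean([unbiased_expected_max(r, n) for r in runs]):.3f}, "
      f"plug-in mean: {np.mean([plug_in_expected_max(r, n) for r in runs]):.3f}")
```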
- Near-optimal inference in adaptive linear regression [60.08422051718195]
Even simple methods like least squares can exhibit non-normal behavior when data is collected in an adaptive manner.
We propose a family of online debiasing estimators to correct these distributional anomalies in least squares estimation.
We demonstrate the usefulness of our theory via applications to multi-armed bandit, autoregressive time series estimation, and active learning with exploration.
arXiv Detail & Related papers (2021-07-05T21:05:11Z)
- Two-Stage TMLE to Reduce Bias and Improve Efficiency in Cluster Randomized Trials [0.0]
Cluster randomized trials (CRTs) randomly assign an intervention to groups of individuals, and measure outcomes on individuals in those groups.
Findings are often missing for some individuals within clusters.
CRTs often randomize limited numbers of clusters, resulting in chance imbalances on baseline outcome predictors between arms.
arXiv Detail & Related papers (2021-06-29T21:47:30Z)
- Increasing the efficiency of randomized trial estimates via linear adjustment for a prognostic score [59.75318183140857]
Estimating causal effects from randomized experiments is central to clinical research.
Most methods for historical borrowing achieve reductions in variance by sacrificing strict type-I error rate control.
arXiv Detail & Related papers (2020-12-17T21:10:10Z)
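A minimal sketch of linear adjustment for a prognostic score: a model fit on (simulated) historical control data produces a score that enters the trial analysis as a single linear covariate, so variance is reduced without the historical model having to be correct. The learner, data, and effect size below are illustrative assumptions, not the paper's.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(4)

def simulate(n):
    X = rng.normal(size=(n, 4))
    y0 = X[:, 0] + 0.5 * X[:, 1] ** 2 + rng.normal(size=n)   # control outcome
    return X, y0

# Step 1: fit a prognostic model on historical control data only.
X_hist, y_hist = simulate(10000)
prog = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_hist, y_hist)

# Step 2: in the trial, adjust linearly for the frozen prognostic score.
n = 2000
X, y0 = simulate(n)
T = rng.integers(0, 2, size=n)
Y = y0 + 0.4 * T                          # true effect 0.4
m = prog.predict(X)                       # prognostic score, fixed pre-analysis
design = np.column_stack([np.ones(n), T, m - m.mean()])
beta, *_ = np.linalg.lstsq(design, Y, rcond=None)
print(f"prognostic-score-adjusted effect: {beta[1]:.3f}")
```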
- Performance metrics for intervention-triggering prediction models do not reflect an expected reduction in outcomes from using the model [71.9860741092209]
Clinical researchers often select among and evaluate risk prediction models.
Standard metrics calculated from retrospective data are only related to model utility under certain assumptions.
When predictions are delivered repeatedly throughout time, the relationship between standard metrics and utility is further complicated.
arXiv Detail & Related papers (2020-06-02T16:26:49Z)
- Machine learning for causal inference: on the use of cross-fit estimators [77.34726150561087]
Doubly-robust cross-fit estimators have been proposed to yield better statistical properties.
We conducted a simulation study to assess the performance of several estimators for the average causal effect (ACE).
When used with machine learning, the doubly-robust cross-fit estimators substantially outperformed all of the other estimators in terms of bias, variance, and confidence interval coverage.
arXiv Detail & Related papers (2020-04-21T23:09:55Z)
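A minimal sketch of one member of the estimator class this entry studies: a cross-fit augmented inverse-propensity-weighted (doubly-robust) estimate of the ACE with machine-learning nuisance models. The data and learners are illustrative assumptions, not the paper's simulation design.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor
from sklearn.model_selection import KFold

rng = np.random.default_rng(5)

# Illustrative confounded data with a true ACE of 1.0.
n = 10000
X = rng.normal(size=(n, 4))
T = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))
Y = X[:, 0] + X[:, 1] + 1.0 * T + rng.normal(size=n)

# Cross-fitting: every unit's nuisance estimates come from models trained
# on the complementary fold, decoupling estimation and prediction errors.
e_hat, mu0, mu1 = np.empty(n), np.empty(n), np.empty(n)
for train, test in KFold(n_splits=2, shuffle=True, random_state=0).split(X):
    ps = GradientBoostingClassifier().fit(X[train], T[train])
    e_hat[test] = np.clip(ps.predict_proba(X[test])[:, 1], 0.01, 0.99)
    t0, t1 = train[T[train] == 0], train[T[train] == 1]
    mu0[test] = GradientBoostingRegressor().fit(X[t0], Y[t0]).predict(X[test])
    mu1[test] = GradientBoostingRegressor().fit(X[t1], Y[t1]).predict(X[test])

# AIPW: outcome-model contrast plus inverse-propensity residual corrections.
psi = (mu1 - mu0
       + T * (Y - mu1) / e_hat
       - (1 - T) * (Y - mu0) / (1 - e_hat))
print(f"cross-fit AIPW estimate of the ACE: {psi.mean():.3f}")
```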
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.