Choosing a Proxy Metric from Past Experiments
- URL: http://arxiv.org/abs/2309.07893v2
- Date: Sat, 15 Jun 2024 19:56:22 GMT
- Title: Choosing a Proxy Metric from Past Experiments
- Authors: Nilesh Tripuraneni, Lee Richardson, Alexander D'Amour, Jacopo Soriano, Steve Yadlowsky,
- Abstract summary: In many randomized experiments, the treatment effect of the long-term metric is often difficult or infeasible to measure.
A common alternative is to measure several short-term proxy metrics in the hope they closely track the long-term metric.
We introduce a new statistical framework to both define and construct an optimal proxy metric for use in a homogeneous population of randomized experiments.
- Score: 54.338884612982405
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In many randomized experiments, the treatment effect of the long-term metric (i.e. the primary outcome of interest) is often difficult or infeasible to measure. Such long-term metrics are often slow to react to changes and sufficiently noisy they are challenging to faithfully estimate in short-horizon experiments. A common alternative is to measure several short-term proxy metrics in the hope they closely track the long-term metric -- so they can be used to effectively guide decision-making in the near-term. We introduce a new statistical framework to both define and construct an optimal proxy metric for use in a homogeneous population of randomized experiments. Our procedure first reduces the construction of an optimal proxy metric in a given experiment to a portfolio optimization problem which depends on the true latent treatment effects and noise level of experiment under consideration. We then denoise the observed treatment effects of the long-term metric and a set of proxies in a historical corpus of randomized experiments to extract estimates of the latent treatment effects for use in the optimization problem. One key insight derived from our approach is that the optimal proxy metric for a given experiment is not apriori fixed; rather it should depend on the sample size (or effective noise level) of the randomized experiment for which it is deployed. To instantiate and evaluate our framework, we employ our methodology in a large corpus of randomized experiments from an industrial recommendation system and construct proxy metrics that perform favorably relative to several baselines.
Related papers
- Experimenting on Markov Decision Processes with Local Treatments [13.182388658918502]
We investigate the randomized experiments within dynamical systems modeled as Markov Decision Processes (MDPs)
Our goal is to assess the impact of treatment and control policies on long-term cumulative rewards from relatively short-term observations.
arXiv Detail & Related papers (2024-07-29T00:41:11Z) - Adaptive Experimentation When You Can't Experiment [55.86593195947978]
This paper introduces the emphconfounded pure exploration transductive linear bandit (textttCPET-LB) problem.
Online services can employ a properly randomized encouragement that incentivizes users toward a specific treatment.
arXiv Detail & Related papers (2024-06-15T20:54:48Z) - Adaptive Instrument Design for Indirect Experiments [48.815194906471405]
Unlike RCTs, indirect experiments estimate treatment effects by leveragingconditional instrumental variables.
In this paper we take the initial steps towards enhancing sample efficiency for indirect experiments by adaptively designing a data collection policy.
Our main contribution is a practical computational procedure that utilizes influence functions to search for an optimal data collection policy.
arXiv Detail & Related papers (2023-12-05T02:38:04Z) - Pareto optimal proxy metrics [62.997667081978825]
We show that proxy metrics are eight times more sensitive than the north star and consistently moved in the same direction.
We apply our methodology to experiments from a large industrial recommendation system.
arXiv Detail & Related papers (2023-07-03T13:29:14Z) - A Reinforcement Learning Approach to Estimating Long-term Treatment
Effects [13.371851720834918]
A limitation with randomized experiments is that they do not easily extend to measure long-term effects.
We take a reinforcement learning (RL) approach that estimates the average reward in a Markov process.
Motivated by real-world scenarios where the observed state transition is nonstationary, we develop a new algorithm for a class of nonstationary problems.
arXiv Detail & Related papers (2022-10-14T05:33:19Z) - Local Evaluation of Time Series Anomaly Detection Algorithms [9.717823994163277]
We show that an adversary algorithm can reach high precision and recall on almost any dataset under weak assumption.
We propose a theoretically grounded, robust, parameter-free and interpretable extension to precision/recall metrics.
arXiv Detail & Related papers (2022-06-27T10:18:41Z) - Partial Identification with Noisy Covariates: A Robust Optimization
Approach [94.10051154390237]
Causal inference from observational datasets often relies on measuring and adjusting for covariates.
We show that this robust optimization approach can extend a wide range of causal adjustment methods to perform partial identification.
Across synthetic and real datasets, we find that this approach provides ATE bounds with a higher coverage probability than existing methods.
arXiv Detail & Related papers (2022-02-22T04:24:26Z) - Adaptive neighborhood Metric learning [184.95321334661898]
We propose a novel distance metric learning algorithm, named adaptive neighborhood metric learning (ANML)
ANML can be used to learn both the linear and deep embeddings.
The emphlog-exp mean function proposed in our method gives a new perspective to review the deep metric learning methods.
arXiv Detail & Related papers (2022-01-20T17:26:37Z) - Efficient Adaptive Experimental Design for Average Treatment Effect
Estimation [18.027128141189355]
We propose an algorithm for efficient experiments with estimators constructed from dependent samples.
To justify our proposed approach, we provide finite and infinite sample analyses.
arXiv Detail & Related papers (2020-02-13T02:04:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.