A Reinforcement Learning Approach to Estimating Long-term Treatment
Effects
- URL: http://arxiv.org/abs/2210.07536v1
- Date: Fri, 14 Oct 2022 05:33:19 GMT
- Title: A Reinforcement Learning Approach to Estimating Long-term Treatment
Effects
- Authors: Ziyang Tang, Yiheng Duan, Stephanie Zhang, Lihong Li
- Abstract summary: A limitation with randomized experiments is that they do not easily extend to measure long-term effects.
We take a reinforcement learning (RL) approach that estimates the average reward in a Markov process.
Motivated by real-world scenarios where the observed state transition is nonstationary, we develop a new algorithm for a class of nonstationary problems.
- Score: 13.371851720834918
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Randomized experiments (a.k.a. A/B tests) are a powerful tool for estimating
treatment effects, to inform decision making in business, healthcare and other
applications. In many problems, the treatment has a lasting effect that evolves
over time. A limitation with randomized experiments is that they do not easily
extend to measure long-term effects, since running long experiments is
time-consuming and expensive. In this paper, we take a reinforcement learning
(RL) approach that estimates the average reward in a Markov process. Motivated
by real-world scenarios where the observed state transition is nonstationary,
we develop a new algorithm for a class of nonstationary problems, and
demonstrate promising results in two synthetic datasets and one online store
dataset.
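
To make the phrase "average reward in a Markov process" concrete, here is a minimal sketch in Python. It is not the paper's algorithm: it assumes a stationary, fully known two-state chain, whereas the paper targets nonstationary transitions estimated from data. The transition matrices `P_control` and `P_treatment`, the reward vector `r`, and the helper names are all hypothetical and chosen only for illustration. Under these assumptions, the long-term treatment effect is the difference between the long-run average rewards of the two chains.

```python
def stationary_distribution(P, iters=10_000, tol=1e-12):
    """Approximate the stationary distribution of a row-stochastic
    matrix P (list of lists) by repeatedly applying pi <- pi P.
    Assumes the chain is ergodic so the iteration converges."""
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iters):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


def average_reward(P, r):
    """Long-run average reward: sum over states of pi(s) * r(s)."""
    pi = stationary_distribution(P)
    return sum(p * x for p, x in zip(pi, r))


# Hypothetical example: state 0 = inactive user, state 1 = active user.
P_control = [[0.9, 0.1],
             [0.5, 0.5]]
P_treatment = [[0.8, 0.2],
               [0.4, 0.6]]
r = [0.0, 1.0]  # reward of 1 whenever the user is active

avg_control = average_reward(P_control, r)
avg_treatment = average_reward(P_treatment, r)

# Long-term effect = difference in long-run average reward.
long_term_effect = avg_treatment - avg_control
print(avg_control, avg_treatment, long_term_effect)
```

In this toy setting a short experiment would see mostly transient behavior, while the stationary distributions capture where each chain settles in the long run. That gap is the reason the abstract frames the problem in terms of average reward.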
Related papers
- Experimenting on Markov Decision Processes with Local Treatments [13.182388658918502]
We investigate randomized experiments within dynamical systems modeled as Markov Decision Processes (MDPs).
Our goal is to assess the impact of treatment and control policies on long-term cumulative rewards from relatively short-term observations.
arXiv Detail & Related papers (2024-07-29T00:41:11Z)
- Choosing a Proxy Metric from Past Experiments [54.338884612982405]
In many randomized experiments, the treatment effect of the long-term metric is often difficult or infeasible to measure.
A common alternative is to measure several short-term proxy metrics in the hope they closely track the long-term metric.
We introduce a new statistical framework to both define and construct an optimal proxy metric for use in a homogeneous population of randomized experiments.
arXiv Detail & Related papers (2023-09-14T17:43:02Z)
- Accounting For Informative Sampling When Learning to Forecast Treatment Outcomes Over Time [66.08455276899578]
We show that informative sampling can prohibit accurate estimation of treatment outcomes if not properly accounted for.
We present a general framework for learning treatment outcomes in the presence of informative sampling using inverse intensity-weighting.
We propose a novel method, TESAR-CDE, that instantiates this framework using Neural CDEs.
arXiv Detail & Related papers (2023-06-07T08:51:06Z)
- B-Learner: Quasi-Oracle Bounds on Heterogeneous Causal Effects Under Hidden Confounding [51.74479522965712]
We propose a meta-learner called the B-Learner, which can efficiently learn sharp bounds on the CATE function under limits on hidden confounding.
We prove its estimates are valid, sharp, efficient, and have a quasi-oracle property with respect to the constituent estimators under more general conditions than existing methods.
arXiv Detail & Related papers (2023-04-20T18:07:19Z)
- Estimating long-term causal effects from short-term experiments and long-term observational data with unobserved confounding [5.854757988966379]
We study the identification and estimation of long-term treatment effects when both experimental and observational data are available.
Our long-term causal effect estimator is obtained by combining regression residuals with short-term experimental outcomes.
arXiv Detail & Related papers (2023-02-21T12:22:47Z)
- Long-term Causal Inference Under Persistent Confounding via Data Combination [38.026740610259225]
We study the identification and estimation of long-term treatment effects when both experimental and observational data are available.
Since the long-term outcome is observed only after a long delay, it is not measured in the experimental data, but only recorded in the observational data.
arXiv Detail & Related papers (2022-02-15T07:44:20Z)
- SurvITE: Learning Heterogeneous Treatment Effects from Time-to-Event Data [83.50281440043241]
We study the problem of inferring heterogeneous treatment effects from time-to-event data.
We propose a novel deep learning method for treatment-specific hazard estimation based on balancing representations.
arXiv Detail & Related papers (2021-10-26T20:13:17Z)
- Counterfactual Propagation for Semi-Supervised Individual Treatment Effect Estimation [21.285425135761795]
Individual treatment effect (ITE) represents the expected improvement in the outcome of taking a particular action to a particular target.
In this study, we consider a semi-supervised ITE estimation problem that exploits more easily-available unlabeled instances.
We propose counterfactual propagation, which is the first semi-supervised ITE estimation method.
arXiv Detail & Related papers (2020-05-11T13:32:38Z)
- Dynamic Causal Effects Evaluation in A/B Testing with a Reinforcement Learning Framework [68.96770035057716]
A/B testing is a business strategy to compare a new product with an old one in pharmaceutical, technological, and traditional industries.
This paper introduces a reinforcement learning framework for carrying out A/B testing in online experiments.
arXiv Detail & Related papers (2020-02-05T10:25:02Z)
- Generalization Bounds and Representation Learning for Estimation of Potential Outcomes and Causal Effects [61.03579766573421]
We study estimation of individual-level causal effects, such as a single patient's response to alternative medication.
We devise representation learning algorithms that minimize our bound, by regularizing the representation's induced treatment group distance.
We extend these algorithms to simultaneously learn a weighted representation to further reduce treatment group distances.
arXiv Detail & Related papers (2020-01-21T10:16:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.