Dynamic Causal Effects Evaluation in A/B Testing with a Reinforcement
Learning Framework
- URL: http://arxiv.org/abs/2002.01711v6
- Date: Thu, 3 Nov 2022 15:47:27 GMT
- Title: Dynamic Causal Effects Evaluation in A/B Testing with a Reinforcement
Learning Framework
- Authors: Chengchun Shi, Xiaoyu Wang, Shikai Luo, Hongtu Zhu, Jieping Ye, Rui
Song
- Abstract summary: A/B testing is a standard business strategy for comparing a new product with an old one in pharmaceutical, technological, and traditional industries.
This paper introduces a reinforcement learning framework for carrying out A/B testing in online experiments.
- Score: 68.96770035057716
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A/B testing, or online experimentation, is a standard business strategy to
compare a new product with an old one in pharmaceutical, technological, and traditional
industries. Major challenges arise in online experiments of two-sided
marketplace platforms (e.g., Uber) where there is only one unit that receives a
sequence of treatments over time. In those experiments, the treatment at a
given time impacts the current outcome as well as future outcomes. The aim of this
paper is to introduce a reinforcement learning framework for carrying out A/B
testing in these experiments, while characterizing the long-term treatment
effects. Our proposed testing procedure allows for sequential monitoring and
online updating. It is generally applicable to a variety of treatment designs
in different industries. In addition, we systematically investigate the
theoretical properties (e.g., size and power) of our testing procedure.
Finally, we apply our framework to both simulated data and a real-world data
example obtained from a technological company to illustrate its advantage over
the current practice. A Python implementation of our test is available at
https://github.com/callmespring/CausalRL.
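The abstract frames long-term treatment effects as properties of a Markov process rather than a single immediate outcome. A minimal, hypothetical sketch of that idea: compare the long-run average reward of a two-state Markov chain under control versus treatment dynamics. The transition matrices `P_A`/`P_B` and reward vector `r` below are made-up illustrative numbers, and this is not the paper's actual estimator or test.

```python
# Hypothetical sketch: long-run average reward under two treatment policies
# in a two-state Markov chain. All numbers are illustrative only.

def stationary_distribution(P, iters=10_000):
    """Power-iterate a row-stochastic matrix to its stationary distribution."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

def long_run_average_reward(P, r):
    """Average reward under the stationary distribution: sum_s pi(s) * r(s)."""
    pi = stationary_distribution(P)
    return sum(p_s * r_s for p_s, r_s in zip(pi, r))

# Transition dynamics under control (A) and treatment (B); r is the
# per-state reward (e.g., revenue per time step).
P_A = [[0.9, 0.1], [0.5, 0.5]]
P_B = [[0.7, 0.3], [0.2, 0.8]]
r = [1.0, 2.0]

# The long-term effect is the gap between the two long-run averages,
# which a myopic comparison of immediate outcomes would miss.
ate_long_run = long_run_average_reward(P_B, r) - long_run_average_reward(P_A, r)
print(f"estimated long-term effect: {ate_long_run:.3f}")
```

In this framing, the treatment changes the transition dynamics, so its full effect only shows up in the stationary behavior of the chain, not in any single time step.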
Related papers
- Adaptive Experimentation When You Can't Experiment [55.86593195947978]
This paper introduces the confounded pure exploration transductive linear bandit (CPET-LB) problem.
Online services can employ a properly randomized encouragement that incentivizes users toward a specific treatment.
arXiv Detail & Related papers (2024-06-15T20:54:48Z)
- Active Test-Time Adaptation: Theoretical Analyses and An Algorithm [51.84691955495693]
Test-time adaptation (TTA) addresses distribution shifts for streaming test data in unsupervised settings.
We propose the novel problem setting of active test-time adaptation (ATTA) that integrates active learning within the fully TTA setting.
arXiv Detail & Related papers (2024-04-07T22:31:34Z)
- Deep anytime-valid hypothesis testing [29.273915933729057]
We propose a general framework for constructing powerful, sequential hypothesis tests for nonparametric testing problems.
We develop a principled approach of leveraging the representation capability of machine learning models within the testing-by-betting framework.
Empirical results on synthetic and real-world datasets demonstrate that tests instantiated using our general framework are competitive against specialized baselines.
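The testing-by-betting idea behind such anytime-valid tests can be sketched with a generic toy example: bet a fixed fraction of a wealth process on each observation and reject the null once wealth exceeds 1/alpha, which Ville's inequality makes valid at any stopping time. The fair-coin null, the fixed bet fraction, and all numbers below are illustrative assumptions, not the paper's specific machine-learning-based construction.

```python
# Hypothetical testing-by-betting sketch: anytime-valid test of a fair coin.
# Under H0 (p = 0.5) the wealth process is a nonnegative martingale, so
# P(wealth ever >= 1/alpha) <= alpha by Ville's inequality.
import random

random.seed(0)
alpha = 0.05
wealth = 1.0
bet = 0.2          # fixed fraction wagered on "heads" each round (a design choice)
rejected_at = None

for t in range(1, 2001):
    x = 1 if random.random() < 0.7 else 0   # data from a biased coin; H0 says p = 0.5
    wealth *= 1 + bet * (2 * x - 1)         # gain 20% on heads, lose 20% on tails
    if wealth >= 1 / alpha:                 # reject as soon as wealth crosses 1/alpha
        rejected_at = t
        break

print(f"rejected at round {rejected_at}, wealth {wealth:.2f}")
```

The framework summarized above replaces the fixed bet with a bet chosen by a learned model, so the wealth grows fastest in the directions where the model detects a departure from the null; the sketch keeps only the betting-and-threshold skeleton.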
arXiv Detail & Related papers (2023-10-30T09:46:19Z)
- A Common Misassumption in Online Experiments with Machine Learning Models [1.52292571922932]
We argue that, because variants typically learn using pooled data, a lack of model interference cannot be guaranteed.
We discuss the implications this has for practitioners, and for the research literature.
arXiv Detail & Related papers (2023-04-21T11:36:44Z)
- SPOT: Sequential Predictive Modeling of Clinical Trial Outcome with Meta-Learning [67.8195828626489]
Clinical trials are essential to drug development but time-consuming, costly, and prone to failure.
We propose Sequential Predictive mOdeling of clinical Trial outcome (SPOT) that first identifies trial topics to cluster the multi-sourced trial data into relevant trial topics.
Treating each trial sequence as a task, it uses a meta-learning strategy so that the model can rapidly adapt to new tasks with minimal updates.
arXiv Detail & Related papers (2023-04-07T23:04:27Z)
- Experimentation Platforms Meet Reinforcement Learning: Bayesian Sequential Decision-Making for Continuous Monitoring [13.62951379287041]
In this paper, we introduce a novel framework developed at Amazon to maximize customer experience and control opportunity cost.
We formulate the problem as a Bayesian optimal sequential decision making problem that has a unified utility function.
We show the effectiveness of this novel approach compared with existing methods via a large-scale meta-analysis on experiments in Amazon.
arXiv Detail & Related papers (2023-04-02T00:59:10Z)
- Fair Effect Attribution in Parallel Online Experiments [57.13281584606437]
A/B tests serve the purpose of reliably identifying the effect of changes introduced in online services.
It is common for online platforms to run a large number of simultaneous experiments by splitting incoming user traffic randomly.
Despite perfect randomization between groups, simultaneous experiments can interact with each other and negatively impact average population outcomes.
arXiv Detail & Related papers (2022-10-15T17:15:51Z)
- A Reinforcement Learning Approach to Estimating Long-term Treatment Effects [13.371851720834918]
A limitation with randomized experiments is that they do not easily extend to measure long-term effects.
We take a reinforcement learning (RL) approach that estimates the average reward in a Markov process.
Motivated by real-world scenarios where the observed state transition is nonstationary, we develop a new algorithm for a class of nonstationary problems.
arXiv Detail & Related papers (2022-10-14T05:33:19Z)
- Benchopt: Reproducible, efficient and collaborative optimization benchmarks [67.29240500171532]
Benchopt is a framework to automate, reproduce and publish optimization benchmarks in machine learning.
Benchopt simplifies benchmarking for the community by providing an off-the-shelf tool for running, sharing and extending experiments.
arXiv Detail & Related papers (2022-06-27T16:19:24Z)
- Towards Continuous Compounding Effects and Agile Practices in Educational Experimentation [2.7094829962573304]
This paper defines a framework for categorising different experimental processes.
The next generation of education technology successes will be heralded by embracing the full set of processes.
arXiv Detail & Related papers (2021-11-17T13:10:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.