Clustering-based Imputation for Dropout Buyers in Large-scale Online
Experimentation
- URL: http://arxiv.org/abs/2209.06125v3
- Date: Fri, 7 Apr 2023 15:21:43 GMT
- Title: Clustering-based Imputation for Dropout Buyers in Large-scale Online
Experimentation
- Authors: Sumin Shen, Huiying Mao, Zezhong Zhang, Zili Chen, Keyu Nie, Xinwei
Deng
- Abstract summary: In online experimentation, appropriate metrics (e.g., purchase) provide strong evidence to support hypotheses and enhance the decision-making process.
In this work, we introduce the concept of dropout buyers and categorize users with incomplete metric values into two groups: visitors and dropout buyers.
For the analysis of incomplete metrics, we propose a clustering-based imputation method using $k$-nearest neighbors.
- Score: 4.753069295451989
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In online experimentation, appropriate metrics (e.g., purchase) provide
strong evidence to support hypotheses and enhance the decision-making process.
However, incomplete metrics occur frequently in online
experimentation, leaving far less data available than planned for
online experiments (e.g., A/B testing). In this work, we introduce the concept
of dropout buyers and categorize users with incomplete metric values into two
groups: visitors and dropout buyers. For the analysis of incomplete metrics, we
propose a clustering-based imputation method using $k$-nearest neighbors. Our
proposed imputation method considers both the experiment-specific features and
users' activities along their shopping paths, allowing different imputation
values for different users. To facilitate efficient imputation of large-scale
data sets in online experimentation, the proposed method uses a combination of
stratification and clustering. The performance of the proposed method is
compared to several conventional methods in both simulation studies and a real
online experiment at eBay.
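
The abstract names the ingredients of the method (stratification, clustering, and $k$-nearest neighbors) but not the exact procedure. The following is a minimal sketch under stated assumptions, not the authors' implementation: users are stratified by a label, complete-case users in each stratum are clustered with a plain k-means, and each dropout buyer receives the average metric of its `k` nearest cluster centroids. The function names, the `strata` argument, the use of k-means, and the unweighted averaging are all hypothetical choices made for illustration.

```python
import numpy as np


def kmeans(X, n_clusters, n_iter=20, seed=0):
    """Plain Lloyd's k-means (illustrative helper); returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(X), size=n_clusters, replace=False)
    centroids = X[start].astype(float)
    for _ in range(n_iter):
        # Assign each point to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centroids; an empty cluster keeps its old centroid.
        for c in range(n_clusters):
            if np.any(labels == c):
                centroids[c] = X[labels == c].mean(axis=0)
    # Final assignment so that labels match the returned centroids.
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return centroids, dists.argmin(axis=1)


def impute_dropout_buyers(features, metric, strata, n_clusters=3, k=2):
    """Fill NaN entries of `metric` (assumed to mark dropout buyers).

    features: (n, d) float array of user features (assumption).
    metric:   (n,) float array, NaN where the metric is incomplete.
    strata:   (n,) array of stratum labels (assumption).
    """
    imputed = metric.astype(float).copy()
    for s in np.unique(strata):
        in_stratum = strata == s
        observed = in_stratum & ~np.isnan(metric)
        missing = in_stratum & np.isnan(metric)
        if not missing.any() or not observed.any():
            continue  # nothing to impute, or nothing to learn from
        n_c = min(n_clusters, int(observed.sum()))
        centroids, labels = kmeans(features[observed], n_c)
        # Keep only non-empty clusters and compute their mean metric.
        non_empty = np.flatnonzero(np.bincount(labels, minlength=n_c) > 0)
        obs_metric = metric[observed]
        cluster_means = np.array([obs_metric[labels == c].mean() for c in non_empty])
        centroids = centroids[non_empty]
        # k nearest centroids for each dropout buyer, then average their means.
        kk = min(k, len(centroids))
        d = np.linalg.norm(
            features[missing][:, None, :] - centroids[None, :, :], axis=2
        )
        nearest = np.argsort(d, axis=1)[:, :kk]
        imputed[missing] = cluster_means[nearest].mean(axis=1)
    return imputed
```

Searching over cluster centroids rather than individual users keeps the neighbor search small within each stratum, which is one plausible reading of how stratification and clustering make imputation efficient at scale.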
Related papers
- Data Distribution Valuation [56.71023681599737]
Existing data valuation methods define a value for a discrete dataset.
In many use cases, users are interested in not only the value of the dataset, but that of the distribution from which the dataset was sampled.
We propose a maximum mean discrepancy (MMD)-based valuation method which enables theoretically principled and actionable policies.
arXiv Detail & Related papers (2024-10-06T07:56:53Z)
- Powerful A/B-Testing Metrics and Where to Find Them [11.018341970786574]
A/B-tests are the bread and butter of real-world recommender system evaluation.
A North Star metric is used to assess which system variant should be deemed superior.
We propose to collect this information and leverage it to quantify type-I, type-II, and type-III errors for the metrics of interest.
We present results and insights from building this pipeline at scale for two large-scale short-video platforms: ShareChat and Moj.
arXiv Detail & Related papers (2024-07-30T08:59:50Z)
- Adaptive Experimentation When You Can't Experiment [55.86593195947978]
This paper introduces the confounded pure exploration transductive linear bandit (CPET-LB) problem.
Online services can employ a properly randomized encouragement that incentivizes users toward a specific treatment.
arXiv Detail & Related papers (2024-06-15T20:54:48Z)
- How to Leverage Diverse Demonstrations in Offline Imitation Learning [39.24627312800116]
Offline Imitation Learning (IL) with imperfect demonstrations has garnered increasing attention owing to the scarcity of expert data.
We introduce a simple yet effective data selection method that identifies positive behaviors based on their resultant states.
We then devise a lightweight behavior cloning algorithm capable of leveraging the expert and selected data correctly.
arXiv Detail & Related papers (2024-05-24T04:56:39Z)
- Variance Reduction in Ratio Metrics for Efficient Online Experiments [12.036747050794135]
We apply variance reduction techniques to ratio metrics on a large-scale short-video platform: ShareChat.
Our results show that we can either improve A/B-test confidence in 77% of cases, or can retain the same level of confidence with 30% fewer data points.
arXiv Detail & Related papers (2024-01-08T18:01:09Z)
- Effect Size Estimation for Duration Recommendation in Online Experiments: Leveraging Hierarchical Models and Objective Utility Approaches [13.504353263032359]
The selection of the assumed effect size (AES) critically determines the duration of an experiment, and hence its accuracy and efficiency.
Traditionally, experimenters determine AES based on domain knowledge, but this method becomes impractical for online experimentation services managing numerous experiments.
We propose two solutions for data-driven AES selection for online experimentation services.
arXiv Detail & Related papers (2023-12-20T09:34:28Z)
- Choosing a Proxy Metric from Past Experiments [54.338884612982405]
In many randomized experiments, the treatment effect of the long-term metric is often difficult or infeasible to measure.
A common alternative is to measure several short-term proxy metrics in the hope they closely track the long-term metric.
We introduce a new statistical framework to both define and construct an optimal proxy metric for use in a homogeneous population of randomized experiments.
arXiv Detail & Related papers (2023-09-14T17:43:02Z)
- Fair Effect Attribution in Parallel Online Experiments [57.13281584606437]
A/B tests serve the purpose of reliably identifying the effect of changes introduced in online services.
It is common for online platforms to run a large number of simultaneous experiments by splitting incoming user traffic randomly.
Despite a perfect randomization between different groups, simultaneous experiments can interact with each other and create a negative impact on average population outcomes.
arXiv Detail & Related papers (2022-10-15T17:15:51Z)
- A Recommendation Approach based on Similarity-Popularity Models of Complex Networks [1.385805101975528]
This work proposes a novel recommendation method based on complex networks generated by a similarity-popularity model to predict ones.
We first construct a model of a network having users and items as nodes from observed ratings and then use it to predict unseen ratings.
The proposed approach is implemented and experimentally compared against baseline and state-of-the-art recommendation methods on 21 datasets from various domains.
arXiv Detail & Related papers (2022-09-29T11:00:06Z)
- Scalable Personalised Item Ranking through Parametric Density Estimation [53.44830012414444]
Learning from implicit feedback is challenging because of the difficult nature of the one-class problem.
Most conventional methods use a pairwise ranking approach and negative samplers to cope with the one-class problem.
We propose a learning-to-rank approach, which achieves convergence speed comparable to the pointwise counterpart.
arXiv Detail & Related papers (2021-05-11T03:38:16Z)
- Towards Model-Agnostic Post-Hoc Adjustment for Balancing Ranking Fairness and Algorithm Utility [54.179859639868646]
Bipartite ranking aims to learn a scoring function that ranks positive individuals higher than negative ones from labeled data.
There have been rising concerns on whether the learned scoring function can cause systematic disparity across different protected groups.
We propose a model post-processing framework for balancing them in the bipartite ranking scenario.
arXiv Detail & Related papers (2020-06-15T10:08:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.