Related papers: Towards Reliable Social A/B Testing: Spillover-Contained Clustering with Robust Post-Experiment Analysis

URL: http://arxiv.org/abs/2602.08569v1
Date: Mon, 09 Feb 2026 12:08:29 GMT
Title: Towards Reliable Social A/B Testing: Spillover-Contained Clustering with Robust Post-Experiment Analysis
Authors: Xu Min, Zhaoxu Yang, Kaixuan Tan, Juan Yan, Xunbin Xiong, Zihao Zhu, Kaiyu Zhu, Fenglin Cui, Yang Yang, Sihua Yang, Jianhui Bu,
Abstract summary: A/B testing is the foundation of decision-making in online platforms, but social products often suffer from network interference. We propose a spillover-contained experimentation framework with two stages. We validate our approach through large-scale social sharing experiments on Kuaishou, a platform serving hundreds of millions of users.
Score: 11.30339991179317
License: http://creativecommons.org/licenses/by/4.0/
Abstract: A/B testing is the foundation of decision-making in online platforms, yet social products often suffer from network interference: user interactions cause treatment effects to spill over into the control group. Such spillovers bias causal estimates and undermine experimental conclusions. Existing approaches face key limitations: user-level randomization ignores network structure, while cluster-based methods often rely on general-purpose clustering that is not tailored for spillover containment and has difficulty balancing unbiasedness and statistical power at scale. We propose a spillover-contained experimentation framework with two stages. In the pre-experiment stage, we build social interaction graphs and introduce a Balanced Louvain algorithm that produces stable, size-balanced clusters while minimizing cross-cluster edges, enabling reliable cluster-based randomization. In the post-experiment stage, we develop a tailored CUPAC estimator that leverages pre-experiment behavioral covariates to reduce the variance induced by cluster-level assignment, thereby improving statistical power. Together, these components provide both structural spillover containment and robust statistical inference. We validate our approach through large-scale social sharing experiments on Kuaishou, a platform serving hundreds of millions of users. Results show that our method substantially reduces spillover and yields more accurate assessments of social strategies than traditional user-level designs, establishing a reliable and scalable framework for networked A/B testing.
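The two stages described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's Balanced Louvain algorithm or its tailored CUPAC estimator: it assumes cluster labels are already given, measures spillover exposure as the share of edges that cross clusters, randomizes whole clusters, and applies a plain CUPED-style linear adjustment in which a pre-experiment cluster metric stands in for the model-based prediction CUPAC would use. All function names and the simulated data are hypothetical.

```python
import numpy as np


def cross_cluster_edge_fraction(edges, cluster_of):
    """Share of edges whose endpoints lie in different clusters.

    A lower value means fewer interactions can carry treatment
    effects from one cluster into another.
    """
    edges = np.asarray(edges)
    labels = np.asarray(cluster_of)
    return float(np.mean(labels[edges[:, 0]] != labels[edges[:, 1]]))


def assign_clusters(n_clusters, rng):
    """Randomize whole clusters: half to treatment (1), the rest to control (0)."""
    arm = np.zeros(n_clusters, dtype=int)
    arm[rng.permutation(n_clusters)[: n_clusters // 2]] = 1
    return arm


def adjusted_effect(y, x, arm):
    """Difference in cluster-level means after a CUPED-style adjustment.

    y:   post-experiment outcome per cluster
    x:   pre-experiment covariate per cluster (assumed stand-in for a
         CUPAC prediction)
    arm: 0/1 treatment indicator per cluster
    """
    theta = np.cov(y, x)[0, 1] / np.var(x, ddof=1)
    y_adj = y - theta * (x - x.mean())
    return y_adj[arm == 1].mean() - y_adj[arm == 0].mean()


if __name__ == "__main__":
    # Toy graph: six users in two clusters, one edge crossing between them.
    edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
    print(cross_cluster_edge_fraction(edges, [0, 0, 0, 1, 1, 1]))

    # Simulated cluster-level data with a true effect of 0.5 (assumed).
    rng = np.random.default_rng(0)
    arm = assign_clusters(200, rng)
    x = rng.normal(size=200)
    y = 2.0 * x + 0.5 * arm + rng.normal(scale=0.1, size=200)
    raw = y[arm == 1].mean() - y[arm == 0].mean()
    print(raw, adjusted_effect(y, x, arm))
```

In this simulation the outcome depends strongly on the pre-experiment covariate, so the adjusted estimate should vary much less across random assignments than the raw difference in means, which is the motivation the abstract gives for covariate-based variance reduction under cluster-level assignment.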

Related papers

Empirical Likelihood-Based Fairness Auditing: Distribution-Free Certification and Flagging [18.71249153088185]
Machine learning models in high-stakes applications, such as recidivism prediction and automated personnel selection, often exhibit systematic performance disparities. We propose a novel empirical likelihood-based (EL) framework that constructs robust statistical measures for model performance disparities.
arXiv Detail & Related papers (2026-01-28T05:36:19Z)
Hierarchical Clustering With Confidence [6.479319856992936]
Agglomerative hierarchical clustering is highly sensitive to small perturbations in the data. We show how randomizing hierarchical clustering can be useful not just for measuring stability but also for designing valid hypothesis testing procedures.
arXiv Detail & Related papers (2025-12-06T18:18:20Z)
Can We Validate Counterfactual Estimations in the Presence of General Network Interference? [13.49152464081862]
We introduce a framework that facilitates the use of machine learning tools for both estimation and validation in causal inference. A new distribution-preserving network bootstrap generates statistically valid subpopulations from a single experiment's data. A counterfactual cross-validation procedure adapts the principles of model validation to the unique constraints of causal settings.
arXiv Detail & Related papers (2025-02-03T06:51:04Z)
Adaptive Experimentation When You Can't Experiment [55.86593195947978]
This paper introduces the confounded pure exploration transductive linear bandit (CPET-LB) problem. Online services can employ a properly randomized encouragement that incentivizes users toward a specific treatment.
arXiv Detail & Related papers (2024-06-15T20:54:48Z)
Tackling Diverse Minorities in Imbalanced Classification [80.78227787608714]
Imbalanced datasets are commonly observed in various real-world applications, presenting significant challenges in training classifiers. We propose generating synthetic samples iteratively by mixing data samples from both minority and majority classes. We demonstrate the effectiveness of our proposed framework through extensive experiments conducted on seven publicly available benchmark datasets.
arXiv Detail & Related papers (2023-08-28T18:48:34Z)
Cluster-guided Contrastive Graph Clustering Network [53.16233290797777]
We propose a Cluster-guided Contrastive deep Graph Clustering network (CCGC). We construct two views of the graph by designing special Siamese encoders whose weights are not shared between the sibling sub-networks. To construct semantically meaningful negative sample pairs, we regard the centers of different high-confidence clusters as negative samples.
arXiv Detail & Related papers (2023-01-03T13:42:38Z)
Fed-CBS: A Heterogeneity-Aware Client Sampling Mechanism for Federated Learning via Class-Imbalance Reduction [76.26710990597498]
We show that the class-imbalance of the grouped data from randomly selected clients can lead to significant performance degradation. Based on our key observation, we design an efficient client sampling mechanism, i.e., Federated Class-balanced Sampling (Fed-CBS). In particular, we propose a measure of class-imbalance and then employ homomorphic encryption to derive this measure in a privacy-preserving way.
arXiv Detail & Related papers (2022-09-30T05:42:56Z)
Learning from Heterogeneous Data Based on Social Interactions over Graphs [58.34060409467834]
This work proposes a decentralized architecture, where individual agents aim at solving a classification problem while observing streaming features of different dimensions. We show that the strategy enables the agents to learn consistently under this highly heterogeneous setting.
arXiv Detail & Related papers (2021-12-17T12:47:18Z)
Unsupervised Learning of Debiased Representations with Pseudo-Attributes [85.5691102676175]
We propose a simple but effective debiasing technique in an unsupervised manner. We perform clustering on the feature embedding space and identify pseudoattributes by taking advantage of the clustering results. We then employ a novel cluster-based reweighting scheme for learning debiased representation.
arXiv Detail & Related papers (2021-08-06T05:20:46Z)
Two-Stage TMLE to Reduce Bias and Improve Efficiency in Cluster Randomized Trials [0.0]
Cluster randomized trials (CRTs) randomly assign an intervention to groups of individuals, and measure outcomes on individuals in those groups. Findings are often missing for some individuals within clusters. CRTs often randomize limited numbers of clusters, resulting in chance imbalances on baseline outcome predictors between arms.
arXiv Detail & Related papers (2021-06-29T21:47:30Z)
Minimizing Interference and Selection Bias in Network Experiment Design [14.696233190562939]
We propose a principled framework for network experiment design which jointly minimizes interference and selection bias. Our experiments on a number of real-world datasets show that our proposed framework leads to significantly lower error in causal effect estimation.
arXiv Detail & Related papers (2020-04-15T17:34:13Z)

This list is automatically generated from the titles and abstracts of the papers on this site.