Causal Inference Struggles with Agency on Online Platforms
- URL: http://arxiv.org/abs/2107.08995v1
- Date: Mon, 19 Jul 2021 16:14:00 GMT
- Title: Causal Inference Struggles with Agency on Online Platforms
- Authors: Smitha Milli, Luca Belli, Moritz Hardt
- Abstract summary: We conduct four large-scale within-study comparisons on Twitter aimed at assessing the effectiveness of observational studies derived from user self-selection.
Our results suggest that observational studies derived from user self-selection are a poor alternative to randomized experimentation on online platforms.
- Score: 32.81856583026165
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Online platforms regularly conduct randomized experiments to understand how
changes to the platform causally affect various outcomes of interest. However,
experimentation on online platforms has been criticized for having, among other
issues, a lack of meaningful oversight and user consent. As platforms give
users greater agency, it becomes possible to conduct observational studies in
which users self-select into the treatment of interest as an alternative to
experiments in which the platform controls whether the user receives treatment
or not. In this paper, we conduct four large-scale within-study comparisons on
Twitter aimed at assessing the effectiveness of observational studies derived
from user self-selection on online platforms. In a within-study comparison,
treatment effects from an observational study are assessed based on how
effectively they replicate results from a randomized experiment with the same
target population. We test the naive difference in group means estimator, exact
matching, regression adjustment, and inverse probability of treatment weighting
while controlling for plausible confounding variables. In all cases, all
observational estimates perform poorly at recovering the ground-truth estimate
from the analogous randomized experiments. In all cases except one, the
observational estimates have the opposite sign of the randomized estimate. Our
results suggest that observational studies derived from user self-selection are
a poor alternative to randomized experimentation on online platforms. In
discussing our results, we postulate "Catch-22"s that suggest that the success
of causal inference in these settings may be at odds with the original
motivations for providing users with greater agency.
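The contrast the abstract draws can be illustrated on synthetic data. The sketch below (a toy simulation with made-up variables, not the paper's Twitter data) shows the naive difference in group means failing under self-selection on a single confounder, while inverse probability of treatment weighting recovers the true effect when the propensity is known:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical confounder: e.g. a user's baseline activity level.
x = rng.normal(size=n)

# Users self-select into treatment more often when x is high.
p_treat = 1 / (1 + np.exp(-2 * x))
t = rng.binomial(1, p_treat)

# Outcome depends on the confounder and a true treatment effect of +1.
y = 3 * x + t + rng.normal(size=n)

# Naive difference in group means: biased, because x drives both t and y.
naive = y[t == 1].mean() - y[t == 0].mean()

# Inverse probability of treatment weighting with the true propensity
# (in practice the propensity must itself be estimated, which is where
# the paper's observational estimators run into trouble).
iptw = (np.sum(t * y / p_treat) / np.sum(t / p_treat)
        - np.sum((1 - t) * y / (1 - p_treat)) / np.sum((1 - t) / (1 - p_treat)))

print(naive)  # far from the true effect of 1
print(iptw)   # close to the true effect of 1
```

Note that this is the best case for IPTW: the confounder is observed and the propensity model is exact. The paper's point is that on real platforms the relevant confounders are unlikely to be captured this cleanly.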
Related papers
- Causal Inference from Text: Unveiling Interactions between Variables [20.677407402398405]
Existing methods account only for confounding covariates that affect both treatment and outcome.
A bias arises from insufficient consideration of non-confounding covariates.
In this work, we aim to mitigate this bias by unveiling interactions between different variables.
arXiv Detail & Related papers (2023-11-09T11:29:44Z)
- A Double Machine Learning Approach to Combining Experimental and Observational Data [59.29868677652324]
We propose a double machine learning approach to combine experimental and observational studies.
Our framework tests for violations of external validity and ignorability under milder assumptions.
arXiv Detail & Related papers (2023-07-04T02:53:11Z)
- Fair Effect Attribution in Parallel Online Experiments [57.13281584606437]
A/B tests serve to reliably identify the effect of changes introduced in online services.
It is common for online platforms to run many simultaneous experiments by randomly splitting incoming user traffic.
Despite perfect randomization between groups, simultaneous experiments can interact with each other and negatively impact average population outcomes.
arXiv Detail & Related papers (2022-10-15T17:15:51Z)
- Avoiding Biased Clinical Machine Learning Model Performance Estimates in the Presence of Label Selection [3.3944964838781093]
We describe three classes of label selection and simulate five causally distinct scenarios to assess how particular selection mechanisms bias a suite of commonly reported binary machine learning model performance metrics.
We find that naive estimates of AUROC on the observed population undershoot actual performance by up to 20%.
Such a disparity could be large enough to lead to the wrongful termination of a successful clinical decision support tool.
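The label-selection effect this paper describes can be sketched on toy data (a simplified simulation, not the paper's clinical setting): when outcome labels are observed mainly for high-score individuals, the AUROC computed on the labeled subpopulation undershoots the AUROC on the full population because of range restriction in the score:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000

# Hypothetical setup: a model score that genuinely predicts the outcome.
logit = rng.normal(size=n)
score = logit + rng.normal(scale=0.5, size=n)
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

def auroc(s, labels):
    # Mann-Whitney formulation: AUROC from the rank-sum of positives.
    order = np.argsort(s)
    ranks = np.empty(len(s))
    ranks[order] = np.arange(1, len(s) + 1)
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    return (ranks[labels == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Labels are observed mainly for high-score individuals
# (e.g., only flagged patients receive the confirmatory test).
selected = rng.random(n) < 1 / (1 + np.exp(-3 * score))

full = auroc(score, y)
observed = auroc(score[selected], y[selected])
print(full, observed)  # the naive estimate on the labeled subset is lower
```

The size of the gap depends on how strongly selection tracks the score; the paper's "up to 20%" figure comes from its own simulated scenarios, not from this sketch.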
arXiv Detail & Related papers (2022-09-15T22:30:14Z)
- Cross Pairwise Ranking for Unbiased Item Recommendation [57.71258289870123]
We develop a new learning paradigm named Cross Pairwise Ranking (CPR).
CPR achieves unbiased recommendation without knowing the exposure mechanism.
We prove theoretically that this approach offsets the influence of user/item propensity on learning.
arXiv Detail & Related papers (2022-04-26T09:20:27Z)
- Identifying Peer Influence in Therapeutic Communities Adjusting for Latent Homophily [1.6385815610837167]
We investigate peer role model influence on successful graduation from Therapeutic Communities (TCs) for substance abuse and criminal behavior.
To identify peer influence in the presence of unobserved homophily in observational data, we model the network with a latent variable model.
Our results indicate a positive effect of peers' graduation on residents' graduation and that it differs based on gender, race, and the definition of the role model effect.
arXiv Detail & Related papers (2022-03-27T06:47:28Z)
- Plinko: A Theory-Free Behavioral Measure of Priors for Statistical Learning and Mental Model Updating [62.997667081978825]
We present three experiments using "Plinko", a behavioral task in which participants estimate distributions of ball drops over all available outcomes.
We show that participant priors cluster around prototypical probability distributions and that prior cluster membership may indicate learning ability.
We verify that individual participant priors are reliable representations and that learning is not impeded when faced with a physically implausible ball drop distribution.
arXiv Detail & Related papers (2021-07-23T22:27:30Z)
- Double machine learning for sample selection models [0.12891210250935145]
This paper considers the evaluation of discretely distributed treatments when outcomes are only observed for a subpopulation due to sample selection or outcome attrition.
We make use of (a) Neyman-orthogonal, doubly robust, and efficient score functions, which imply the robustness of treatment effect estimation to moderate regularization biases in the machine learning-based estimation of the outcome, treatment, or sample selection models and (b) sample splitting (or cross-fitting) to prevent overfitting bias.
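The two ingredients described above can be sketched numpy-only on synthetic data: a doubly robust (AIPW) score, which remains consistent if either the outcome or the propensity model is correct, combined with two-fold cross-fitting, where nuisance models are fit on one fold and evaluated on the other. The from-scratch logistic and OLS fits below stand in for the paper's machine-learning-based nuisance estimators; all data and coefficients are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20_000
x = rng.normal(size=(n, 1))
p = 1 / (1 + np.exp(-x[:, 0]))
t = rng.binomial(1, p)
y = 2 * x[:, 0] + 1.5 * t + rng.normal(size=n)  # true ATE = 1.5

def fit_logistic(X, t, steps=500, lr=0.1):
    # Simple gradient-descent logistic regression (illustrative, not robust).
    Xb = np.column_stack([np.ones(len(X)), X])
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        pred = 1 / (1 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (pred - t) / len(t)
    return lambda Z: 1 / (1 + np.exp(-np.column_stack([np.ones(len(Z)), Z]) @ w))

def fit_ols(X, y):
    Xb = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return lambda Z: np.column_stack([np.ones(len(Z)), Z]) @ beta

# Two-fold cross-fitting: nuisances are learned on one half of the
# data and evaluated on the other half, preventing overfitting bias.
idx = rng.permutation(n)
folds = [idx[: n // 2], idx[n // 2 :]]
psi = np.empty(n)
for a, b in [(0, 1), (1, 0)]:
    tr, te = folds[a], folds[b]
    e = fit_logistic(x[tr], t[tr])(x[te])
    mu1 = fit_ols(x[tr][t[tr] == 1], y[tr][t[tr] == 1])(x[te])
    mu0 = fit_ols(x[tr][t[tr] == 0], y[tr][t[tr] == 0])(x[te])
    # Doubly robust (AIPW) score for the average treatment effect.
    psi[te] = (mu1 - mu0
               + t[te] * (y[te] - mu1) / e
               - (1 - t[te]) * (y[te] - mu0) / (1 - e))

ate = psi.mean()
print(ate)  # close to the true effect of 1.5
```

This toy version omits the sample-selection correction that is the paper's actual contribution; it only illustrates the orthogonal-score-plus-cross-fitting scaffolding the sentence above refers to.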
arXiv Detail & Related papers (2020-11-30T19:40:21Z)
- Enabling Counterfactual Survival Analysis with Balanced Representations [64.17342727357618]
Survival data are frequently encountered across diverse medical applications, e.g., drug development, risk profiling, and clinical trials.
We propose a theoretically grounded unified framework for counterfactual inference applicable to survival outcomes.
arXiv Detail & Related papers (2020-06-14T01:15:00Z)
- Generalization Bounds and Representation Learning for Estimation of Potential Outcomes and Causal Effects [61.03579766573421]
We study estimation of individual-level causal effects, such as a single patient's response to alternative medication.
We devise representation learning algorithms that minimize our bound, by regularizing the representation's induced treatment group distance.
We extend these algorithms to simultaneously learn a weighted representation to further reduce treatment group distances.
arXiv Detail & Related papers (2020-01-21T10:16:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.