Identifying counterfactual probabilities using bivariate distributions and uplift modeling
- URL: http://arxiv.org/abs/2512.08805v1
- Date: Tue, 09 Dec 2025 16:59:38 GMT
- Title: Identifying counterfactual probabilities using bivariate distributions and uplift modeling
- Authors: Théo Verhelst, Gianluca Bontempi
- Abstract summary: Uplift modeling estimates the causal effect of an intervention as the difference between potential outcomes under treatment and control. Counterfactual identification aims to recover the joint distribution of these potential outcomes.
- Score: 1.519321208145928
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Uplift modeling estimates the causal effect of an intervention as the difference between potential outcomes under treatment and control, whereas counterfactual identification aims to recover the joint distribution of these potential outcomes (e.g., "Would this customer still have churned had we given them a marketing offer?"). This joint counterfactual distribution provides richer information than the uplift but is harder to estimate. However, the two approaches are synergistic: uplift models can be leveraged for counterfactual estimation. We propose a counterfactual estimator that fits a bivariate beta distribution to predicted uplift scores, yielding posterior distributions over counterfactual outcomes. Our approach requires no causal assumptions beyond those of uplift modeling. Simulations show the efficacy of the approach, which can be applied, for example, to the problem of customer churn in telecom, where it reveals insights unavailable to standard ML or uplift models alone.
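To make concrete why the joint counterfactual distribution carries more information than the uplift, here is a minimal Python sketch. It is not the bivariate beta estimator proposed in the paper; it only computes the classical Fréchet bounds on the joint probabilities of two binary potential outcomes given their marginals. The function names, the outcome labels, and the churn rates (0.30 under control, 0.20 under treatment) are hypothetical and chosen purely for illustration.

```python
def uplift(p_control: float, p_treated: float) -> float:
    """Uplift: difference between outcome probabilities under treatment and control."""
    return p_treated - p_control


def frechet_bounds(p_control: float, p_treated: float) -> dict:
    """Fréchet bounds on P(Y0=i, Y1=j) given only the two marginals.

    p_control = P(Y0=1): churn probability without the offer (assumed value).
    p_treated = P(Y1=1): churn probability with the offer (assumed value).
    The marginals fix the uplift but leave each joint probability in an interval.
    """
    a, b = p_control, p_treated
    lo11 = max(0.0, a + b - 1.0)  # lower bound on P(Y0=1, Y1=1)
    hi11 = min(a, b)              # upper bound on P(Y0=1, Y1=1)
    return {
        # Labels below are illustrative names, not terminology from the paper.
        "churns_either_way": (lo11, hi11),                          # Y0=1, Y1=1
        "saved_by_offer": (a - hi11, a - lo11),                     # Y0=1, Y1=0
        "lost_by_offer": (b - hi11, b - lo11),                      # Y0=0, Y1=1
        "stays_either_way": (1 - a - b + lo11, 1 - a - b + hi11),   # Y0=0, Y1=0
    }


if __name__ == "__main__":
    print("uplift:", uplift(0.30, 0.20))
    for label, (low, high) in frechet_bounds(0.30, 0.20).items():
        print(f"{label}: [{low:.2f}, {high:.2f}]")
```

With these made-up rates the uplift is a single number, while the share of customers who churn without the offer but stay with it is only known to lie in an interval. Narrowing that interval to a point estimate is the kind of counterfactual question the abstract describes.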
Related papers
- Causal Inference under Threshold Manipulation: Bayesian Mixture Modeling and Heterogeneous Treatment Effects [0.25782420501870296]
We propose a novel framework for estimating the causal effect under threshold manipulation. The main idea is to model the observed spending distribution as a mixture of two distributions. We show posterior contraction of the causal effect under large samples.
arXiv Detail & Related papers (2025-09-24T06:52:53Z) - Counterfactual Realizability [52.85109506684737]
We introduce a formal definition of realizability, the ability to draw samples from a distribution, and then develop a complete algorithm to determine whether an arbitrary counterfactual distribution is realizable. We illustrate the implications of this new framework for counterfactual data collection using motivating examples from causal fairness and causal reinforcement learning.
arXiv Detail & Related papers (2025-03-14T20:54:27Z) - Rejection via Learning Density Ratios [50.91522897152437]
Classification with rejection emerges as a learning paradigm which allows models to abstain from making predictions. We propose a different distributional perspective, where we seek to find an idealized data distribution which maximizes a pretrained model's performance. Our framework is tested empirically over clean and noisy datasets.
arXiv Detail & Related papers (2024-05-29T01:32:17Z) - Bayesian Hierarchical Models for Counterfactual Estimation [12.159830463756341]
We propose a probabilistic paradigm to estimate a diverse set of counterfactuals.
We treat the perturbations as random variables endowed with prior distribution functions.
A gradient based sampler with superior convergence characteristics efficiently computes the posterior samples.
arXiv Detail & Related papers (2023-01-21T00:21:11Z) - Bayesian Counterfactual Mean Embeddings and Off-Policy Evaluation [10.75801980090826]
We present three novel Bayesian methods to estimate the expectation of the ultimate treatment effect.
These methods differ on the source of uncertainty considered and allow for combining two sources of data.
We generalize these ideas to the off-policy evaluation framework.
arXiv Detail & Related papers (2022-11-02T23:39:36Z) - Mandoline: Model Evaluation under Distribution Shift [8.007644303175395]
Machine learning models are often deployed in different settings than they were trained and validated on.
We develop Mandoline, a new evaluation framework that mitigates these issues.
Users write simple "slicing functions" - noisy, potentially correlated binary functions intended to capture possible axes of distribution shift.
arXiv Detail & Related papers (2021-07-01T17:57:57Z) - Deconfounding Scores: Feature Representations for Causal Effect
Estimation with Weak Overlap [140.98628848491146]
We introduce deconfounding scores, which induce better overlap without biasing the target of estimation.
We show that deconfounding scores satisfy a zero-covariance condition that is identifiable in observed data.
In particular, we show that this technique could be an attractive alternative to standard regularizations.
arXiv Detail & Related papers (2021-04-12T18:50:11Z) - When Does Uncertainty Matter?: Understanding the Impact of Predictive
Uncertainty in ML Assisted Decision Making [68.19284302320146]
We carry out user studies to assess how people with differing levels of expertise respond to different types of predictive uncertainty.
We found that showing posterior predictive distributions led to smaller disagreements with the ML model's predictions.
This suggests that posterior predictive distributions can potentially serve as useful decision aids which should be used with caution and take into account the type of distribution and the expertise of the human.
arXiv Detail & Related papers (2020-11-12T02:23:53Z) - Model updating after interventions paradoxically introduces bias [2.089458396525051]
Recent discussions have highlighted potential problems in the updating of a predictive score for a binary outcome.
In this setting, the existing score induces an additional causative pathway which leads to miscalibration when the original score is replaced.
We propose a general causal framework to describe and address this problem, and demonstrate an equivalent formulation as a partially observed Markov decision process.
arXiv Detail & Related papers (2020-10-22T08:43:29Z) - Estimating Generalization under Distribution Shifts via Domain-Invariant Representations [75.74928159249225]
We use a set of domain-invariant predictors as a proxy for the unknown, true target labels.
The error of the resulting risk estimate depends on the target risk of the proxy model.
arXiv Detail & Related papers (2020-07-06T17:21:24Z) - Decision-Making with Auto-Encoding Variational Bayes [71.44735417472043]
We show that a posterior approximation distinct from the variational distribution should be used for making decisions.
Motivated by these theoretical results, we propose learning several approximate proposals for the best model.
In addition to toy examples, we present a full-fledged case study of single-cell RNA sequencing.
arXiv Detail & Related papers (2020-02-17T19:23:36Z)