Causal Identification from Counterfactual Data: Completeness and Bounding Results
- URL: http://arxiv.org/abs/2602.23541v2
- Date: Tue, 03 Mar 2026 20:56:54 GMT
- Title: Causal Identification from Counterfactual Data: Completeness and Bounding Results
- Authors: Arvind Raghavan, Elias Bareinboim
- Abstract summary: We develop an algorithm for identifying counterfactual queries from an arbitrary set of Layer 3 distributions. We establish the theoretical limit of which counterfactuals can be identified from physically realizable distributions. We derive novel analytic bounds for such quantities using realizable counterfactual data.
- Score: 54.147490305295456
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Previous work establishing completeness results for counterfactual identification has been confined to the setting where the input data belongs to observational or interventional distributions (Layers 1 and 2 of Pearl's Causal Hierarchy), since it was generally presumed impossible to obtain data from counterfactual distributions, which belong to Layer 3. However, recent work (Raghavan & Bareinboim, 2025) has formally characterized a family of counterfactual distributions that can be directly estimated via experimental methods, a notion they call counterfactual realizability. This leaves open the question of what additional counterfactual quantities now become identifiable, given this new access to (some) Layer 3 data. To answer this question, we develop the CTFIDU+ algorithm for identifying counterfactual queries from an arbitrary set of Layer 3 distributions, and prove that it is complete for this task. Building on this, we establish the theoretical limit of which counterfactuals can be identified from physically realizable distributions, thus implying the fundamental limit to exact causal inference in the non-parametric setting. Finally, given the impossibility of identifying certain critical types of counterfactuals, we derive novel analytic bounds for such quantities using realizable counterfactual data, and corroborate using simulations that counterfactual data helps tighten the bounds for non-identifiable quantities in practice.
Related papers
- Data-Driven Information-Theoretic Causal Bounds under Unmeasured Confounding [10.590231532335691]
We develop a data-driven information-theoretic framework for partial identification of conditional causal effects under unmeasured confounding. Our key theoretical contribution shows that the f-divergence between the observational distribution P(Y | A = a, X = x) and the interventional distribution P(Y | do(A = a), X = x) is upper bounded by a function of the propensity score alone. This result enables sharp partial identification of conditional causal effects directly from observational data, without requiring external sensitivity parameters, auxiliary variables, full structural specifications, or outcome boundedness assumptions.
arXiv Detail & Related papers (2026-01-23T20:47:48Z)
- Likelihood-Preserving Embeddings for Statistical Inference [0.0]
Modern machine learning embeddings provide powerful compression of high-dimensional data. This paper develops a theory of likelihood-preserving embeddings. Experiments on Gaussian and Cauchy distributions validate the sharp phase transition predicted by exponential family theory.
arXiv Detail & Related papers (2025-12-27T16:21:55Z)
- Counterfactual Identifiability via Dynamic Optimal Transport [15.637845261800463]
We argue that counterfactuals must be identifiable to justify causal claims. A recent line of work on counterfactual inference shows promising results but lacks identification.
arXiv Detail & Related papers (2025-10-09T14:45:13Z)
- Counterfactual Realizability [52.85109506684737]
We introduce a formal definition of realizability, the ability to draw samples from a distribution, and then develop a complete algorithm to determine whether an arbitrary counterfactual distribution is realizable. We illustrate the implications of this new framework for counterfactual data collection using motivating examples from causal fairness and causal reinforcement learning.
arXiv Detail & Related papers (2025-03-14T20:54:27Z)
- DAGnosis: Localized Identification of Data Inconsistencies using Structures [73.39285449012255]
Identification and appropriate handling of inconsistencies in data at deployment time is crucial to reliably use machine learning models.
We use directed acyclic graphs (DAGs) to encode the probability distribution and independencies of the training set's features as a structure.
Our method, called DAGnosis, leverages these structural interactions to bring valuable and insightful data-centric conclusions.
arXiv Detail & Related papers (2024-02-26T11:29:16Z)
- Advancing Counterfactual Inference through Nonlinear Quantile Regression [77.28323341329461]
We propose a framework for efficient and effective counterfactual inference implemented with neural networks.
The proposed approach enhances the capacity to generalize estimated counterfactual outcomes to unseen data.
Empirical results conducted on multiple datasets offer compelling support for our theoretical assertions.
arXiv Detail & Related papers (2023-06-09T08:30:51Z)
- Deep Counterfactual Estimation with Categorical Background Variables [3.04585143845864]
Counterfactual queries typically ask the "What if?" question retrospectively.
We introduce CounterFactual Query Prediction (CFQP), a novel method to infer counterfactuals from continuous observations.
Our method significantly outperforms previously available deep-learning-based counterfactual methods.
arXiv Detail & Related papers (2022-10-11T22:27:11Z)
- Bounding Counterfactuals under Selection Bias [60.55840896782637]
We propose a first algorithm to address both identifiable and unidentifiable queries.
We prove that, in spite of the missingness induced by the selection bias, the likelihood of the available data is unimodal.
arXiv Detail & Related papers (2022-07-26T10:33:10Z)
- Nested Counterfactual Identification from Arbitrary Surrogate Experiments [95.48089725859298]
We study the identification of nested counterfactuals from an arbitrary combination of observations and experiments.
Specifically, we prove the counterfactual unnesting theorem (CUT), which allows one to map arbitrary nested counterfactuals to unnested ones.
arXiv Detail & Related papers (2021-07-07T12:51:04Z)
- Fair Densities via Boosting the Sufficient Statistics of Exponential Families [72.34223801798422]
We introduce a boosting algorithm to pre-process data for fairness.
Our approach shifts towards better data fitting while still ensuring a minimal fairness guarantee.
Empirical results are presented to show the quality of the results on real-world data.
arXiv Detail & Related papers (2020-12-01T00:49:17Z)