dcFCI: Robust Causal Discovery Under Latent Confounding, Unfaithfulness, and Mixed Data
- URL: http://arxiv.org/abs/2505.06542v1
- Date: Sat, 10 May 2025 07:05:19 GMT
- Title: dcFCI: Robust Causal Discovery Under Latent Confounding, Unfaithfulness, and Mixed Data
- Authors: Adèle H. Ribeiro, Dominik Heider,
- Abstract summary: We introduce the first nonparametric score to assess a Partial Ancestral Graph's compatibility with observed data.<n>We then propose data-compatible Fast Causal Inference (dcFCI) to jointly address latent confounding, empirical unfaithfulness, and mixed data types.
- Score: 1.9797215742507548
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Causal discovery is central to inferring causal relationships from observational data. In the presence of latent confounding, algorithms such as Fast Causal Inference (FCI) learn a Partial Ancestral Graph (PAG) representing the true model's Markov Equivalence Class. However, their correctness critically depends on empirical faithfulness, the assumption that observed (in)dependencies perfectly reflect those of the underlying causal model, which often fails in practice due to limited sample sizes. To address this, we introduce the first nonparametric score to assess a PAG's compatibility with observed data, even with mixed variable types. This score is both necessary and sufficient to characterize structural uncertainty and distinguish between distinct PAGs. We then propose data-compatible FCI (dcFCI), the first hybrid causal discovery algorithm to jointly address latent confounding, empirical unfaithfulness, and mixed data types. dcFCI integrates our score into an (Anytime)FCI-guided search that systematically explores, ranks, and validates candidate PAGs. Experiments on synthetic and real-world scenarios demonstrate that dcFCI significantly outperforms state-of-the-art methods, often recovering the true PAG even in small and heterogeneous datasets. Examining top-ranked PAGs further provides valuable insights into structural uncertainty, supporting more robust and informed causal reasoning and decision-making.
Related papers
- Relational Causal Discovery with Latent Confounders [16.33251047653638]
We propose RelFCI, a complete causal discovery algorithm for relational data with latent confounders.<n>We present results demonstrating the effectiveness of RelFCI in identifying the correct causal structure in relational causal models with latent confounders.
arXiv Detail & Related papers (2025-07-02T13:29:35Z) - Perplexity Trap: PLM-Based Retrievers Overrate Low Perplexity Documents [64.43980129731587]
We propose a causal-inspired inference-time debiasing method called Causal Diagnosis and Correction (CDC)<n>CDC first diagnoses the bias effect of the perplexity and then separates the bias effect from the overall relevance score.<n> Experimental results across three domains demonstrate the superior debiasing effectiveness.
arXiv Detail & Related papers (2025-03-11T17:59:00Z) - Robust Time Series Causal Discovery for Agent-Based Model Validation [5.430532390358285]
This study proposes a Robust Cross-Validation (RCV) approach to enhance causal structure learning for ABM validation.
We develop RCV-VarLiNGAM and RCV-PCMCI, novel extensions of two prominent causal discovery algorithms.
The proposed approach is then integrated into an enhanced ABM validation framework.
arXiv Detail & Related papers (2024-10-25T09:13:26Z) - Evidential Deep Partial Multi-View Classification With Discount Fusion [24.139495744683128]
We propose a novel framework called Evidential Deep Partial Multi-View Classification (EDP-MVC)
We use K-means imputation to address missing views, creating a complete set of multi-view data.
The potential conflicts and uncertainties within this imputed data can affect the reliability of downstream inferences.
arXiv Detail & Related papers (2024-08-23T14:50:49Z) - DAGnosis: Localized Identification of Data Inconsistencies using
Structures [73.39285449012255]
Identification and appropriate handling of inconsistencies in data at deployment time is crucial to reliably use machine learning models.
We use directed acyclic graphs (DAGs) to encode the training set's features probability distribution and independencies as a structure.
Our method, called DAGnosis, leverages these structural interactions to bring valuable and insightful data-centric conclusions.
arXiv Detail & Related papers (2024-02-26T11:29:16Z) - Federated Causal Discovery from Heterogeneous Data [70.31070224690399]
We propose a novel FCD method attempting to accommodate arbitrary causal models and heterogeneous data.
These approaches involve constructing summary statistics as a proxy of the raw data to protect data privacy.
We conduct extensive experiments on synthetic and real datasets to show the efficacy of our method.
arXiv Detail & Related papers (2024-02-20T18:53:53Z) - Towards Practical Federated Causal Structure Learning [9.74796970978203]
FedC2SL is a constraint-based causal structure learning scheme that learns causal graphs using a conditional independence test.
The study evaluates FedC2SL using both synthetic datasets and real-world data against existing solutions.
arXiv Detail & Related papers (2023-06-15T18:23:58Z) - Advancing Counterfactual Inference through Nonlinear Quantile Regression [77.28323341329461]
We propose a framework for efficient and effective counterfactual inference implemented with neural networks.
The proposed approach enhances the capacity to generalize estimated counterfactual outcomes to unseen data.
Empirical results conducted on multiple datasets offer compelling support for our theoretical assertions.
arXiv Detail & Related papers (2023-06-09T08:30:51Z) - Biases in Inverse Ising Estimates of Near-Critical Behaviour [0.0]
Inverse inference allows pairwise interactions to be reconstructed from empirical correlations.
We show that estimators used for this inference, such as Pseudo-likelihood (PLM), are biased.
Data-driven methods are explored and applied to a functional magnetic resonance imaging (fMRI) dataset from neuroscience.
arXiv Detail & Related papers (2023-01-13T14:01:43Z) - Conditional Feature Importance for Mixed Data [1.6114012813668934]
We develop a conditional predictive impact (CPI) framework with knockoff sampling.
We show that our proposed workflow controls type I error, achieves high power and is in line with results given by other conditional FI measures.
Our findings highlight the necessity of developing statistically adequate, specialized methods for mixed data.
arXiv Detail & Related papers (2022-10-06T16:52:38Z) - Federated Causal Discovery [74.37739054932733]
This paper develops a gradient-based learning framework named DAG-Shared Federated Causal Discovery (DS-FCD)
It can learn the causal graph without directly touching local data and naturally handle the data heterogeneity.
Extensive experiments on both synthetic and real-world datasets verify the efficacy of the proposed method.
arXiv Detail & Related papers (2021-12-07T08:04:12Z) - On Disentangled Representations Learned From Correlated Data [59.41587388303554]
We bridge the gap to real-world scenarios by analyzing the behavior of the most prominent disentanglement approaches on correlated data.
We show that systematically induced correlations in the dataset are being learned and reflected in the latent representations.
We also demonstrate how to resolve these latent correlations, either using weak supervision during training or by post-hoc correcting a pre-trained model with a small number of labels.
arXiv Detail & Related papers (2020-06-14T12:47:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.