Local Causal Discovery for Statistically Efficient Causal Inference
- URL: http://arxiv.org/abs/2510.14582v1
- Date: Thu, 16 Oct 2025 11:39:02 GMT
- Title: Local Causal Discovery for Statistically Efficient Causal Inference
- Authors: Mátyás Schubert, Tom Claassen, Sara Magliacane
- Abstract summary: Causal discovery methods can identify valid adjustment sets for causal effect estimation for a pair of target variables. Global causal discovery methods focus on learning the whole causal graph and enable the recovery of optimal adjustment sets. Local causal discovery methods offer a more scalable alternative by focusing on the local neighborhood of the target variables.
- Score: 7.856998585396421
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Causal discovery methods can identify valid adjustment sets for causal effect estimation for a pair of target variables, even when the underlying causal graph is unknown. Global causal discovery methods focus on learning the whole causal graph and therefore enable the recovery of optimal adjustment sets, i.e., sets with the lowest asymptotic variance, but they quickly become computationally prohibitive as the number of variables grows. Local causal discovery methods offer a more scalable alternative by focusing on the local neighborhood of the target variables, but are restricted to statistically suboptimal adjustment sets. In this work, we propose Local Optimal Adjustments Discovery (LOAD), a sound and complete causal discovery approach that combines the computational efficiency of local methods with the statistical optimality of global methods. First, LOAD identifies the causal relation between the targets and tests if the causal effect is identifiable by using only local information. If it is identifiable, it then finds the optimal adjustment set by leveraging local causal discovery to infer the mediators and their parents. Otherwise, it returns the locally valid parent adjustment sets based on the learned local structure. In our experiments on synthetic and realistic data, LOAD outperforms global methods in scalability, while providing more accurate effect estimation than local methods.
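To make the role of an adjustment set concrete, here is a minimal sketch of covariate (backdoor) adjustment under a linear-Gaussian model, not LOAD itself: the variable names, the toy structural causal model, and the regression-based estimator are all invented for illustration. Once any valid adjustment set is known, regressing the outcome on the treatment plus that set recovers the causal effect; which valid set is chosen determines the estimator's variance, which is what LOAD optimizes.

```python
import numpy as np

def ate_via_adjustment(X, T, Y):
    """Estimate the average treatment effect of T on Y by linear
    regression adjustment for the covariates in X: the coefficient
    on T in the regression Y ~ 1 + T + X (backdoor adjustment
    under a linear-Gaussian model)."""
    n = len(T)
    design = np.column_stack([np.ones(n), T, X])  # intercept, treatment, adjustment set
    coef, *_ = np.linalg.lstsq(design, Y, rcond=None)
    return coef[1]  # coefficient on the treatment column

# Toy linear SCM with a confounder Z: Z -> T, Z -> Y, and T -> Y with true effect 2.0
rng = np.random.default_rng(0)
n = 50_000
Z = rng.normal(size=n)
T = 0.8 * Z + rng.normal(size=n)
Y = 2.0 * T + 1.5 * Z + rng.normal(size=n)

adjusted = ate_via_adjustment(Z.reshape(-1, 1), T, Y)  # ~2.0: adjusting for Z removes the bias
naive = np.polyfit(T, Y, 1)[0]                         # ~2.73: unadjusted slope is confounded
```

The gap between `naive` and `adjusted` is exactly why valid adjustment sets matter; among several valid sets, the optimal one additionally minimizes the asymptotic variance of the estimate.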
Related papers
- Can Large Language Models Help Experimental Design for Causal Discovery? [94.66802142727883]
Large Language Model Guided Intervention Targeting (LeGIT) is a robust framework that effectively incorporates LLMs to augment existing numerical approaches for intervention targeting in causal discovery. LeGIT demonstrates significant improvements and robustness over existing methods and even surpasses humans.
arXiv Detail & Related papers (2025-03-03T03:43:05Z)
- Local Causal Structure Learning in the Presence of Latent Variables [16.88791886307876]
We present a principled method for determining whether a variable is a direct cause or effect of a target.
Experimental results on both synthetic and real-world data validate the effectiveness and efficiency of our approach.
arXiv Detail & Related papers (2024-05-25T13:31:05Z)
- Local Discovery by Partitioning: Polynomial-Time Causal Discovery Around Exposure-Outcome Pairs [18.31538168213386]
We propose local discovery by partitioning (LDP) for causal inference tasks.
LDP is a constraint-based procedure that returns a VAS for an exposure-outcome pair under latent confounding.
Adjustment sets from LDP yield less biased and more precise average treatment effect estimates than baseline discovery algorithms.
arXiv Detail & Related papers (2023-10-25T14:53:10Z)
- RHALE: Robust and Heterogeneity-aware Accumulated Local Effects [8.868822699365616]
Accumulated Local Effects (ALE) is a widely-used explainability method for isolating the average effect of a feature on the output.
It does not quantify the deviation of instance-level (local) effects from the average (global) effect, known as heterogeneity.
We propose Robust and Heterogeneity-aware ALE (RHALE) to address these limitations.
arXiv Detail & Related papers (2023-09-20T10:27:41Z)
- Structural restrictions in local causal discovery: identifying direct causes of a target variable [0.9208007322096533]
Learning a set of direct causes of a target variable from an observational joint distribution is a fundamental problem in science. Here, we are only interested in identifying the direct causes of one target variable, not the full DAG. This allows us to relax the identifiability assumptions and develop possibly faster and more robust algorithms.
arXiv Detail & Related papers (2023-07-29T18:31:35Z)
- Local Causal Discovery for Estimating Causal Effects [41.49486724979923]
We introduce Local Discovery using Eager Collider Checks (LDECC).
We show that LDECC and existing algorithms rely on different faithfulness assumptions, leveraging this insight to weaken the assumptions for identifying the set of possible ATE values.
arXiv Detail & Related papers (2023-02-16T04:12:34Z)
- Divide and Contrast: Source-free Domain Adaptation via Adaptive Contrastive Learning [122.62311703151215]
Divide and Contrast (DaC) aims to connect the good ends of both worlds while bypassing their limitations.
DaC divides the target data into source-like and target-specific samples, where either group of samples is treated with tailored goals.
We further align the source-like domain with the target-specific samples using a memory bank-based Maximum Mean Discrepancy (MMD) loss to reduce the distribution mismatch.
arXiv Detail & Related papers (2022-11-12T09:21:49Z)
- Domain-Specific Risk Minimization for Out-of-Distribution Generalization [104.17683265084757]
We first establish a generalization bound that explicitly considers the adaptivity gap.
We propose effective gap estimation methods for guiding the selection of a better hypothesis for the target.
The other method is minimizing the gap directly by adapting model parameters using online target samples.
arXiv Detail & Related papers (2022-08-18T06:42:49Z)
- Partial Identification with Noisy Covariates: A Robust Optimization Approach [94.10051154390237]
Causal inference from observational datasets often relies on measuring and adjusting for covariates.
We show that this robust optimization approach can extend a wide range of causal adjustment methods to perform partial identification.
Across synthetic and real datasets, we find that this approach provides ATE bounds with a higher coverage probability than existing methods.
arXiv Detail & Related papers (2022-02-22T04:24:26Z)
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence and predicts target accuracy as the fraction of unlabeled examples whose confidence exceeds the threshold.
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
- Local Learning Matters: Rethinking Data Heterogeneity in Federated Learning [61.488646649045215]
Federated learning (FL) is a promising strategy for performing privacy-preserving, distributed learning with a network of clients (i.e., edge devices).
arXiv Detail & Related papers (2021-11-28T19:03:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.