Efficient Identification of Direct Causal Parents via Invariance and Minimum Error Testing
- URL: http://arxiv.org/abs/2409.12797v1
- Date: Thu, 19 Sep 2024 14:07:31 GMT
- Title: Efficient Identification of Direct Causal Parents via Invariance and Minimum Error Testing
- Authors: Minh Nguyen, Mert R. Sabuncu
- Abstract summary: We propose MMSE-ICP and fastICP, two approaches which employ an error inequality to address the identifiability problem of ICP.
The inequality states that the minimum prediction error of the predictor using causal parents is the smallest among all predictors.
MMSE-ICP and fastICP not only outperform competitive baselines in many simulations but also achieve state-of-the-art results on a large-scale real-data benchmark.
- Score: 8.40704222803588
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Invariant causal prediction (ICP) is a popular technique for finding causal parents (direct causes) of a target via exploiting distribution shifts and invariance testing (Peters et al., 2016). However, since ICP needs to run an exponential number of tests and fails to identify parents when distribution shifts only affect a few variables, applying ICP to practical large scale problems is challenging. We propose MMSE-ICP and fastICP, two approaches which employ an error inequality to address the identifiability problem of ICP. The inequality states that the minimum prediction error of the predictor using causal parents is the smallest among all predictors which do not use descendants. fastICP is an efficient approximation tailored for large problems as it exploits the inequality and a heuristic to run fewer tests. MMSE-ICP and fastICP not only outperform competitive baselines in many simulations but also achieve state-of-the-art result on a large scale real data benchmark.
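As a rough illustration of the ICP procedure the abstract describes (our own sketch, not the authors' MMSE-ICP or fastICP implementation), the loop can be run on a toy structural causal model. The two-environment setup, the crude moment-based invariance check, and all variable names below are illustrative assumptions:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)

def sample_env(n, shift):
    """Toy SCM: X1 -> Y -> X2; `shift` intervenes on the cause X1."""
    x1 = rng.normal(shift, 1.0, n)           # direct cause (parent) of Y
    y = 2.0 * x1 + rng.normal(0.0, 1.0, n)   # target
    x2 = y + rng.normal(0.0, 1.0, n)         # descendant of Y
    return np.column_stack([x1, x2]), y

Xa, ya = sample_env(2000, 0.0)   # observational environment
Xb, yb = sample_env(2000, 2.0)   # shifted environment
X = np.vstack([Xa, Xb])
y = np.concatenate([ya, yb])
env = np.repeat([0, 1], 2000)

def residuals(S):
    """Pooled least-squares residuals of y regressed on the columns in S."""
    Z = np.column_stack([np.ones(len(y))] + [X[:, j] for j in S])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return y - Z @ beta

def looks_invariant(S, mean_tol=0.1, var_tol=0.3):
    """Crude surrogate for an invariance test: the per-environment
    residual means and variances should agree across environments."""
    r = residuals(S)
    m0, m1 = r[env == 0].mean(), r[env == 1].mean()
    v0, v1 = r[env == 0].var(), r[env == 1].var()
    return abs(m0 - m1) < mean_tol and abs(v0 - v1) < var_tol * max(v0, v1)

# Plain ICP: accept every candidate subset whose residuals look
# invariant, then return the intersection of the accepted subsets.
accepted = [set(S) for k in (1, 2) for S in combinations(range(2), k)
            if looks_invariant(S)]
parents = set.intersection(*accepted) if accepted else set()
print(sorted(parents))   # column 0, i.e. the true parent X1
```

Note that the set {X1, X2} is also invariant here and attains a lower prediction error than the parents alone, precisely because X2 is a descendant. The paper's inequality restricts the error comparison to predictors that avoid descendants, which is what lets MMSE-ICP and fastICP prune the exponential subset search rather than enumerate it as above.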
Related papers
- Invariant Causal Prediction with Local Models [52.161513027831646]
We consider the task of identifying the causal parents of a target variable among a set of candidates from observational data.
We introduce a practical method called L-ICP (Localized Invariant Causal Prediction), which is based on a hypothesis test for parent identification using a ratio of minimum and maximum statistics.
arXiv Detail & Related papers (2024-01-10T15:34:42Z)
- Probabilistically robust conformal prediction [9.401004747930974]
Conformal prediction (CP) is a framework to quantify uncertainty of machine learning classifiers including deep neural networks.
Almost all the existing work on CP assumes clean testing data and there is not much known about the robustness of CP algorithms.
This paper studies the problem of probabilistically robust conformal prediction (PRCP) which ensures robustness to most perturbations.
arXiv Detail & Related papers (2023-07-31T01:32:06Z)
- Predicting Adverse Neonatal Outcomes for Preterm Neonates with Multi-Task Learning [51.487856868285995]
We first analyze the correlations between three adverse neonatal outcomes and then formulate the diagnosis of multiple neonatal outcomes as a multi-task learning (MTL) problem.
In particular, the MTL framework contains shared hidden layers and multiple task-specific branches.
arXiv Detail & Related papers (2023-03-28T00:44:06Z)
- Asymptotically Unbiased Instance-wise Regularized Partial AUC Optimization: Theory and Algorithm [101.44676036551537]
One-way Partial AUC (OPAUC) and Two-way Partial AUC (TPAUC) measure the average performance of a binary classifier over restricted ranges of the ROC curve.
Most of the existing methods could only optimize PAUC approximately, leading to inevitable biases that are not controllable.
We present a simpler reformulation of the PAUC problem via distributionally robust optimization.
arXiv Detail & Related papers (2022-10-08T08:26:22Z)
- MaxMatch: Semi-Supervised Learning with Worst-Case Consistency [149.03760479533855]
We propose a worst-case consistency regularization technique for semi-supervised learning (SSL).
We present a generalization bound for SSL consisting of the empirical loss terms observed on labeled and unlabeled training data separately.
Motivated by this bound, we derive an SSL objective that minimizes the largest inconsistency between an original unlabeled sample and its multiple augmented variants.
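A minimal numerical sketch of the objective this summary describes (our own illustration, not the MaxMatch implementation; the softmax/KL formulation and all names below are assumptions):

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def worst_case_consistency(logits_orig, logits_augs, eps=1e-12):
    """Largest inconsistency (KL divergence) between the prediction on an
    unlabeled sample and any of its K augmented variants.
    logits_orig: [B, C]; logits_augs: [K, B, C]; returns the batch mean."""
    p = softmax(logits_orig)            # [B, C] reference predictions
    q = softmax(logits_augs)            # [K, B, C] augmented predictions
    kl = (p * (np.log(p + eps) - np.log(q + eps))).sum(axis=-1)  # [K, B]
    return kl.max(axis=0).mean()        # worst augmentation per sample

rng = np.random.default_rng(1)
logits = rng.normal(size=(4, 3))        # batch of 4 samples, 3 classes
augs = np.stack([logits + 0.5 * rng.normal(size=(4, 3)) for _ in range(3)])
loss = worst_case_consistency(logits, augs)
```

Taking the maximum over augmentations (rather than the mean) is what makes the penalty worst-case: the model is trained against its least consistent augmented view.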
arXiv Detail & Related papers (2022-09-26T12:04:49Z)
- Invariant Ancestry Search [6.583725235299022]
We introduce the concept of minimal invariance and propose invariant ancestry search (IAS).
In its population version, IAS outputs a set which contains only ancestors of the response and contains the output of ICP.
We develop scalable algorithms and perform experiments on simulated and real data.
arXiv Detail & Related papers (2022-02-02T08:28:00Z)
- PACMAN: PAC-style bounds accounting for the Mismatch between Accuracy and Negative log-loss [28.166066663983674]
The ultimate performance of machine learning algorithms for classification tasks is usually measured in terms of the empirical error probability (or accuracy) based on a testing dataset.
For training, however, the surrogate loss is often the negative log-loss, which leads to the well-known cross-entropy risk.
We introduce an analysis based on a point-wise PAC approach to the generalization gap, accounting for the mismatch between testing on the accuracy metric and training on the negative log-loss.
arXiv Detail & Related papers (2021-12-10T14:00:22Z)
- Variance Minimization in the Wasserstein Space for Invariant Causal Prediction [72.13445677280792]
In this work, we show that the approach taken in ICP may be reformulated as a series of nonparametric tests that scales linearly in the number of predictors.
Each of these tests relies on the minimization of a novel loss function that is derived from tools in optimal transport theory.
We prove under mild assumptions that our method is able to recover the set of identifiable direct causes, and we demonstrate in our experiments that it is competitive with other benchmark causal discovery algorithms.
arXiv Detail & Related papers (2021-10-13T22:30:47Z)
- Analyzing and Mitigating Interference in Neural Architecture Search [96.60805562853153]
We investigate the interference issue by sampling different child models and calculating the gradient similarity of shared operators.
Inspired by these two observations, we propose two approaches to mitigate the interference.
Our searched architecture outperforms RoBERTa$_{\rm base}$ by 1.1 and 0.6 points and ELECTRA$_{\rm base}$ by 1.6 and 1.1 points on the dev and test sets of the GLUE benchmark.
arXiv Detail & Related papers (2021-08-29T11:07:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.