Detecting hidden confounding in observational data using multiple
environments
- URL: http://arxiv.org/abs/2205.13935v4
- Date: Fri, 3 Nov 2023 19:29:03 GMT
- Title: Detecting hidden confounding in observational data using multiple
environments
- Authors: Rickard K.A. Karlsson, Jesse H. Krijthe
- Abstract summary: We present a theory for testable conditional independencies that are only absent when there is hidden confounding.
In most cases, the proposed procedure correctly predicts the presence of hidden confounding.
- Score: 0.81585306387285
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A common assumption in causal inference from observational data is that there
is no hidden confounding. Yet it is, in general, impossible to verify this
assumption from a single dataset. Under the assumption of independent causal
mechanisms underlying the data-generating process, we demonstrate a way to
detect unobserved confounders when having multiple observational datasets
coming from different environments. We present a theory for testable
conditional independencies that are only absent when there is hidden
confounding and examine cases where we violate its assumptions: degenerate &
dependent mechanisms, and faithfulness violations. Additionally, we propose a
procedure to test these independencies and study its empirical finite-sample
behavior using simulation studies and semi-synthetic data based on a real-world
dataset. In most cases, the proposed procedure correctly predicts the presence
of hidden confounding, particularly when the confounding bias is large.
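As a rough illustration of the idea in the abstract, the sketch below simulates many environments whose treatment and outcome mechanisms are shifted independently, then checks one conditional independence across two units from the same environment. Everything specific here is an assumption made for illustration and is not taken from the abstract: the linear-Gaussian model, the particular independence tested (treatment of unit i versus outcome of unit j, given the treatment of unit j), the partial-correlation Fisher z-test, and the helper names `simulate_pairs` and `partial_corr_test`. It should not be read as the authors' procedure.

```python
import math

import numpy as np


def simulate_pairs(n_envs, confounded, rng):
    """Draw two units (i, j) from each of n_envs environments (toy model)."""
    # Environment-level mechanism parameters, drawn independently of each other.
    alpha = rng.normal(size=n_envs)  # shifts the treatment mechanism
    beta = rng.normal(size=n_envs)   # shifts the outcome mechanism
    b = 1.0 if confounded else 0.0   # effect of the hidden variable on the outcome

    def unit():
        u = rng.normal(size=n_envs)                           # hidden variable
        t = alpha + u + rng.normal(size=n_envs)               # treatment
        y = beta + 0.5 * t + b * u + rng.normal(size=n_envs)  # outcome
        return t, y

    t_i, _ = unit()
    t_j, y_j = unit()
    return t_i, t_j, y_j


def partial_corr_test(x, y, z):
    """Partial correlation of x and y given z, with a Fisher z-test p-value."""
    design = np.column_stack([np.ones_like(z), z])
    # Residualize x and y on z (with an intercept) by least squares.
    res_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    res_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    r = float(np.corrcoef(res_x, res_y)[0, 1])
    # Fisher z statistic with one conditioning variable: sqrt(n - 1 - 3).
    stat = math.sqrt(len(x) - 4) * math.atanh(r)
    p_value = math.erfc(abs(stat) / math.sqrt(2))  # two-sided normal p-value
    return r, p_value


rng = np.random.default_rng(0)
for confounded in (False, True):
    t_i, t_j, y_j = simulate_pairs(20_000, confounded, rng)
    r, p = partial_corr_test(t_i, y_j, t_j)
    print(f"confounded={confounded}: partial corr={r:.3f}, p={p:.3g}")
```

In this toy model the partial correlation is zero in population when the hidden variable does not affect the outcome, and nonzero when it does, because conditioning on the treatment of unit j then opens a path through the shared treatment mechanism. A linear test like this one would miss dependence that appears only in higher moments.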
Related papers
- Anomaly Detection by Context Contrasting [57.695202846009714]
Anomaly detection focuses on identifying samples that deviate from the norm.
Recent advances in self-supervised learning have shown great promise in this regard.
We propose Con$_2$, which learns through context augmentations.
arXiv Detail & Related papers (2024-05-29T07:59:06Z)
- Demystifying amortized causal discovery with transformers [21.058343547918053]
Supervised learning approaches for causal discovery from observational data often achieve competitive performance.
In this work, we investigate CSIvA, a transformer-based model promising to train on synthetic data and transfer to real data.
We bridge the gap with existing identifiability theory and show that constraints on the training data distribution implicitly define a prior on the test observations.
arXiv Detail & Related papers (2024-05-27T08:17:49Z)
- Approximating Counterfactual Bounds while Fusing Observational, Biased and Randomised Data Sources [64.96984404868411]
We address the problem of integrating data from multiple, possibly biased, observational and interventional studies.
We show that the likelihood of the available data has no local maxima.
We then show how the same approach can address the general case of multiple datasets.
arXiv Detail & Related papers (2023-07-31T11:28:24Z)
- Probabilistic Learning of Multivariate Time Series with Temporal Irregularity [25.91078012394032]
Multivariate time series often exhibit temporal irregularities, including nonuniform time intervals and component misalignment.
We develop a conditional flow representation to non-parametrically represent the data distribution, which is typically non-Gaussian.
The broad applicability and superiority of the proposed solution are confirmed by comparing it with existing approaches through ablation studies and testing on real-world datasets.
arXiv Detail & Related papers (2023-06-15T14:08:48Z)
- Causality-Based Multivariate Time Series Anomaly Detection [63.799474860969156]
We formulate the anomaly detection problem from a causal perspective and view anomalies as instances that do not follow the regular causal mechanism to generate the multivariate data.
We then propose a causality-based anomaly detection approach, which first learns the causal structure from data and then infers whether an instance is an anomaly relative to the local causal mechanism.
We evaluate our approach with both simulated and public datasets as well as a case study on real-world AIOps applications.
arXiv Detail & Related papers (2022-06-30T06:00:13Z)
- Combining Observational and Randomized Data for Estimating Heterogeneous Treatment Effects [82.20189909620899]
Estimating heterogeneous treatment effects is an important problem across many domains.
Currently, most existing works rely exclusively on observational data.
We propose to estimate heterogeneous treatment effects by combining large amounts of observational data and small amounts of randomized data.
arXiv Detail & Related papers (2022-02-25T18:59:54Z)
- Adaptive Data Analysis with Correlated Observations [21.969356766737622]
We show that, in some cases, differential privacy guarantees generalization even when there are dependencies within the sample.
We show that the connection between transcript-compression and adaptive data analysis can be extended to the non-iid setting.
arXiv Detail & Related papers (2022-01-21T14:00:30Z)
- OR-Net: Pointwise Relational Inference for Data Completion under Partial Observation [51.083573770706636]
This work uses relational inference to fill in the incomplete data.
We propose Omni-Relational Network (OR-Net) to model the pointwise relativity in two aspects.
arXiv Detail & Related papers (2021-05-02T06:05:54Z)
- On Disentangled Representations Learned From Correlated Data [59.41587388303554]
We bridge the gap to real-world scenarios by analyzing the behavior of the most prominent disentanglement approaches on correlated data.
We show that systematically induced correlations in the dataset are being learned and reflected in the latent representations.
We also demonstrate how to resolve these latent correlations, either using weak supervision during training or by post-hoc correcting a pre-trained model with a small number of labels.
arXiv Detail & Related papers (2020-06-14T12:47:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.