Spurious Correlations and Where to Find Them
- URL: http://arxiv.org/abs/2308.11043v1
- Date: Mon, 21 Aug 2023 21:06:36 GMT
- Title: Spurious Correlations and Where to Find Them
- Authors: Gautam Sreekumar and Vishnu Naresh Boddeti
- Abstract summary: Spurious correlations occur when a model learns unreliable features from the data.
We collect some of the commonly studied hypotheses behind the occurrence of spurious correlations.
We investigate their influence on standard ERM baselines using synthetic datasets generated from causal graphs.
- Score: 17.1264393170134
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Spurious correlations occur when a model learns unreliable features from the
data and are a well-known drawback of data-driven learning. Although several
algorithms have been proposed to mitigate them, we have yet to jointly derive
the indicators of spurious correlations. As a result, solutions built upon
standalone hypotheses fail to beat simple ERM baselines. We collect some of the
commonly studied hypotheses behind the occurrence of spurious correlations and
investigate their influence on standard ERM baselines using synthetic datasets
generated from causal graphs. Subsequently, we observe patterns connecting
these hypotheses and model design choices.
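The abstract's setup, a synthetic dataset sampled from a causal graph on which a plain ERM baseline latches onto a spurious feature, can be illustrated with a toy sketch. Everything below (the graph Y → X_core, the train-only correlation between Y and X_spur, and all numbers) is an illustrative assumption, not the paper's actual datasets or experiments:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample(n, spurious_corr):
    # Toy causal graph: Y causes X_core (reliable but noisy signal);
    # X_spur merely co-occurs with Y at a rate that can change at test time.
    y = rng.integers(0, 2, n)
    x_core = y + rng.normal(0.0, 1.0, n)        # causal feature, 1-sigma overlap
    agree = rng.random(n) < spurious_corr       # does the spurious cue agree with y?
    x_spur = np.where(agree, y, 1 - y) + rng.normal(0.0, 0.1, n)
    return np.column_stack([x_core, x_spur]), y

def train_erm(X, y, lr=0.1, steps=2000):
    # Standard ERM baseline: logistic regression fit by gradient descent.
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        g = p - y                               # gradient of the log loss
        w -= lr * (X.T @ g) / len(y)
        b -= lr * g.mean()
    return w, b

def accuracy(w, b, X, y):
    return float((((X @ w + b) > 0).astype(int) == y).mean())

Xtr, ytr = sample(5000, spurious_corr=0.95)     # spurious cue holds in training
Xte, yte = sample(5000, spurious_corr=0.05)     # ...and flips at test time
w, b = train_erm(Xtr, ytr)
train_acc, test_acc = accuracy(w, b, Xtr, ytr), accuracy(w, b, Xte, yte)
# ERM puts most of its weight on the easy spurious feature (w[1] >> w[0]),
# so accuracy collapses once the train-time correlation no longer holds.
```

Because the spurious feature is low-noise and 95% predictive during training, ERM weights it far more heavily than the causal feature, and the model fails under the shifted test correlation, which is the failure mode the paper's synthetic causal-graph datasets are designed to probe.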
Related papers
- Towards Robust Text Classification: Mitigating Spurious Correlations with Causal Learning [2.7813683000222653]
We propose the Causally Calibrated Robust (CCR) method to reduce models' reliance on spurious correlations.
CCR integrates a causal feature selection method based on counterfactual reasoning, along with an inverse propensity weighting (IPW) loss function.
We show that CCR achieves state-of-the-art performance among methods without group labels and, in some cases, can compete with models that utilize group labels.
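The entry above mentions an inverse propensity weighting (IPW) loss. A generic IPW-weighted negative log-likelihood, not the paper's exact formulation, might look like this, where `propensities` are assumed to be pre-estimated probabilities of observing each (feature, label) combination:

```python
import math

def ipw_loss(probs, labels, propensities):
    """Inverse propensity weighted negative log-likelihood (binary labels).

    Each example is reweighted by 1 / propensity: combinations that are
    over-represented in the data (e.g. spuriously correlated groups) get
    high propensities and are therefore down-weighted.
    """
    total, weight_sum = 0.0, 0.0
    for p, y, e in zip(probs, labels, propensities):
        w = 1.0 / max(e, 1e-6)                  # clip to avoid exploding weights
        nll = -math.log(p if y == 1 else 1.0 - p)
        total += w * nll
        weight_sum += w
    return total / weight_sum                  # self-normalized weighted loss
```

With uniform propensities this reduces to the ordinary mean log loss; skewed propensities shift the objective toward under-represented examples.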
arXiv Detail & Related papers (2024-11-01T21:29:07Z) - Deep Causal Generative Models with Property Control [11.604321459670315]
We propose a novel deep generative framework called the Correlation-aware Causal Variational Auto-encoder (C2VAE)
C2VAE simultaneously recovers the correlation and causal relationships between properties using disentangled latent vectors.
arXiv Detail & Related papers (2024-05-25T13:07:27Z) - Sample, estimate, aggregate: A recipe for causal discovery foundation models [28.116832159265964]
We train a supervised model that learns to predict a larger causal graph from the outputs of classical causal discovery algorithms run over subsets of variables.
Our approach is enabled by the observation that typical errors in the outputs of classical methods remain comparable across datasets.
Experiments on real and synthetic data demonstrate that this model maintains high accuracy in the face of misspecification or distribution shift.
arXiv Detail & Related papers (2024-02-02T21:57:58Z) - Discovering Mixtures of Structural Causal Models from Time Series Data [23.18511951330646]
We propose a general variational inference-based framework called MCD to infer the underlying causal models.
Our approach employs an end-to-end training process that maximizes an evidence lower bound (ELBO) on the data likelihood.
We demonstrate that our method surpasses state-of-the-art benchmarks in causal discovery tasks.
arXiv Detail & Related papers (2023-10-10T05:13:10Z) - Causal Reasoning in the Presence of Latent Confounders via Neural ADMG Learning [8.649109147825985]
Latent confounding has been a long-standing obstacle for causal reasoning from observational data.
We propose a novel neural causal model based on autoregressive flows for ADMG learning.
arXiv Detail & Related papers (2023-03-22T16:45:54Z) - Less is More: Mitigate Spurious Correlations for Open-Domain Dialogue Response Generation Models by Causal Discovery [52.95935278819512]
We conduct the first study of spurious correlations in open-domain response generation models, based on CGDIALOG, a corpus curated in our work.
Inspired by causal discovery algorithms, we propose a novel model-agnostic method for training and inference of response generation models.
arXiv Detail & Related papers (2023-03-02T06:33:48Z) - Disentangling Observed Causal Effects from Latent Confounders using Method of Moments [67.27068846108047]
We provide guarantees on identifiability and learnability under mild assumptions.
We develop efficient algorithms based on coupled tensor decomposition with linear constraints to obtain scalable and guaranteed solutions.
arXiv Detail & Related papers (2021-01-17T07:48:45Z) - Accounting for Unobserved Confounding in Domain Generalization [107.0464488046289]
This paper investigates the problem of learning robust, generalizable prediction models from a combination of datasets.
Part of the challenge of learning robust models lies in the influence of unobserved confounders.
We demonstrate the empirical performance of our approach on healthcare data from different modalities.
arXiv Detail & Related papers (2020-07-21T08:18:06Z) - Amortized Causal Discovery: Learning to Infer Causal Graphs from Time-Series Data [63.15776078733762]
We propose Amortized Causal Discovery, a novel framework to learn to infer causal relations from time-series data.
We demonstrate experimentally that this approach, implemented as a variational model, leads to significant improvements in causal discovery performance.
arXiv Detail & Related papers (2020-06-18T19:59:12Z) - On Disentangled Representations Learned From Correlated Data [59.41587388303554]
We bridge the gap to real-world scenarios by analyzing the behavior of the most prominent disentanglement approaches on correlated data.
We show that systematically induced correlations in the dataset are being learned and reflected in the latent representations.
We also demonstrate how to resolve these latent correlations, either using weak supervision during training or by post-hoc correcting a pre-trained model with a small number of labels.
arXiv Detail & Related papers (2020-06-14T12:47:34Z) - Learning Causal Models Online [103.87959747047158]
Predictive models can rely on spurious correlations in the data for making predictions.
One solution for achieving strong generalization is to incorporate causal structures in the models.
We propose an online algorithm that continually detects and removes spurious features.
arXiv Detail & Related papers (2020-06-12T20:49:20Z)
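The last entry describes an online algorithm that continually detects and removes spurious features. One simple way such detection could work, purely an illustrative sketch and not the paper's algorithm, is to track the sign of each feature's correlation with the label over successive time windows and mask features whose sign flips:

```python
from collections import defaultdict

class OnlineSpuriousFilter:
    """Toy online filter: per time window, record the sign of each
    feature's covariance with the label; a feature whose sign is
    unstable across windows is flagged as spurious and dropped.
    Illustrative sketch only, not the method from the cited paper.
    """

    def __init__(self, window=100):
        self.window = window
        self.buffer = []                  # (feature tuple, label) pairs
        self.signs = defaultdict(list)    # feature index -> sign history

    def update(self, x, y):
        self.buffer.append((x, y))
        if len(self.buffer) >= self.window:
            self._close_window()

    def _close_window(self):
        n = len(self.buffer)
        ymean = sum(y for _, y in self.buffer) / n
        dim = len(self.buffer[0][0])
        for j in range(dim):
            xmean = sum(x[j] for x, _ in self.buffer) / n
            cov = sum((x[j] - xmean) * (y - ymean) for x, y in self.buffer)
            self.signs[j].append(1 if cov >= 0 else -1)
        self.buffer = []

    def stable_features(self):
        # Keep only features whose correlation sign never flipped.
        return [j for j, s in self.signs.items() if len(set(s)) == 1]
```

A truly causal feature keeps the same relationship with the label in every window, while a spurious one flips as the environment drifts, which is what this filter exploits.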
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.