Selecting Data Augmentation for Simulating Interventions
- URL: http://arxiv.org/abs/2005.01856v4
- Date: Mon, 26 Oct 2020 10:52:21 GMT
- Title: Selecting Data Augmentation for Simulating Interventions
- Authors: Maximilian Ilse, Jakub M. Tomczak, Patrick Forré
- Abstract summary: Machine learning models trained with purely observational data and the principle of empirical risk minimization can fail to generalize to unseen domains.
We argue that causal concepts can explain the success of data augmentation by describing how augmentations can weaken the spurious correlation between the observed domains and the task labels.
- Score: 12.848239550098693
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning models trained with purely observational data and the
principle of empirical risk minimization (Vapnik, 1992) can
fail to generalize to unseen domains. In this paper, we focus on the case where
the problem arises through spurious correlation between the observed domains
and the actual task labels. We find that many domain generalization methods do
not explicitly take this spurious correlation into account. Instead, especially
in more application-oriented research areas like medical imaging or robotics,
data augmentation techniques that are based on heuristics are used to learn
domain invariant features. To bridge the gap between theory and practice, we
develop a causal perspective on the problem of domain generalization. We argue
that causal concepts can be used to explain the success of data augmentation by
describing how they can weaken the spurious correlation between the observed
domains and the task labels. We demonstrate that data augmentation can serve as
a tool for simulating interventional data. We use these theoretical insights to
derive a simple algorithm that is able to select data augmentation techniques
that will lead to better domain generalization.
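As a rough illustration of the selection idea, the sketch below scores each candidate augmentation by how well a simple probe can still predict the domain from the augmented data, and keeps the augmentation under which the domain is least predictable. The selection criterion, the logistic-regression probe, and the toy augmentations are illustrative assumptions for this sketch, not the paper's exact algorithm.

```python
# Minimal sketch: pick the augmentation that best removes domain information,
# i.e. the one under which a classifier can least predict the domain label.
# All names and the concrete criterion are assumptions made for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def domain_predictability(features: np.ndarray, domains: np.ndarray) -> float:
    """Cross-validated accuracy of predicting the domain from the features."""
    probe = LogisticRegression(max_iter=1000)
    return cross_val_score(probe, features, domains, cv=5).mean()

def select_augmentation(x: np.ndarray, domains: np.ndarray, augmentations: dict) -> str:
    """Return the augmentation whose output carries the least domain information."""
    scores = {name: domain_predictability(aug(x), domains)
              for name, aug in augmentations.items()}
    return min(scores, key=scores.get)

# Toy usage: the domain adds a constant offset to every feature of a sample,
# so per-sample centering (a stand-in for an intensity-style augmentation)
# removes the domain signal, while the other candidates do not.
rng = np.random.default_rng(0)
domains = rng.integers(0, 2, size=400)
x = rng.normal(size=(400, 16)) + domains[:, None]
augmentations = {
    "identity": lambda a: a,
    "global_center": lambda a: a - a.mean(axis=0, keepdims=True),
    "per_sample_center": lambda a: a - a.mean(axis=1, keepdims=True),
}
print(select_augmentation(x, domains, augmentations))  # -> "per_sample_center"
```

In this toy setup the augmentation that leaves the least domain information is returned; in practice the same criterion would be evaluated on features extracted from augmented images for each candidate augmentation on a given dataset.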
Related papers
- Learning Divergence Fields for Shift-Robust Graph Representations [73.11818515795761]
In this work, we propose a geometric diffusion model with learnable divergence fields for the challenging problem with interdependent data.
We derive a new learning objective through causal inference, which can guide the model to learn generalizable patterns of interdependence that are insensitive across domains.
arXiv Detail & Related papers (2024-06-07T14:29:21Z)
- Deriving Causal Order from Single-Variable Interventions: Guarantees & Algorithm [14.980926991441345]
We show that the causal order can be effectively extracted from datasets containing interventional data under realistic assumptions about the data distribution.
We introduce interventional faithfulness, which relies on comparisons between the marginal distributions of each variable across observational and interventional settings.
We also introduce Intersort, an algorithm designed to infer the causal order from datasets containing large numbers of single-variable interventions.
arXiv Detail & Related papers (2024-05-28T16:07:17Z)
- DIGIC: Domain Generalizable Imitation Learning by Causal Discovery [69.13526582209165]
Causality has been combined with machine learning to produce robust representations for domain generalization.
We make a different attempt by leveraging the demonstration data distribution to discover causal features for a domain generalizable policy.
We design a novel framework, called DIGIC, to identify the causal features by finding the direct cause of the expert action from the demonstration data distribution.
arXiv Detail & Related papers (2024-02-29T07:09:01Z)
- Multi-modal Causal Structure Learning and Root Cause Analysis [67.67578590390907]
We propose Mulan, a unified multi-modal causal structure learning method for root cause localization.
We leverage a log-tailored language model to facilitate log representation learning, converting log sequences into time-series data.
We also introduce a novel key performance indicator-aware attention mechanism for assessing modality reliability and co-learning a final causal graph.
arXiv Detail & Related papers (2024-02-04T05:50:38Z)
- SALUDA: Surface-based Automotive Lidar Unsupervised Domain Adaptation [62.889835139583965]
We introduce an unsupervised auxiliary task of learning an implicit underlying surface representation simultaneously on source and target data.
As both domains share the same latent representation, the model is forced to accommodate discrepancies between the two sources of data.
Our experiments demonstrate that our method achieves a better performance than the current state of the art, both in real-to-real and synthetic-to-real scenarios.
arXiv Detail & Related papers (2023-04-06T17:36:23Z)
- Domain-aware Triplet loss in Domain Generalization [0.0]
Domain shift is caused by discrepancies in the distributions of the testing and training data.
We design a domain-aware triplet loss for domain generalization to help the model cluster similar semantic features.
Our algorithm is designed to disperse domain information in the embedding space.
arXiv Detail & Related papers (2023-03-01T14:02:01Z)
- Direct-Effect Risk Minimization for Domain Generalization [11.105832297850188]
We introduce the concepts of direct and indirect effects from causal inference to the domain generalization problem.
We argue that models that learn direct effects minimize the worst-case risk across correlation-shifted domains.
Experiments on 5 correlation-shifted datasets and the DomainBed benchmark verify the effectiveness of our approach.
arXiv Detail & Related papers (2022-11-26T15:35:36Z)
- Amortized Inference for Causal Structure Learning [72.84105256353801]
Learning causal structure poses a search problem that typically involves evaluating structures using a score or independence test.
We train a variational inference model to predict the causal structure from observational/interventional data.
Our models exhibit robust generalization capabilities under substantial distribution shift.
arXiv Detail & Related papers (2022-05-25T17:37:08Z)
- Learning Neural Causal Models with Active Interventions [83.44636110899742]
We introduce an active intervention-targeting mechanism which enables a quick identification of the underlying causal structure of the data-generating process.
Our method significantly reduces the required number of interactions compared with random intervention targeting.
We demonstrate superior performance on multiple benchmarks from simulated to real-world data.
arXiv Detail & Related papers (2021-09-06T13:10:37Z)
- Domain-Class Correlation Decomposition for Generalizable Person Re-Identification [34.813965300584776]
In person re-identification, the domain and class are correlated.
We show that domain adversarial learning will lose certain information about class due to this domain-class correlation.
Our model outperforms the state-of-the-art methods on the large-scale domain generalization Re-ID benchmark.
arXiv Detail & Related papers (2021-06-29T09:45:03Z)