Direct-Effect Risk Minimization for Domain Generalization
- URL: http://arxiv.org/abs/2211.14594v1
- Date: Sat, 26 Nov 2022 15:35:36 GMT
- Title: Direct-Effect Risk Minimization for Domain Generalization
- Authors: Yuhui Li, Zejia Wu, Chao Zhang, Hongyang Zhang
- Abstract summary: We introduce the concepts of direct and indirect effects from causal inference to the domain generalization problem.
We argue that models that learn direct effects minimize the worst-case risk across correlation-shifted domains.
Experiments on 5 correlation-shifted datasets and the DomainBed benchmark verify the effectiveness of our approach.
- Score: 11.105832297850188
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of out-of-distribution (o.o.d.) generalization where
spurious correlations of attributes vary across training and test domains. This
is known as the problem of correlation shift and has raised concerns about the
reliability of machine learning. In this work, we introduce the concepts of
direct and indirect effects from causal inference to the domain generalization
problem. We argue that models that learn direct effects minimize the worst-case
risk across correlation-shifted domains. To eliminate the indirect effects, our
algorithm consists of two stages: in the first stage, we learn an
indirect-effect representation by minimizing the prediction error of domain
labels using the representation and the class label; in the second stage, we
remove the indirect effects learned in the first stage by matching each data
with another data of similar indirect-effect representation but of different
class label. We also propose a new model selection method by matching the
validation set in the same way, which is shown to improve the generalization
performance of existing models on correlation-shifted datasets. Experiments on
5 correlation-shifted datasets and the DomainBed benchmark verify the
effectiveness of our approach.
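The second stage described in the abstract can be sketched as a nearest-neighbor matching step. This is a minimal illustration only, assuming the indirect-effect representations are plain vectors, that "similar" means smallest Euclidean distance, and that each sample is matched to one sample with a different class label; `match_counterfactual_pairs` is a hypothetical helper, not the authors' code.

```python
import numpy as np

def match_counterfactual_pairs(reps, labels):
    """For each sample, find the index of the sample whose indirect-effect
    representation is closest (squared Euclidean distance) but whose class
    label differs. reps: (n, d) array; labels: (n,) array.
    Hypothetical helper illustrating the matching idea, not the paper's code."""
    n = reps.shape[0]
    # Pairwise squared Euclidean distances between representations
    dists = ((reps[:, None, :] - reps[None, :, :]) ** 2).sum(-1)
    matches = np.empty(n, dtype=int)
    for i in range(n):
        # Exclude samples sharing the same class label by masking with +inf
        candidate = np.where(labels != labels[i], dists[i], np.inf)
        matches[i] = int(np.argmin(candidate))
    return matches
```

Under this sketch, the same matching applied to the validation set would implement the proposed model-selection procedure.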
Related papers
- SALUDA: Surface-based Automotive Lidar Unsupervised Domain Adaptation [62.889835139583965]
We introduce an unsupervised auxiliary task of learning an implicit underlying surface representation simultaneously on source and target data.
As both domains share the same latent representation, the model is forced to accommodate discrepancies between the two sources of data.
Our experiments demonstrate that our method achieves a better performance than the current state of the art, both in real-to-real and synthetic-to-real scenarios.
arXiv Detail & Related papers (2023-04-06T17:36:23Z)
- Algorithms and Theory for Supervised Gradual Domain Adaptation [19.42476993856205]
We study the problem of supervised gradual domain adaptation, where labeled data from shifting distributions are available to the learner along the trajectory.
Under this setting, we provide the first generalization upper bound on the learning error under mild assumptions.
Our results are algorithm-agnostic for a range of loss functions and depend only linearly on the averaged learning error across the trajectory.
arXiv Detail & Related papers (2022-04-25T13:26:11Z)
- Disentanglement and Generalization Under Correlation Shifts [22.499106910581958]
Correlations between factors of variation are prevalent in real-world data.
Machine learning algorithms may benefit from exploiting such correlations, as they can increase predictive performance on noisy data.
We aim to learn representations which capture different factors of variation in latent subspaces.
arXiv Detail & Related papers (2021-12-29T18:55:17Z)
- Instrumental Variable-Driven Domain Generalization with Unobserved Confounders [53.735614014067394]
Domain generalization (DG) aims to learn from multiple source domains a model that can generalize well on unseen target domains.
We propose an instrumental variable-driven DG method (IV-DG) by removing the bias of the unobserved confounders with two-stage learning.
In the first stage, it learns the conditional distribution of the input features of one domain given input features of another domain.
In the second stage, it estimates the relationship by predicting labels with the learned conditional distribution.
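The two-stage estimation described above resembles two-stage least squares from instrumental-variable analysis. The following is a hedged linear sketch of that pattern, assuming both stages can be approximated by least squares; `ivdg_two_stage` is a hypothetical stand-in, not the paper's implementation.

```python
import numpy as np

def ivdg_two_stage(xa, xb, y):
    """Two-stage least-squares sketch in the spirit of IV-DG.
    Stage 1: regress one domain's features (xb) on another's (xa),
    i.e., fit the conditional mean of xb given xa.
    Stage 2: predict labels from the stage-1 fitted features.
    Linear stand-ins for the paper's learned models."""
    # Stage 1: conditional mean of xb given xa via least squares
    W1, *_ = np.linalg.lstsq(xa, xb, rcond=None)
    xb_hat = xa @ W1
    # Stage 2: estimate the label relationship from the fitted features
    W2, *_ = np.linalg.lstsq(xb_hat, y, rcond=None)
    return W1, W2
```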
arXiv Detail & Related papers (2021-10-04T13:32:57Z)
- SelfReg: Self-supervised Contrastive Regularization for Domain Generalization [7.512471799525974]
We propose a new regularization method for domain generalization based on contrastive learning: self-supervised contrastive regularization (SelfReg).
The proposed approach uses only positive data pairs, resolving various problems caused by negative pair sampling.
In the recent benchmark, DomainBed, the proposed method shows comparable performance to the conventional state-of-the-art alternatives.
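A positive-pair-only regularizer in this spirit can be sketched as follows, assuming same-class samples form the positive pairs and the penalty is a squared feature distance. This is illustrative only; the actual SelfReg loss also operates on logits and includes further components not shown here.

```python
import numpy as np

def positive_pair_regularizer(feats, labels, rng):
    """Shuffle samples within each class and penalize the squared distance
    between each feature vector and its same-class partner. Only positive
    (same-class) pairs are used, so no negative sampling is needed.
    Hedged sketch, not the SelfReg paper's exact loss."""
    loss = 0.0
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        perm = rng.permutation(idx)  # random same-class partner assignment
        loss += ((feats[idx] - feats[perm]) ** 2).sum()
    return loss / len(labels)
```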
arXiv Detail & Related papers (2021-04-20T09:08:29Z)
- Domain Adaptative Causality Encoder [52.779274858332656]
We leverage the characteristics of dependency trees and adversarial learning to address the tasks of adaptive causality identification and localisation.
We present a new causality dataset, namely MedCaus, which integrates all types of causality in the text.
arXiv Detail & Related papers (2020-11-27T04:14:55Z)
- Learning Invariant Representations and Risks for Semi-supervised Domain Adaptation [109.73983088432364]
We propose the first method that aims to simultaneously learn invariant representations and risks under the setting of semi-supervised domain adaptation (Semi-DA).
We introduce the LIRR algorithm for jointly Learning Invariant Representations and Risks.
arXiv Detail & Related papers (2020-10-09T15:42:35Z)
- Adaptively-Accumulated Knowledge Transfer for Partial Domain Adaptation [66.74638960925854]
Partial domain adaptation (PDA) deals with a realistic and challenging problem in which the source domain label space subsumes the target domain label space.
We propose an Adaptively-Accumulated Knowledge Transfer framework (A²KT) to align the relevant categories across the two domains.
arXiv Detail & Related papers (2020-08-27T00:53:43Z)
- Adaptive Risk Minimization: Learning to Adapt to Domain Shift [109.87561509436016]
A fundamental assumption of most machine learning algorithms is that the training and test data are drawn from the same underlying distribution.
In this work, we consider the problem setting of domain generalization, where the training data are structured into domains and there may be multiple test time shifts.
We introduce the framework of adaptive risk minimization (ARM), in which models are directly optimized for effective adaptation to shift by learning to adapt on the training domains.
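One way to realize learning-to-adapt is to condition each prediction on a batch-level context computed from the current (possibly shifted) batch, in the spirit of ARM's contextual variant. The linear maps below are hypothetical stand-ins for the paper's networks, shown only to make the conditioning pattern concrete.

```python
import numpy as np

def arm_cml_predict(x_batch, w_pred, w_ctx):
    """Contextual prediction sketch: average a per-example context over
    the batch, then condition every prediction on that shared context,
    so the model can adapt its outputs to batch-level shift.
    w_ctx, w_pred are hypothetical linear stand-ins for learned networks."""
    ctx = (x_batch @ w_ctx).mean(axis=0)  # batch-level context vector
    # Append the shared context to every example's features
    aug = np.concatenate([x_batch, np.tile(ctx, (len(x_batch), 1))], axis=1)
    return aug @ w_pred  # per-example scores conditioned on the batch
```

During training, one would backpropagate through this context computation on batches drawn from each training domain, so adaptation itself is what gets learned.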
arXiv Detail & Related papers (2020-07-06T17:59:30Z)
- Selecting Data Augmentation for Simulating Interventions [12.848239550098693]
Machine learning models trained with purely observational data and the principle of empirical risk minimization fail to generalize to unseen domains.
We argue that causal concepts can be used to explain the success of data augmentation by describing how they can weaken the spurious correlation between the observed domains and the task labels.
arXiv Detail & Related papers (2020-05-04T21:33:29Z)
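Simulating an intervention on a spurious attribute can be sketched as resampling that attribute independently of everything else, which breaks the attribute-label correlation in the augmented data; `intervene_on_attribute` is an illustrative helper, not the paper's augmentation-selection procedure.

```python
import numpy as np

def intervene_on_attribute(X, attr_col, rng):
    """Simulate an intervention on one spurious attribute column by
    permuting it across samples, independently of the labels and the
    remaining features. The marginal distribution of the attribute is
    preserved while its correlation with everything else is broken."""
    X_aug = X.copy()
    X_aug[:, attr_col] = rng.permutation(X_aug[:, attr_col])
    return X_aug
```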
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.