Overcoming Shortcut Learning in a Target Domain by Generalizing Basic
Visual Factors from a Source Domain
- URL: http://arxiv.org/abs/2207.10002v1
- Date: Wed, 20 Jul 2022 16:05:32 GMT
- Authors: Piyapat Saranrittichai, Chaithanya Kumar Mummadi, Claudia Blaiotta,
Mauricio Munoz and Volker Fischer
- Abstract summary: Shortcut learning occurs when a deep neural network overly relies on spurious correlations in the training dataset to solve downstream tasks.
We propose a novel approach to mitigate shortcut learning in uncontrolled target domains.
- Score: 7.012240324005977
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Shortcut learning occurs when a deep neural network overly relies on spurious
correlations in the training dataset in order to solve downstream tasks. Prior
works have shown how this impairs the compositional generalization capability
of deep learning models. To address this problem, we propose a novel approach
to mitigate shortcut learning in uncontrolled target domains. Our approach
extends the training set with an additional dataset (the source domain), which
is specifically designed to facilitate learning independent representations of
basic visual factors. We benchmark our idea on synthetic target domains where
we explicitly control shortcut opportunities as well as real-world target
domains. Furthermore, we analyze the effect of different specifications of the
source domain and the network architecture on compositional generalization. Our
main finding is that leveraging data from a source domain is an effective way
to mitigate shortcut learning. By promoting independence across different
factors of variation in the learned representations, networks can learn to
consider only predictive factors and ignore potential shortcut factors during
inference.
Related papers
- Decentralized Learning Strategies for Estimation Error Minimization with Graph Neural Networks [94.2860766709971]
We address the challenge of sampling and remote estimation for autoregressive Markovian processes in a wireless network with statistically-identical agents.
Our goal is to minimize time-average estimation error and/or age of information with decentralized scalable sampling and transmission policies.
arXiv Detail & Related papers (2024-04-04T06:24:11Z)
- Bridging Domains with Approximately Shared Features [26.096779584142986]
Multi-source domain adaptation aims to reduce performance degradation when applying machine learning models to unseen domains.
Some advocate for learning invariant features from source domains, while others favor more diverse features.
We propose a statistical framework that distinguishes the utilities of features based on the variance of their correlation to label $y$ across domains.
arXiv Detail & Related papers (2024-03-11T04:25:41Z)
- Multi-modal Causal Structure Learning and Root Cause Analysis [67.67578590390907]
We propose Mulan, a unified multi-modal causal structure learning method for root cause localization.
We leverage a log-tailored language model to facilitate log representation learning, converting log sequences into time-series data.
We also introduce a novel key performance indicator-aware attention mechanism for assessing modality reliability and co-learning a final causal graph.
arXiv Detail & Related papers (2024-02-04T05:50:38Z)
- CDFSL-V: Cross-Domain Few-Shot Learning for Videos [58.37446811360741]
Few-shot video action recognition is an effective approach to recognizing new categories with only a few labeled examples.
Existing methods in video action recognition rely on large labeled datasets from the same domain.
We propose a novel cross-domain few-shot video action recognition method that leverages self-supervised learning and curriculum learning.
arXiv Detail & Related papers (2023-09-07T19:44:27Z)
- Meta-causal Learning for Single Domain Generalization [102.53303707563612]
Single domain generalization aims to learn a model from a single training domain (source domain) and apply it to multiple unseen test domains (target domains).
Existing methods focus on expanding the distribution of the training domain to cover the target domains, but without estimating the domain shift between the source and target domains.
We propose a new learning paradigm, namely simulate-analyze-reduce, which first simulates the domain shift by building an auxiliary domain as the target domain, then learns to analyze the causes of domain shift, and finally learns to reduce the domain shift for model adaptation.
arXiv Detail & Related papers (2023-04-07T15:46:38Z)
- Learning Good Features to Transfer Across Tasks and Domains [16.05821129333396]
We first show that such knowledge can be shared across tasks by learning a mapping between task-specific deep features in a given domain.
Then, we show that this mapping function, implemented by a neural network, is able to generalize to novel unseen domains.
arXiv Detail & Related papers (2023-01-26T18:49:39Z)
- Algorithms and Theory for Supervised Gradual Domain Adaptation [19.42476993856205]
We study the problem of supervised gradual domain adaptation, where labeled data from shifting distributions are available to the learner along the trajectory.
Under this setting, we provide the first generalization upper bound on the learning error under mild assumptions.
Our results are algorithm agnostic for a range of loss functions, and only depend linearly on the averaged learning error across the trajectory.
arXiv Detail & Related papers (2022-04-25T13:26:11Z)
- Deep transfer learning for partial differential equations under conditional shift with DeepONet [0.0]
We propose a novel transfer learning (TL) framework for task-specific learning under conditional shift with a deep operator network (DeepONet).
Inspired by the conditional embedding operator theory, we measure the statistical distance between the source domain and the target feature domain.
We show that the proposed TL framework enables fast and efficient multi-task operator learning, despite significant differences between the source and target domains.
arXiv Detail & Related papers (2022-04-20T23:23:38Z)
- Coarse to Fine: Domain Adaptive Crowd Counting via Adversarial Scoring Network [58.05473757538834]
This paper proposes a novel adversarial scoring network (ASNet) to bridge the gap across domains from coarse to fine granularity.
Three sets of migration experiments show that the proposed methods achieve state-of-the-art counting performance.
arXiv Detail & Related papers (2021-07-27T14:47:24Z)
- Joint Learning of Neural Transfer and Architecture Adaptation for Image Recognition [77.95361323613147]
Current state-of-the-art visual recognition systems rely on pretraining a neural network on a large-scale dataset and finetuning the network weights on a smaller dataset.
In this work, we prove that dynamically adapting network architectures tailored to each domain task, along with weight finetuning, improves both efficiency and effectiveness.
Our method can be easily generalized to an unsupervised paradigm by replacing supernet training with self-supervised learning in the source domain tasks and performing linear evaluation in the downstream tasks.
arXiv Detail & Related papers (2021-03-31T08:15:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.