Is Large-Scale Pretraining the Secret to Good Domain Generalization?
- URL: http://arxiv.org/abs/2412.02856v2
- Date: Thu, 23 Jan 2025 15:09:35 GMT
- Title: Is Large-Scale Pretraining the Secret to Good Domain Generalization?
- Authors: Piotr Teterwak, Kuniaki Saito, Theodoros Tsiligkaridis, Bryan A. Plummer, Kate Saenko,
- Abstract summary: Multi-Source Domain Generalization (DG) is the task of training on multiple source domains and achieving high classification performance on unseen target domains.
Recent methods combine robust features from web-scale pretrained backbones with new features learned from source data, and this has dramatically improved benchmark results.
We show that all evaluated DG methods struggle on DomainBed-OOP, while recent methods excel on DomainBed-IP.
- Score: 69.80606575323691
- License:
- Abstract: Multi-Source Domain Generalization (DG) is the task of training on multiple source domains and achieving high classification performance on unseen target domains. Recent methods combine robust features from web-scale pretrained backbones with new features learned from source data, and this has dramatically improved benchmark results. However, it remains unclear if DG finetuning methods are becoming better over time, or if improved benchmark performance is simply an artifact of stronger pre-training. Prior studies have shown that perceptual similarity to pre-training data correlates with zero-shot performance, but we find the effect limited in the DG setting. Instead, we posit that having perceptually similar data in pretraining is not enough; and that it is how well these data were learned that determines performance. This leads us to introduce the Alignment Hypothesis, which states that the final DG performance will be high if and only if alignment of image and class label text embeddings is high. Our experiments confirm the Alignment Hypothesis is true, and we use it as an analysis tool of existing DG methods evaluated on DomainBed datasets by splitting evaluation data into In-pretraining (IP) and Out-of-pretraining (OOP). We show that all evaluated DG methods struggle on DomainBed-OOP, while recent methods excel on DomainBed-IP. Put together, our findings highlight the need for DG methods which can generalize beyond pretraining alignment.
Related papers
- DoGE: Domain Reweighting with Generalization Estimation [42.32000165235568]
We propose DOmain reweighting with Generalization Estimation (DoGE)
In our experiments, we extensively show how DoGE improves the generalization of the base model to any target data mixture.
DoGE can effectively identify inter-domain dependencies, and consistently achieves better test perplexity on the target domain.
arXiv Detail & Related papers (2023-10-23T22:51:58Z) - Face Presentation Attack Detection by Excavating Causal Clues and
Adapting Embedding Statistics [9.612556145185431]
Face presentation attack detection (PAD) uses domain adaptation (DA) and domain generalization (DG) techniques to address performance degradation on unknown domains.
Most DG-based PAD solutions rely on a priori, i.e., known domain labels.
This paper proposes to model face PAD as a compound DG task from a causal perspective, linking it to model optimization.
arXiv Detail & Related papers (2023-08-28T13:11:05Z) - Domain Adaptation with Adversarial Training on Penultimate Activations [82.9977759320565]
Enhancing model prediction confidence on unlabeled target data is an important objective in Unsupervised Domain Adaptation (UDA)
We show that this strategy is more efficient and better correlated with the objective of boosting prediction confidence than adversarial training on input images or intermediate features.
arXiv Detail & Related papers (2022-08-26T19:50:46Z) - On Certifying and Improving Generalization to Unseen Domains [87.00662852876177]
Domain Generalization aims to learn models whose performance remains high on unseen domains encountered at test-time.
It is challenging to evaluate DG algorithms comprehensively using a few benchmark datasets.
We propose a universal certification framework that can efficiently certify the worst-case performance of any DG method.
arXiv Detail & Related papers (2022-06-24T16:29:43Z) - Connect, Not Collapse: Explaining Contrastive Learning for Unsupervised
Domain Adaptation [88.5448806952394]
We consider unsupervised domain adaptation (UDA), where labeled data from a source domain and unlabeled data from a target domain are used to learn a classifier for the target domain.
We show that contrastive pre-training, which learns features on unlabeled source and target data and then fine-tunes on labeled source data, is competitive with strong UDA methods.
arXiv Detail & Related papers (2022-04-01T16:56:26Z) - Unified Instance and Knowledge Alignment Pretraining for Aspect-based
Sentiment Analysis [96.53859361560505]
Aspect-based Sentiment Analysis (ABSA) aims to determine the sentiment polarity towards an aspect.
There always exists severe domain shift between the pretraining and downstream ABSA datasets.
We introduce a unified alignment pretraining framework into the vanilla pretrain-finetune pipeline.
arXiv Detail & Related papers (2021-10-26T04:03:45Z) - Reappraising Domain Generalization in Neural Networks [8.06370138649329]
Domain generalization (DG) of machine learning algorithms is defined as their ability to learn a domain agnostic hypothesis from multiple training distributions.
We find that a straightforward Empirical Risk Minimization (ERM) baseline consistently outperforms existing DG methods.
We propose a classwise-DG formulation, where for each class, we randomly select one of the domains and keep it aside for testing.
arXiv Detail & Related papers (2021-10-15T10:06:40Z) - Self-Domain Adaptation for Face Anti-Spoofing [31.441928816043536]
We propose a self-domain adaptation framework to leverage the unlabeled test domain data at inference.
A meta-learning based adaptor learning algorithm is proposed using the data of multiple source domains at the training step.
arXiv Detail & Related papers (2021-02-24T08:46:39Z) - Towards Fair Cross-Domain Adaptation via Generative Learning [50.76694500782927]
Domain Adaptation (DA) targets at adapting a model trained over the well-labeled source domain to the unlabeled target domain lying in different distributions.
We develop a novel Generative Few-shot Cross-domain Adaptation (GFCA) algorithm for fair cross-domain classification.
arXiv Detail & Related papers (2020-03-04T23:25:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.