Domain Generalization using Pretrained Models without Fine-tuning
- URL: http://arxiv.org/abs/2203.04600v1
- Date: Wed, 9 Mar 2022 09:33:59 GMT
- Title: Domain Generalization using Pretrained Models without Fine-tuning
- Authors: Ziyue Li, Kan Ren, Xinyang Jiang, Bo Li, Haipeng Zhang, Dongsheng Li
- Abstract summary: Fine-tuning pretrained models is a common practice in domain generalization (DG) tasks.
We propose a novel domain generalization paradigm to better leverage various pretrained models, named specialized ensemble learning for domain generalization (SEDGE).
SEDGE achieves significant performance improvements compared to strong baselines, including state-of-the-art methods in DG tasks.
- Score: 25.489714555859944
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Fine-tuning pretrained models is a common practice in domain generalization
(DG) tasks. However, fine-tuning is usually computationally expensive due to
the ever-growing size of pretrained models. More importantly, as recent works
have shown, it may cause overfitting on the source domains and compromise the
model's generalization ability. Generally, pretrained models possess some level of
generalization ability and can achieve decent performance regarding specific
domains and samples. However, the generalization performance of pretrained
models can vary significantly across test domains and even individual samples,
which makes it challenging to best leverage pretrained models in DG tasks. In
this paper, we propose a novel domain generalization paradigm to better
leverage various pretrained models, named specialized ensemble learning for
domain generalization (SEDGE). It first trains a linear label-space adapter
on top of fixed pretrained models, which transforms the outputs of the pretrained
model to the label space of the target domain. Then, an ensemble network aware
of model specialty is proposed to dynamically dispatch proper pretrained models
to predict each test sample. Experimental studies on several benchmarks show
that SEDGE achieves significant performance improvements compared to strong
baselines, including state-of-the-art DG methods, and reduces the
trainable parameters by ~99% and the training time by ~99.5%.
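To make the two-stage design concrete, below is a minimal PyTorch-style sketch of the paradigm as described in the abstract. The adapter shapes, the use of concatenated frozen features as the gating input, and the hard top-1 dispatch rule are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class SEDGESketch(nn.Module):
    """Hypothetical sketch of the two SEDGE components: (1) a linear
    label-space adapter per frozen pretrained model, (2) a specialty-aware
    gate that dispatches the most suitable model for each test sample."""

    def __init__(self, pretrained_models, feat_dims, num_classes):
        super().__init__()
        self.models = nn.ModuleList(pretrained_models)
        for m in self.models:                  # pretrained weights stay fixed
            for p in m.parameters():
                p.requires_grad_(False)
        # linear adapters map each model's output into the target label space
        self.adapters = nn.ModuleList(
            nn.Linear(d, num_classes) for d in feat_dims
        )
        # the gate scores each (sample, model) pair from the frozen features
        self.gate = nn.Linear(sum(feat_dims), len(feat_dims))

    def forward(self, x):
        feats = [m(x) for m in self.models]                  # frozen features
        logits = torch.stack(
            [a(f) for a, f in zip(self.adapters, feats)], dim=1
        )                                                    # (B, M, C)
        weights = torch.softmax(self.gate(torch.cat(feats, dim=-1)), dim=-1)
        idx = weights.argmax(dim=-1)           # hard top-1 dispatch per sample
        return logits[torch.arange(x.size(0), device=x.device), idx]
```

Note that only the linear adapters and the gate are trainable, which is consistent with the ~99% reduction in trainable parameters reported above.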
Related papers
- Domain Generalization Guided by Large-Scale Pre-Trained Priors [24.74398777539288]
Domain generalization (DG) aims to train a model from limited source domains, allowing it to generalize to unknown target domains.
We introduce Fine-Tune with Large-scale pre-trained Priors (FT-LP).
FT-LP incorporates the pre-trained model as a prior into the DG fine-tuning process, ensuring that the model refers to its pre-trained model at each optimization step.
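The summary does not specify the form of the prior; one common way to keep a model referring to its pretrained weights at every optimization step is an L2 penalty toward them (as in L2-SP). The sketch below encodes that assumption and is not necessarily FT-LP itself; `lam` is an assumed hyperparameter.

```python
def finetune_step_with_prior(model, prior_params, loss_fn, batch, optimizer,
                             lam=0.01):
    """One fine-tuning step regularized toward frozen pretrained parameters.
    `prior_params` is a snapshot taken before fine-tuning begins, e.g.
    [p.detach().clone() for p in model.parameters()]."""
    x, y = batch
    task_loss = loss_fn(model(x), y)
    # weight-space prior: quadratic pull toward the pretrained parameters
    prior_loss = sum(((p - p0) ** 2).sum()
                     for p, p0 in zip(model.parameters(), prior_params))
    loss = task_loss + lam * prior_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```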
arXiv Detail & Related papers (2024-06-09T03:32:32Z) - LEVI: Generalizable Fine-tuning via Layer-wise Ensemble of Different Views [28.081794908107604]
Fine-tuning is used to leverage the power of pre-trained foundation models in new downstream tasks.
Recent studies have observed challenges in the generalization of fine-tuned models to unseen distributions.
We propose a novel generalizable fine-tuning method LEVI, where the pre-trained model is adaptively ensembled layer-wise with a small task-specific model.
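As a rough illustration of layer-wise ensembling, the sketch below mixes the outputs of a frozen pretrained backbone and a small trainable model layer by layer with learned gates; the one-to-one layer pairing and the sigmoid gating form are assumptions, since the summary does not give the details.

```python
import torch
import torch.nn as nn

class LayerwiseEnsembleSketch(nn.Module):
    """LEVI-style layer-wise ensemble (illustrative): each layer's output is
    a gated mix of a frozen pretrained layer and a small task-specific layer.
    Paired layers are assumed to produce features of the same shape."""

    def __init__(self, frozen_layers, small_layers, out_dim, num_classes):
        super().__init__()
        self.frozen = nn.ModuleList(frozen_layers)   # pretrained, kept fixed
        self.small = nn.ModuleList(small_layers)     # task-specific, trained
        for layer in self.frozen:
            for p in layer.parameters():
                p.requires_grad_(False)
        # one learned mixing weight per layer
        self.alpha = nn.Parameter(torch.zeros(len(frozen_layers)))
        self.head = nn.Linear(out_dim, num_classes)

    def forward(self, x):
        h = x
        for i, (f, s) in enumerate(zip(self.frozen, self.small)):
            g = torch.sigmoid(self.alpha[i])         # adaptive mix in [0, 1]
            h = g * f(h) + (1 - g) * s(h)
        return self.head(h)
```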
arXiv Detail & Related papers (2024-02-07T08:16:40Z)
- Pre-trained Model Guided Fine-Tuning for Zero-Shot Adversarial Robustness [52.9493817508055]
We propose Pre-trained Model Guided Adversarial Fine-Tuning (PMG-AFT) to enhance the model's zero-shot adversarial robustness.
Our approach consistently improves clean accuracy by an average of 8.72%.
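A plausible reading of "pre-trained model guided" adversarial fine-tuning is a loss that pulls the fine-tuned model's predictions on adversarial inputs back toward the frozen pretrained model. The sketch below encodes that reading only; the choice of logits (rather than features) as the guidance signal and the weight `beta` are assumptions.

```python
import torch
import torch.nn.functional as F

def guided_adversarial_loss(model, frozen_pretrained, x_adv, y, beta=1.0):
    """Adversarial fine-tuning loss plus a KL guidance term that keeps the
    fine-tuned model's predictions on adversarial inputs close to those of
    the frozen pretrained model (an assumed form, not PMG-AFT verbatim)."""
    logits = model(x_adv)
    adv_loss = F.cross_entropy(logits, y)
    with torch.no_grad():                      # reference model stays fixed
        ref = frozen_pretrained(x_adv)
    guide = F.kl_div(
        F.log_softmax(logits, dim=-1),
        F.softmax(ref, dim=-1),
        reduction="batchmean",
    )
    return adv_loss + beta * guide
```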
arXiv Detail & Related papers (2024-01-09T04:33:03Z)
- Consistency Regularization for Generalizable Source-free Domain Adaptation [62.654883736925456]
Source-free domain adaptation (SFDA) aims to adapt a well-trained source model to an unlabelled target domain without accessing the source dataset.
Existing SFDA methods only assess their adapted models on the target training set, neglecting the data from unseen but identically distributed testing sets.
We propose a consistency regularization framework to develop a more generalizable SFDA method.
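The summary leaves the form of the regularizer open; a generic instance is to enforce agreement between predictions on two random augmentations of the same unlabeled target sample, as sketched below under that assumption.

```python
import torch.nn.functional as F

def _kl(p, q, eps=1e-8):
    # row-wise KL(p || q) between probability vectors
    return (p * (p.clamp_min(eps).log() - q.clamp_min(eps).log())).sum(-1)

def consistency_loss(model, x, augment):
    """Symmetric-KL consistency between two augmented views of the same
    unlabeled target batch (a generic stand-in for the paper's framework)."""
    p1 = F.softmax(model(augment(x)), dim=-1)
    p2 = F.softmax(model(augment(x)), dim=-1)
    return 0.5 * (_kl(p1, p2) + _kl(p2, p1)).mean()
```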
arXiv Detail & Related papers (2023-08-03T07:45:53Z)
- An Empirical Study of Pre-trained Model Selection for Out-of-Distribution Generalization and Calibration [11.102950630209879]
In out-of-distribution (OOD) generalization tasks, fine-tuning pre-trained models has become a prevalent strategy.
We examined how pre-trained model size, pre-training dataset size, and training strategies impact generalization and uncertainty calibration.
arXiv Detail & Related papers (2023-07-17T01:27:10Z)
- Universal Semi-supervised Model Adaptation via Collaborative Consistency Training [92.52892510093037]
We introduce a realistic and challenging domain adaptation problem called Universal Semi-supervised Model Adaptation (USMA).
We propose a collaborative consistency training framework that regularizes the prediction consistency between two models.
Experimental results demonstrate the effectiveness of our method on several benchmark datasets.
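A minimal sketch of "prediction consistency between two models" in a semi-supervised setting, under assumed details: both models fit the labeled data, while a symmetric KL term on unlabeled data keeps their predictions aligned.

```python
import torch.nn.functional as F

def collaborative_step(model_a, model_b, labeled, x_unlabeled, opt, lam=1.0):
    """One co-training step (details assumed): supervised loss on labeled
    data plus a symmetric-KL consistency term on unlabeled data."""
    x_l, y_l = labeled
    sup = (F.cross_entropy(model_a(x_l), y_l)
           + F.cross_entropy(model_b(x_l), y_l))
    log_pa = F.log_softmax(model_a(x_unlabeled), dim=-1)
    log_pb = F.log_softmax(model_b(x_unlabeled), dim=-1)
    cons = 0.5 * (F.kl_div(log_pa, log_pb.exp(), reduction="batchmean")
                  + F.kl_div(log_pb, log_pa.exp(), reduction="batchmean"))
    loss = sup + lam * cons
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```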
arXiv Detail & Related papers (2023-07-07T08:19:40Z)
- Gradient Estimation for Unseen Domain Risk Minimization with Pre-Trained Models [6.3671178249601805]
Large-scale pre-trained models can enhance domain generalization by leveraging their generalization power.
However, these pre-trained models still lack target task-specific knowledge, due to discrepancies between the pre-training objectives and the target task.
We propose a new domain generalization method that estimates unobservable gradients that reduce potential risks in unseen domains.
arXiv Detail & Related papers (2023-02-03T02:12:09Z)
- SimSCOOD: Systematic Analysis of Out-of-Distribution Generalization in Fine-tuned Source Code Models [58.78043959556283]
We study the behaviors of models under different fine-tuning methodologies, including full fine-tuning and Low-Rank Adaptation (LoRA) fine-tuning methods.
Our analysis uncovers that LoRA fine-tuning consistently exhibits significantly better OOD generalization performance than full fine-tuning across various scenarios.
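For reference, the LoRA setup the study compares against full fine-tuning keeps the pretrained weight frozen and learns a low-rank additive update; below is a minimal, standard-form LoRA linear layer (initialization and scaling follow the usual convention, not this paper specifically).

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Standard LoRA layer: y = W x + (alpha / r) * B A x, where the
    pretrained W is frozen and only the rank-r factors A, B are trained."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)   # freeze pretrained weight
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)
```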
arXiv Detail & Related papers (2022-10-10T16:07:24Z)
- Learning to Generalize across Domains on Single Test Samples [126.9447368941314]
We learn to generalize across domains on single test samples.
We formulate the adaptation to the single test sample as a variational Bayesian inference problem.
Our model achieves performance at least comparable to -- and often better than -- state-of-the-art methods on multiple benchmarks for domain generalization.
arXiv Detail & Related papers (2022-02-16T13:21:04Z)
- Adapt-and-Distill: Developing Small, Fast and Effective Pretrained Language Models for Domains [45.07506437436464]
We present a general approach to developing small, fast and effective pre-trained models for specific domains.
This is achieved by adapting the off-the-shelf general pre-trained models and performing task-agnostic knowledge distillation in target domains.
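Task-agnostic knowledge distillation here can be pictured with the standard soft-target formulation (the paper may additionally match hidden states); the sketch below assumes only a teacher and student producing logits over a shared vocabulary.

```python
import torch.nn.functional as F

def distill_loss(student_logits, teacher_logits, T=2.0):
    """Soft-target distillation on in-domain data, no task labels needed:
    the student matches the teacher's temperature-softened distribution.
    The T^2 factor keeps gradient magnitudes comparable across temperatures."""
    s = F.log_softmax(student_logits / T, dim=-1)
    t = F.softmax(teacher_logits / T, dim=-1)
    return F.kl_div(s, t, reduction="batchmean") * (T * T)
```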
arXiv Detail & Related papers (2021-06-25T07:37:05Z)
- Improving QA Generalization by Concurrent Modeling of Multiple Biases [61.597362592536896]
Existing NLP datasets contain various biases that models can easily exploit to achieve high performance on the corresponding evaluation sets.
We propose a general framework for improving the performance on both in-domain and out-of-domain datasets by concurrent modeling of multiple biases in the training data.
We extensively evaluate our framework on extractive question answering with training data from various domains with multiple biases of different strengths.
arXiv Detail & Related papers (2020-10-07T11:18:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.