Domain Generalization Using Large Pretrained Models with Mixture-of-Adapters
- URL: http://arxiv.org/abs/2310.11031v2
- Date: Sat, 07 Dec 2024 06:57:05 GMT
- Title: Domain Generalization Using Large Pretrained Models with Mixture-of-Adapters
- Authors: Gyuseong Lee, Wooseok Jang, Jinhyeon Kim, Jaewoo Jung, Seungryong Kim,
- Abstract summary: This study is on leveraging the knowledge of large pretrained models to improve handling of OOD scenarios and tackle domain generalization problems.
We employ parameter-efficient fine-tuning (PEFT) techniques to effectively preserve OOD robustness while working with large models.
Our experiments and analysis confirm that the most effective approaches involve ensembling diverse models and increasing the scale of pretraining.
- Score: 33.401355417911084
- License:
- Abstract: Learning robust vision models that perform well in out-of-distribution (OOD) situations is an important task for model deployment in real-world settings. Despite extensive research in this field, many proposed methods have only shown minor performance improvements compared to the simplest empirical risk minimization (ERM) approach, which was evaluated on a benchmark with a limited hyperparameter search space. Our focus in this study is on leveraging the knowledge of large pretrained models to improve handling of OOD scenarios and tackle domain generalization problems. However, prior research has revealed that naively fine-tuning a large pretrained model can impair OOD robustness. Thus, we employ parameter-efficient fine-tuning (PEFT) techniques to effectively preserve OOD robustness while working with large models. Our extensive experiments and analysis confirm that the most effective approaches involve ensembling diverse models and increasing the scale of pretraining. As a result, we achieve state-of-the-art performance in domain generalization tasks. Our code and project page are available at: https://cvlab-kaist.github.io/MoA
Related papers
- Feature Protection For Out-of-distribution Generalization [24.072876186625855]
We show that protecting pre-trained features leads to a fine-tuned model more robust to generalization.
We show that protecting pre-trained features leads to a fine-tuned model more robust to OOD generalization.
arXiv Detail & Related papers (2024-05-25T03:00:06Z) - Efficiency at Scale: Investigating the Performance of Diminutive
Language Models in Clinical Tasks [2.834743715323873]
We present an investigation into the suitability of different PEFT methods to clinical decision-making tasks.
Our analysis shows that the performance of most PEFT approaches varies significantly from one task to another.
The effectiveness of PEFT methods in the clinical domain is evident, particularly for specialised models which can operate on low-cost, in-house computing infrastructure.
arXiv Detail & Related papers (2024-02-16T11:30:11Z) - Retrieval-based Knowledge Transfer: An Effective Approach for Extreme
Large Language Model Compression [64.07696663255155]
Large-scale pre-trained language models (LLMs) have demonstrated exceptional performance in various natural language processing (NLP) tasks.
However, the massive size of these models poses huge challenges for their deployment in real-world applications.
We introduce a novel compression paradigm called Retrieval-based Knowledge Transfer (RetriKT) which effectively transfers the knowledge of LLMs to extremely small-scale models.
arXiv Detail & Related papers (2023-10-24T07:58:20Z) - Model Agnostic Sample Reweighting for Out-of-Distribution Learning [38.843552982739354]
We propose a principled method, textbfAgnostic samtextbfPLe rtextbfEweighting (textbfMAPLE) to effectively address OOD problem.
Our key idea is to find an effective reweighting of the training samples so that the standard empirical risk minimization training of a large model leads to superior OOD generalization performance.
arXiv Detail & Related papers (2023-01-24T05:11:03Z) - Are Sample-Efficient NLP Models More Robust? [90.54786862811183]
We investigate the relationship between sample efficiency (amount of data needed to reach a given ID accuracy) and robustness (how models fare on OOD evaluation)
We find that higher sample efficiency is only correlated with better average OOD robustness on some modeling interventions and tasks, but not others.
These results suggest that general-purpose methods for improving sample efficiency are unlikely to yield universal OOD robustness improvements, since such improvements are highly dataset- and task-dependent.
arXiv Detail & Related papers (2022-10-12T17:54:59Z) - SimSCOOD: Systematic Analysis of Out-of-Distribution Generalization in
Fine-tuned Source Code Models [58.78043959556283]
We study the behaviors of models under different fine-tuning methodologies, including full fine-tuning and Low-Rank Adaptation (LoRA) fine-tuning methods.
Our analysis uncovers that LoRA fine-tuning consistently exhibits significantly better OOD generalization performance than full fine-tuning across various scenarios.
arXiv Detail & Related papers (2022-10-10T16:07:24Z) - Improving Pre-trained Language Model Fine-tuning with Noise Stability
Regularization [94.4409074435894]
We propose a novel and effective fine-tuning framework, named Layerwise Noise Stability Regularization (LNSR)
Specifically, we propose to inject the standard Gaussian noise and regularize hidden representations of the fine-tuned model.
We demonstrate the advantages of the proposed method over other state-of-the-art algorithms including L2-SP, Mixout and SMART.
arXiv Detail & Related papers (2022-06-12T04:42:49Z) - Sample-Efficient Reinforcement Learning via Conservative Model-Based
Actor-Critic [67.00475077281212]
Model-based reinforcement learning algorithms are more sample efficient than their model-free counterparts.
We propose a novel approach that achieves high sample efficiency without the strong reliance on accurate learned models.
We show that CMBAC significantly outperforms state-of-the-art approaches in terms of sample efficiency on several challenging tasks.
arXiv Detail & Related papers (2021-12-16T15:33:11Z) - Sample Efficient Reinforcement Learning via Model-Ensemble Exploration
and Exploitation [3.728946517493471]
MEEE is a model-ensemble method that consists of optimistic exploration and weighted exploitation.
Our approach outperforms other model-free and model-based state-of-the-art methods, especially in sample complexity.
arXiv Detail & Related papers (2021-07-05T07:18:20Z) - Models, Pixels, and Rewards: Evaluating Design Trade-offs in Visual
Model-Based Reinforcement Learning [109.74041512359476]
We study a number of design decisions for the predictive model in visual MBRL algorithms.
We find that a range of design decisions that are often considered crucial, such as the use of latent spaces, have little effect on task performance.
We show how this phenomenon is related to exploration and how some of the lower-scoring models on standard benchmarks will perform the same as the best-performing models when trained on the same training data.
arXiv Detail & Related papers (2020-12-08T18:03:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.