Explore and Exploit the Diverse Knowledge in Model Zoo for Domain
Generalization
- URL: http://arxiv.org/abs/2306.02595v1
- Date: Mon, 5 Jun 2023 04:58:41 GMT
- Title: Explore and Exploit the Diverse Knowledge in Model Zoo for Domain
Generalization
- Authors: Yimeng Chen, Tianyang Hu, Fengwei Zhou, Zhenguo Li, Zhiming Ma
- Abstract summary: We propose a new algorithm for integrating diverse pretrained models, not limited to the strongest models, in order to achieve enhanced out-of-distribution generalization performance.
Our proposed method demonstrates state-of-the-art empirical results on a variety of datasets, thus validating the benefits of utilizing diverse knowledge.
- Score: 40.28810906825559
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The proliferation of pretrained models, as a result of advancements in
pretraining techniques, has led to the emergence of a vast zoo of publicly
available models. Effectively utilizing these resources to obtain models with
robust out-of-distribution generalization capabilities for downstream tasks has
become a crucial area of research. Previous research has primarily focused on
identifying the most powerful models within the model zoo, neglecting to fully
leverage the diverse inductive biases contained within. This paper argues that
the knowledge contained in weaker models is valuable and presents a method for
leveraging the diversity within the model zoo to improve out-of-distribution
generalization capabilities. Specifically, we investigate the behaviors of
various pretrained models across different domains of downstream tasks by
characterizing the variations in their encoded representations in terms of two
dimensions: diversity shift and correlation shift. This characterization
enables us to propose a new algorithm for integrating diverse pretrained
models, not limited to the strongest models, in order to achieve enhanced
out-of-distribution generalization performance. Our proposed method
demonstrates state-of-the-art empirical results on a variety of datasets, thus
validating the benefits of utilizing diverse knowledge.
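The abstract does not spell out the integration algorithm itself, but the core idea, profiling each pretrained encoder's representations along the diversity-shift and correlation-shift axes and then combining a spread-out subset of encoders rather than only the single strongest one, can be sketched roughly as below. This is a minimal illustration under stated assumptions: the helper names (estimate_diversity_shift, estimate_correlation_shift, select_diverse_models) and the crude distribution-gap proxies are illustrative placeholders, not taken from the paper.

```python
# Hypothetical sketch (not the authors' code): profile each pretrained encoder
# along two axes -- diversity shift and correlation shift -- and then pick a
# spread-out subset of encoders to combine, instead of only the single best one.
import numpy as np


def estimate_diversity_shift(z_src: np.ndarray, z_tgt: np.ndarray) -> float:
    # Crude proxy: gap between the feature distributions of two domains,
    # measured on per-dimension means and variances of the encoded features.
    mean_gap = np.abs(z_src.mean(axis=0) - z_tgt.mean(axis=0)).mean()
    var_gap = np.abs(z_src.var(axis=0) - z_tgt.var(axis=0)).mean()
    return float(mean_gap + var_gap)


def estimate_correlation_shift(z_src, y_src, z_tgt, y_tgt, n_classes) -> float:
    # Crude proxy: how far class-conditional feature means move across domains,
    # i.e. a change in how the features correlate with the labels.
    gaps = []
    for c in range(n_classes):
        src_c, tgt_c = z_src[y_src == c], z_tgt[y_tgt == c]
        if len(src_c) and len(tgt_c):
            gaps.append(np.abs(src_c.mean(axis=0) - tgt_c.mean(axis=0)).mean())
    return float(np.mean(gaps)) if gaps else 0.0


def select_diverse_models(profiles, k=3):
    # Greedy farthest-point selection over (diversity, correlation) profiles,
    # so the chosen encoders carry complementary rather than redundant biases.
    names = list(profiles)
    chosen = [names[0]]
    while len(chosen) < min(k, len(names)):
        remaining = [n for n in names if n not in chosen]
        chosen.append(max(remaining, key=lambda n: min(
            np.linalg.norm(np.subtract(profiles[n], profiles[c])) for c in chosen)))
    return chosen


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    profiles = {}
    for name in ["encoder_a", "encoder_b", "encoder_c", "encoder_d"]:
        # Stand-in features: in practice these would be each encoder's embeddings
        # of the source and held-out target domains of the downstream task.
        z_src = rng.normal(size=(256, 16))
        z_tgt = rng.normal(loc=rng.uniform(0, 2), scale=1.5, size=(256, 16))
        y_src = rng.integers(0, 4, size=256)
        y_tgt = rng.integers(0, 4, size=256)
        profiles[name] = (estimate_diversity_shift(z_src, z_tgt),
                          estimate_correlation_shift(z_src, y_src, z_tgt, y_tgt, 4))
    print(select_diverse_models(profiles, k=2))
```

In practice the profiles would be computed from each encoder's embeddings of the available training domains, and the selected encoders' features or predictions would then be combined for the downstream task.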
Related papers
- Learning Multimodal Latent Generative Models with Energy-Based Prior [3.6648642834198797]
We propose a novel framework that integrates the latent generative model with the EBM.
This approach results in a more expressive and informative prior, better capturing information across multiple modalities.
arXiv Detail & Related papers (2024-09-30T01:38:26Z)
- Bridging Generative and Discriminative Models for Unified Visual Perception with Diffusion Priors [56.82596340418697]
We propose a simple yet effective framework comprising a pre-trained Stable Diffusion (SD) model containing rich generative priors, a unified head (U-head) capable of integrating hierarchical representations, and an adapted expert providing discriminative priors.
Comprehensive investigations unveil characteristics of Vermouth, such as the varying granularity of perception concealed in latent variables at distinct time steps and across various U-Net stages.
The promising results demonstrate the potential of diffusion models as formidable learners, establishing their significance in furnishing informative and robust visual representations.
arXiv Detail & Related papers (2024-01-29T10:36:57Z)
- Has Your Pretrained Model Improved? A Multi-head Posterior Based Approach [25.927323251675386]
We leverage the meta-features associated with each entity as a source of worldly knowledge and employ entity representations from the models.
We propose using the consistency between these representations and the meta-features as a metric for evaluating pre-trained models.
Our method's effectiveness is demonstrated across various domains, including models with relational datasets, large language models, and image models (a rough consistency-score sketch appears after this list).
arXiv Detail & Related papers (2024-01-02T17:08:26Z)
- Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model [74.62272538148245]
We show that for arbitrary pairings of pretrained models, one model extracts significant data context unavailable in the other.
We investigate if it is possible to transfer such "complementary" knowledge from one model to another without performance degradation.
arXiv Detail & Related papers (2023-10-26T17:59:46Z)
- Towards Mode Balancing of Generative Models via Diversity Weights [1.2354076490479513]
We present diversity weights, a training scheme that increases a model's output diversity by balancing the modes in the training dataset.
We discuss connections of our approach to diversity, equity, and inclusion in generative machine learning more generally, and computational creativity specifically.
arXiv Detail & Related papers (2023-04-24T09:55:17Z)
- SimSCOOD: Systematic Analysis of Out-of-Distribution Generalization in Fine-tuned Source Code Models [58.78043959556283]
We study the behaviors of models under different fine-tuning methodologies, including full fine-tuning and Low-Rank Adaptation (LoRA) fine-tuning methods.
Our analysis uncovers that LoRA fine-tuning consistently exhibits significantly better OOD generalization performance than full fine-tuning across various scenarios.
arXiv Detail & Related papers (2022-10-10T16:07:24Z)
- An Empirical Study on Distribution Shift Robustness From the Perspective of Pre-Training and Data Augmentation [91.62129090006745]
This paper studies the distribution shift problem from the perspective of pre-training and data augmentation.
We provide the first comprehensive empirical study focusing on pre-training and data augmentation.
arXiv Detail & Related papers (2022-05-25T13:04:53Z)
- Sample Efficient Reinforcement Learning via Model-Ensemble Exploration and Exploitation [3.728946517493471]
MEEE is a model-ensemble method that consists of optimistic exploration and weighted exploitation.
Our approach outperforms other model-free and model-based state-of-the-art methods, especially in sample complexity.
arXiv Detail & Related papers (2021-07-05T07:18:20Z)
- Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z)
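As referenced above, the following is a rough, hypothetical illustration of the "consistency between entity representations and meta-features" idea from "Has Your Pretrained Model Improved? A Multi-head Posterior Based Approach". Linear centered kernel alignment (CKA) is used here as one plausible consistency score; the paper's own multi-head posterior formulation is not reproduced, and all names below are illustrative assumptions.

```python
# Hypothetical sketch: score a pretrained model by how consistent its entity
# embeddings are with known meta-features, using linear CKA as one possible
# consistency measure (the paper's own multi-head posterior metric differs).
import numpy as np


def linear_cka(X: np.ndarray, Y: np.ndarray) -> float:
    # Linear centered kernel alignment between two matrices with matched rows
    # (one row per entity); near 1.0 means strongly aligned, near 0 means unrelated.
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    hsic = np.linalg.norm(X.T @ Y, ord="fro") ** 2
    denom = np.linalg.norm(X.T @ X, ord="fro") * np.linalg.norm(Y.T @ Y, ord="fro")
    return float(hsic / denom)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    meta = rng.normal(size=(500, 8))            # known meta-features per entity
    aligned = meta @ rng.normal(size=(8, 32))   # embeddings that encode the meta-features
    unrelated = rng.normal(size=(500, 32))      # embeddings unrelated to the meta-features
    print("aligned model score:  ", round(linear_cka(aligned, meta), 3))
    print("unrelated model score:", round(linear_cka(unrelated, meta), 3))
```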