Gradient Estimation for Unseen Domain Risk Minimization with Pre-Trained
Models
- URL: http://arxiv.org/abs/2302.01497v3
- Date: Sat, 9 Sep 2023 08:23:20 GMT
- Title: Gradient Estimation for Unseen Domain Risk Minimization with Pre-Trained
Models
- Authors: Byounggyu Lew, Donghyun Son, Buru Chang
- Abstract summary: Large-scale pre-trained models can enhance domain generalization by leveraging their generalization power.
However, these pre-trained models still lack target task-specific knowledge because of discrepancies between the pre-training objectives and the target task.
We propose a new domain generalization method that estimates unobservable gradients that reduce potential risks in unseen domains.
- Score: 6.3671178249601805
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Domain generalization aims to build generalized models that perform well on
unseen domains when only source domains are available for model optimization.
Recent studies have shown that large-scale pre-trained models can enhance
domain generalization by leveraging their generalization power. However, these
pre-trained models still lack target task-specific knowledge due to discrepancies
between the pre-training objectives and the target task. Although the
task-specific knowledge could be learned from source domains by fine-tuning,
this hurts the generalization power of pre-trained models due to gradient bias
toward the source domains. To alleviate this problem, we propose a new domain
generalization method that estimates unobservable gradients that reduce
potential risks in unseen domains using a large-scale pre-trained model. These
estimated unobservable gradients allow the pre-trained model to further learn
task-specific knowledge while preserving its generalization ability by
relieving the gradient bias. Our experimental results show that our method
outperforms baseline methods on DomainBed, a standard benchmark in domain
generalization. We also provide extensive analyses to demonstrate that the
pre-trained model can learn task-specific knowledge without sacrificing its
generalization power.
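To make the gradient-bias intuition concrete, the following is a minimal, hypothetical PyTorch-style sketch, not the paper's actual estimator: the "unseen-domain" gradient is approximated here by the gradient of a simple proximal penalty toward the frozen pre-trained weights, and the names fine_tune_step, pretrained_model, and alpha are introduced only for illustration.

```python
# Illustrative sketch only (not the method proposed in the paper): blend the
# observed source-domain gradient with a surrogate gradient derived from a
# frozen copy of the pre-trained model, to temper gradient bias toward the
# source domains during fine-tuning.
import torch
import torch.nn.functional as F


def fine_tune_step(model, pretrained_model, batch, optimizer, alpha=0.5):
    """One update step; alpha weights the surrogate gradient (hypothetical)."""
    x, y = batch
    optimizer.zero_grad()

    # Observed gradient: task loss on the available source-domain batch.
    loss_src = F.cross_entropy(model(x), y)
    loss_src.backward()

    # Surrogate gradient: gradient of 0.5 * ||p - p0||^2, i.e. (p - p0),
    # which pulls the weights back toward the generalizable initialization.
    with torch.no_grad():
        for p, p0 in zip(model.parameters(), pretrained_model.parameters()):
            if p.grad is not None:
                p.grad.mul_(1.0 - alpha).add_(p - p0, alpha=alpha)

    optimizer.step()
```

In this sketch, pretrained_model would be a frozen copy of the initialization (e.g., copy.deepcopy(model) taken before fine-tuning, with gradients disabled), and alpha trades off task-specific learning against preserving the pre-trained behavior.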
Related papers
- Domain Generalization Guided by Large-Scale Pre-Trained Priors [24.74398777539288]
Domain generalization (DG) aims to train a model from limited source domains, allowing it to generalize to unknown target domains.
We introduce Fine-Tune with Large-scale pre-trained Priors (FT-LP).
FT-LP incorporates the pre-trained model as a prior into the DG fine-tuning process, ensuring that the model refers to its pre-trained counterpart at each optimization step (see the illustrative sketch after this list).
arXiv Detail & Related papers (2024-06-09T03:32:32Z)
- On the Generalization Ability of Unsupervised Pretraining [53.06175754026037]
Recent advances in unsupervised learning have shown that unsupervised pre-training, followed by fine-tuning, can improve model generalization.
This paper introduces a novel theoretical framework that illuminates the critical factor influencing the transferability of knowledge acquired during unsupervised pre-training to the subsequent fine-tuning phase.
Our results contribute to a better understanding of the unsupervised pre-training and fine-tuning paradigm, and can shed light on the design of more effective pre-training algorithms.
arXiv Detail & Related papers (2024-03-11T16:23:42Z)
- Modeling Uncertain Feature Representation for Domain Generalization [49.129544670700525]
We show that our method consistently improves the network generalization ability on multiple vision tasks.
Our methods are simple yet effective and can be readily integrated into networks without additional trainable parameters or loss constraints.
arXiv Detail & Related papers (2023-01-16T14:25:02Z)
- SimSCOOD: Systematic Analysis of Out-of-Distribution Generalization in Fine-tuned Source Code Models [58.78043959556283]
We study the behaviors of models under different fine-tuning methodologies, including full fine-tuning and Low-Rank Adaptation (LoRA) fine-tuning methods.
Our analysis uncovers that LoRA fine-tuning consistently exhibits significantly better OOD generalization performance than full fine-tuning across various scenarios.
arXiv Detail & Related papers (2022-10-10T16:07:24Z)
- Not to Overfit or Underfit? A Study of Domain Generalization in Question Answering [18.22045610080848]
Machine learning models are prone to overfitting their source (training) distributions.
Here we examine the contrasting view that multi-source domain generalization (DG) is in fact a problem of mitigating source domain underfitting.
arXiv Detail & Related papers (2022-05-15T10:53:40Z)
- Domain Generalization using Pretrained Models without Fine-tuning [25.489714555859944]
Fine-tuning pretrained models is a common practice in domain generalization (DG) tasks.
We propose a novel domain generalization paradigm to better leverage various pretrained models, named specialized ensemble learning for domain generalization (SEDGE).
SEDGE achieves significant performance improvements compared to strong baselines, including state-of-the-art methods, in DG tasks.
arXiv Detail & Related papers (2022-03-09T09:33:59Z)
- Debiased Batch Normalization via Gaussian Process for Generalizable Person Re-Identification [84.32086702849338]
Generalizable person re-identification aims to learn, from only a few labeled source domains, a model that performs well on unseen domains.
We propose a novel Debiased Batch Normalization via Gaussian Process approach (GDNorm) for generalizable person re-identification.
arXiv Detail & Related papers (2022-03-03T14:14:51Z)
- Towards Data-Free Domain Generalization [12.269045654957765]
How can knowledge contained in models trained on different source data domains be merged into a single model that generalizes well to unseen target domains?
Prior domain generalization methods typically rely on using source domain data, making them unsuitable for private decentralized data.
We propose DEKAN, an approach that extracts and fuses domain-specific knowledge from the available teacher models into a student model robust to domain shift.
arXiv Detail & Related papers (2021-10-09T11:44:05Z)
- Self-balanced Learning For Domain Generalization [64.99791119112503]
Domain generalization aims to learn a prediction model on multi-domain source data such that the model can generalize to a target domain with unknown statistics.
Most existing approaches have been developed under the assumption that the source data is well-balanced in terms of both domain and class.
We propose a self-balanced domain generalization framework that adaptively learns the weights of losses to alleviate the bias caused by different distributions of the multi-domain source data.
arXiv Detail & Related papers (2021-08-31T03:17:54Z)
- Learning Invariant Representations and Risks for Semi-supervised Domain Adaptation [109.73983088432364]
We propose the first method that aims to simultaneously learn invariant representations and risks under the setting of semi-supervised domain adaptation (Semi-DA).
We introduce the LIRR algorithm for jointly Learning Invariant Representations and Risks.
arXiv Detail & Related papers (2020-10-09T15:42:35Z)
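Returning to the FT-LP entry above, the following is a minimal, hypothetical sketch of fine-tuning with a pre-trained prior: the task loss on the source domains is augmented with a function-space penalty that keeps the model's predictions close to those of the frozen pre-trained reference at every optimization step. Whether FT-LP uses this exact form is not specified in the summary; prior_guided_loss, beta, and the assumption that the frozen reference already has a head over the target classes are all illustrative.

```python
# Hypothetical illustration of prior-guided fine-tuning (FT-LP-like idea):
# add a divergence penalty toward the frozen pre-trained model's predictions
# so the fine-tuned model keeps referring to its pre-trained counterpart.
import torch
import torch.nn.functional as F


def prior_guided_loss(model, pretrained_model, x, y, beta=0.1):
    """Task loss plus a function-space prior term (beta is hypothetical)."""
    logits = model(x)
    task_loss = F.cross_entropy(logits, y)

    with torch.no_grad():
        prior_logits = pretrained_model(x)  # frozen reference predictions

    # KL divergence between the fine-tuned model and the frozen prior keeps
    # fine-tuning from drifting too far from the pre-trained behavior.
    prior_loss = F.kl_div(
        F.log_softmax(logits, dim=-1),
        F.softmax(prior_logits, dim=-1),
        reduction="batchmean",
    )
    return task_loss + beta * prior_loss
```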