DR-Tune: Improving Fine-tuning of Pretrained Visual Models by
Distribution Regularization with Semantic Calibration
- URL: http://arxiv.org/abs/2308.12058v1
- Date: Wed, 23 Aug 2023 10:59:20 GMT
- Title: DR-Tune: Improving Fine-tuning of Pretrained Visual Models by
Distribution Regularization with Semantic Calibration
- Authors: Nan Zhou, Jiaxin Chen, Di Huang
- Abstract summary: We propose a novel fine-tuning framework, namely distribution regularization with semantic calibration (DR-Tune).
DR-Tune employs distribution regularization by enforcing the downstream task head to decrease its classification error on the pretrained feature distribution.
To alleviate the interference by semantic drift, we develop the semantic calibration (SC) module.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The visual models pretrained on large-scale benchmarks encode general
knowledge and prove effective in building more powerful representations for
downstream tasks. Most existing approaches follow the fine-tuning paradigm,
either by initializing or regularizing the downstream model based on the
pretrained one. The former fails to retain the knowledge in the successive
fine-tuning phase and is thus prone to over-fitting, while the latter imposes
strong constraints on the weights or feature maps of the downstream model
without considering semantic drift, often incurring insufficient optimization.
To deal with these issues, we propose a novel fine-tuning framework, namely
distribution regularization with semantic calibration (DR-Tune). It employs
distribution regularization by enforcing the downstream task head to decrease
its classification error on the pretrained feature distribution, which prevents
it from over-fitting while enabling sufficient training of downstream encoders.
Furthermore, to alleviate the interference by semantic drift, we develop the
semantic calibration (SC) module to align the global shape and class centers of
the pretrained and downstream feature distributions. Extensive experiments on
widely used image classification datasets show that DR-Tune consistently
improves the performance when combined with various backbones under different
pretraining strategies. Code is available at:
https://github.com/weeknan/DR-Tune.
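
The abstract describes two ingredients: a regularizer that makes the task head classify features drawn from the pretrained feature distribution, and a calibration step that aligns class centers between pretrained and downstream features. The following is a minimal NumPy sketch of that idea, not the authors' implementation (see the repository above for that). The function names, the linear head `head_w`, the weight `lam`, and the translation-only calibration are all assumptions made for illustration; the global-shape alignment mentioned in the abstract is omitted.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Mean cross-entropy of integer labels under softmax(logits)."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())

def class_centers(features, labels, num_classes):
    """Per-class mean feature; assumes every class appears at least once."""
    return np.stack([features[labels == c].mean(axis=0) for c in range(num_classes)])

def calibrate(pre_feats, pre_labels, down_feats, down_labels, num_classes):
    """Hypothetical calibration: translate each pretrained feature by the gap
    between the downstream and pretrained centers of its class."""
    offset = (class_centers(down_feats, down_labels, num_classes)
              - class_centers(pre_feats, pre_labels, num_classes))
    return pre_feats + offset[pre_labels]

def dr_tune_style_loss(head_w, down_feats, down_labels,
                       pre_feats, pre_labels, num_classes, lam=1.0):
    """Task loss on downstream features plus a regularizer: the same head's
    classification loss on calibrated pretrained features (lam is assumed)."""
    task = softmax_cross_entropy(down_feats @ head_w, down_labels)
    calibrated = calibrate(pre_feats, pre_labels, down_feats, down_labels, num_classes)
    reg = softmax_cross_entropy(calibrated @ head_w, pre_labels)
    return task + lam * reg

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    labels = np.arange(12) % 3                      # 3 classes, 4 samples each
    pre = rng.normal(size=(12, 5))                  # stand-in pretrained features
    down = rng.normal(size=(12, 5)) + 2.0           # stand-in drifted downstream features
    head = rng.normal(size=(5, 3)) * 0.1            # linear task head
    print(dr_tune_style_loss(head, down, labels, pre, labels, num_classes=3))
```

In this sketch only the head sees the pretrained features, so the regularizer constrains the classifier without pinning the downstream encoder's weights or feature maps, which is the contrast the abstract draws with weight- and feature-level regularization.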
Related papers
- Sparse is Enough in Fine-tuning Pre-trained Large Language Models [98.46493578509039]
We propose a gradient-based sparse fine-tuning algorithm, named Sparse Increment Fine-Tuning (SIFT).
We validate its effectiveness on a range of tasks including the GLUE Benchmark and Instruction-tuning.
arXiv Detail & Related papers (2023-12-19T06:06:30Z)
- FD-Align: Feature Discrimination Alignment for Fine-tuning Pre-Trained Models in Few-Shot Learning [21.693779973263172]
In this paper, we introduce a fine-tuning approach termed Feature Discrimination Alignment (FD-Align).
Our method aims to bolster the model's generalizability by preserving the consistency of spurious features.
Once fine-tuned, the model can seamlessly integrate with existing methods, leading to performance improvements.
arXiv Detail & Related papers (2023-10-23T17:12:01Z)
- Generalized Logit Adjustment: Calibrating Fine-tuned Models by Removing Label Bias in Foundation Models [75.9543301303586]
Foundation models like CLIP allow zero-shot transfer on various tasks without additional training data.
Fine-tuning and ensembling are also commonly adopted to better fit the downstream tasks.
However, we argue that prior work has overlooked the inherent biases in foundation models.
arXiv Detail & Related papers (2023-10-12T08:01:11Z)
- Distributionally Robust Post-hoc Classifiers under Prior Shifts [31.237674771958165]
We investigate the problem of training models that are robust to shifts caused by changes in the distribution of class-priors or group-priors.
We present an extremely lightweight post-hoc approach that performs scaling adjustments to predictions from a pre-trained model.
arXiv Detail & Related papers (2023-09-16T00:54:57Z)
- RanPAC: Random Projections and Pre-trained Models for Continual Learning [59.07316955610658]
Continual learning (CL) aims to learn different tasks (such as classification) in a non-stationary data stream without forgetting old ones.
We propose a concise and effective approach for CL with pre-trained models.
arXiv Detail & Related papers (2023-07-05T12:49:02Z)
- Improved Visual Fine-tuning with Natural Language Supervision [36.250244364023665]
Fine-tuning a visual pre-trained model can leverage the semantic information from large-scale pre-training data.
The problem of catastrophic forgetting in pre-trained backbone has been extensively studied for fine-tuning.
We introduce a reference distribution obtained from a fixed text classifier, which can help regularize the learned vision classifier.
arXiv Detail & Related papers (2023-04-04T03:08:02Z)
- Debiased Fine-Tuning for Vision-language Models by Prompt Regularization [50.41984119504716]
We present a new paradigm for fine-tuning large-scale vision pre-trained models on downstream tasks, dubbed Prompt Regularization (ProReg).
ProReg uses the prediction by prompting the pretrained model to regularize the fine-tuning.
We show the consistently strong performance of ProReg compared with conventional fine-tuning, zero-shot prompt, prompt tuning, and other state-of-the-art methods.
arXiv Detail & Related papers (2023-01-29T11:53:55Z)
- Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with DRO using a broader class of parametric likelihood ratios.
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
arXiv Detail & Related papers (2022-04-13T12:43:12Z)
- Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)