Concept-wise Fine-tuning Matters in Preventing Negative Transfer
- URL: http://arxiv.org/abs/2311.06868v1
- Date: Sun, 12 Nov 2023 14:58:11 GMT
- Title: Concept-wise Fine-tuning Matters in Preventing Negative Transfer
- Authors: Yunqiao Yang, Long-Kai Huang, Ying Wei
- Abstract summary: Off-the-shelf fine-tuning techniques are far from adequate to mitigate negative transfer caused by two types of underperforming features in a pre-trained model.
We propose a Concept-wise fine-tuning (Concept-Tuning) approach that refines feature representations at the level of patches, with each patch encoding a concept.
- Score: 17.060892283250215
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A multitude of prevalent pre-trained models mark a major milestone in the
development of artificial intelligence, while fine-tuning has been a common
practice that enables pre-trained models to figure prominently in a wide array
of target datasets. Our empirical results reveal that off-the-shelf fine-tuning
techniques are far from adequate to mitigate negative transfer caused by two
types of underperforming features in a pre-trained model, including rare
features and spuriously correlated features. Rooted in structural causal models
of predictions after fine-tuning, we propose a Concept-wise fine-tuning
(Concept-Tuning) approach which refines feature representations at the level of
patches, with each patch encoding a concept. Concept-Tuning minimizes the
negative impacts of rare features and spuriously correlated features by (1)
maximizing the mutual information between examples in the same category with
regard to a slice of rare features (a patch) and (2) applying front-door
adjustment via attention neural networks in channels and feature slices
(patches). The proposed Concept-Tuning consistently and significantly (by up to
4.76%) improves prior state-of-the-art fine-tuning methods on eleven datasets,
diverse pre-training strategies (supervised and self-supervised ones), various
network architectures, and sample sizes in a target dataset.
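To make the two mechanisms in the abstract concrete, here is a minimal PyTorch sketch, assuming a CNN backbone whose feature map is split into patch slices. The InfoNCE-style loss stands in for the mutual-information objective over same-category examples on one patch, and the attention module loosely stands in for the front-door adjustment over channels and patches; all class names, shapes, and hyperparameters are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ChannelPatchAttention(nn.Module):
    """Re-weights channels and spatial patches of a backbone feature map.

    Loose stand-in for front-door adjustment via attention over channels
    and feature slices (patches); not the authors' module.
    """

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.channel_gate = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )
        self.patch_gate = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) feature map from a pre-trained backbone.
        b, c, _, _ = x.shape
        channel_w = self.channel_gate(x.mean(dim=(2, 3))).view(b, c, 1, 1)
        patch_w = torch.sigmoid(self.patch_gate(x))  # (B, 1, H, W)
        return x * channel_w * patch_w


def patchwise_info_nce(patch_emb: torch.Tensor, labels: torch.Tensor,
                       temperature: float = 0.1) -> torch.Tensor:
    """InfoNCE-style surrogate for maximizing mutual information between
    same-category examples with respect to one patch (feature slice).

    patch_emb: (B, D) embeddings of one patch position across the batch.
    labels:    (B,)   category labels; same-label pairs act as positives.
    """
    z = F.normalize(patch_emb, dim=1)
    sim = z @ z.t() / temperature                        # (B, B) similarities
    self_mask = torch.eye(len(z), dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, -1e9)               # exclude self-pairs
    pos = ((labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask).float()
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    has_pos = pos.sum(dim=1) > 0
    if not has_pos.any():
        return patch_emb.new_zeros(())
    per_anchor = -(log_prob * pos).sum(dim=1)[has_pos] / pos.sum(dim=1)[has_pos]
    return per_anchor.mean()
```

In a full fine-tuning loop, one such mutual-information term per patch slice would be added to the task cross-entropy; the precise losses, attention placement, and causal-adjustment details are those specified in the paper.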
Related papers
- HG-Adapter: Improving Pre-Trained Heterogeneous Graph Neural Networks with Dual Adapters [53.97380482341493]
"pre-train, prompt-tuning" has demonstrated impressive performance for tuning pre-trained heterogeneous graph neural networks (HGNNs)
We propose a unified framework that combines two new adapters with potential labeled data extension to improve the generalization of pre-trained HGNN models.
arXiv Detail & Related papers (2024-11-02T06:43:54Z) - Revisiting the Robust Generalization of Adversarial Prompt Tuning [4.033827046965844]
- Revisiting the Robust Generalization of Adversarial Prompt Tuning [4.033827046965844]
We propose an adaptive Consistency-guided Adversarial Prompt Tuning (i.e., CAPT) framework to enhance the alignment of image and text features for adversarial examples.
We conduct experiments across 14 datasets and 4 data sparsity schemes to show the superiority of CAPT over other state-of-the-art adaption methods.
arXiv Detail & Related papers (2024-05-18T02:54:41Z) - What Matters When Repurposing Diffusion Models for General Dense Perception Tasks? [49.84679952948808]
Recent works show promising results by simply fine-tuning T2I diffusion models for dense perception tasks.
We conduct a thorough investigation into critical factors that affect transfer efficiency and performance when using diffusion priors.
Our work culminates in the development of GenPercept, an effective deterministic one-step fine-tuning paradigm tailored for dense visual perception tasks.
arXiv Detail & Related papers (2024-03-10T04:23:24Z) - TWINS: A Fine-Tuning Framework for Improved Transferability of
Adversarial Robustness and Generalization [89.54947228958494]
This paper focuses on the fine-tuning of an adversarially pre-trained model in various classification tasks.
We propose a novel statistics-based approach, the Two-WIng NormliSation (TWINS) fine-tuning framework.
TWINS is shown to be effective on a wide range of image classification datasets in terms of both generalization and robustness.
arXiv Detail & Related papers (2023-03-20T14:12:55Z) - Robust Graph Representation Learning via Predictive Coding [46.22695915912123]
Predictive coding is a message-passing framework initially developed to model information processing in the brain.
In this work, we build models that rely on the message-passing rule of predictive coding.
We show that the proposed models are comparable to standard ones in terms of performance in both inductive and transductive tasks.
arXiv Detail & Related papers (2022-12-09T03:58:22Z) - Towards Open-World Feature Extrapolation: An Inductive Graph Learning
Approach [80.8446673089281]
We propose a new learning paradigm with graph representation and learning.
Our framework contains two modules: 1) a backbone network (e.g., feedforward neural nets) as a lower model takes features as input and outputs predicted labels; 2) a graph neural network as an upper model learns to extrapolate embeddings for new features via message passing over a feature-data graph built from observed data.
arXiv Detail & Related papers (2021-10-09T09:02:45Z) - Unleashing the Power of Contrastive Self-Supervised Visual Models via
- Unleashing the Power of Contrastive Self-Supervised Visual Models via Contrast-Regularized Fine-Tuning [94.35586521144117]
We investigate whether applying contrastive learning to fine-tuning would bring further benefits.
We propose Contrast-regularized tuning (Core-tuning), a novel approach for fine-tuning contrastive self-supervised visual models.
arXiv Detail & Related papers (2021-02-12T16:31:24Z) - Towards Trustworthy Predictions from Deep Neural Networks with Fast
Adversarial Calibration [2.8935588665357077]
We propose an efficient yet general modelling approach for obtaining well-calibrated, trustworthy probabilities for samples obtained after a domain shift.
We introduce a new training strategy combining an entropy-encouraging loss term with an adversarial calibration loss term and demonstrate that this results in well-calibrated and technically trustworthy predictions.
arXiv Detail & Related papers (2020-12-20T13:39:29Z) - Bi-tuning of Pre-trained Representations [79.58542780707441]
- Bi-tuning of Pre-trained Representations [79.58542780707441]
Bi-tuning is a general learning framework to fine-tune both supervised and unsupervised pre-trained representations to downstream tasks.
Bi-tuning generalizes the vanilla fine-tuning by integrating two heads upon the backbone of pre-trained representations.
Bi-tuning achieves state-of-the-art results for fine-tuning tasks of both supervised and unsupervised pre-trained models by large margins.
arXiv Detail & Related papers (2020-11-12T03:32:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.