Task Residual for Tuning Vision-Language Models
- URL: http://arxiv.org/abs/2211.10277v2
- Date: Fri, 24 Mar 2023 17:22:32 GMT
- Title: Task Residual for Tuning Vision-Language Models
- Authors: Tao Yu, Zhihe Lu, Xin Jin, Zhibo Chen, Xinchao Wang
- Abstract summary: We propose a new efficient tuning approach for vision-language models (VLMs) named Task Residual Tuning (TaskRes).
TaskRes explicitly decouples the prior knowledge of the pre-trained models and new knowledge regarding a target task.
The proposed TaskRes is simple yet effective, which significantly outperforms previous methods on 11 benchmark datasets.
- Score: 69.22958802711017
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large-scale vision-language models (VLMs) pre-trained on billion-level data
have learned general visual representations and broad visual concepts. In
principle, the well-learned knowledge structure of the VLMs should be inherited
appropriately when being transferred to downstream tasks with limited data.
However, most existing efficient transfer learning (ETL) approaches for VLMs
either damage or are excessively biased towards the prior knowledge, e.g.,
prompt tuning (PT) discards the pre-trained text-based classifier and builds a
new one while adapter-style tuning (AT) fully relies on the pre-trained
features. To address this, we propose a new efficient tuning approach for VLMs
named Task Residual Tuning (TaskRes), which performs directly on the text-based
classifier and explicitly decouples the prior knowledge of the pre-trained
models and new knowledge regarding a target task. Specifically, TaskRes keeps
the original classifier weights from the VLMs frozen and obtains a new
classifier for the target task by tuning a set of prior-independent parameters
as a residual to the original one, which enables reliable prior knowledge
preservation and flexible task-specific knowledge exploration. The proposed
TaskRes is simple yet effective, which significantly outperforms previous ETL
methods (e.g., PT and AT) on 11 benchmark datasets while requiring minimal
effort for the implementation. Our code is available at
https://github.com/geekyutao/TaskRes.
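The mechanism described in the abstract lends itself to a compact implementation. Below is a minimal, hypothetical PyTorch sketch of the residual-on-classifier idea: the pre-trained text-based classifier weights (e.g., CLIP text embeddings of class prompts) stay frozen, and a zero-initialized, prior-independent residual is the only tuned quantity. The `alpha` scaling factor and the cosine-similarity logits are illustrative assumptions, not the paper's exact configuration; see the official repository above for the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TaskResidualHead(nn.Module):
    """Sketch of a TaskRes-style classifier head (illustrative, not the official code)."""

    def __init__(self, base_text_weights: torch.Tensor, alpha: float = 0.5):
        super().__init__()
        # Prior knowledge: the pre-trained text-based classifier, kept frozen.
        self.register_buffer("base", base_text_weights)           # (num_classes, dim)
        # Task-specific knowledge: a prior-independent residual, tuned from zero.
        self.residual = nn.Parameter(torch.zeros_like(base_text_weights))
        self.alpha = alpha                                         # residual scaling (assumed)

    def forward(self, image_features: torch.Tensor) -> torch.Tensor:
        # New classifier = frozen prior + scaled task residual.
        weights = F.normalize(self.base + self.alpha * self.residual, dim=-1)
        image_features = F.normalize(image_features, dim=-1)
        return image_features @ weights.t()                        # cosine-similarity logits


# Usage sketch: only the residual receives gradients; the VLM encoders and the
# original classifier weights remain untouched.
# head = TaskResidualHead(clip_text_weights)          # clip_text_weights: hypothetical tensor
# optimizer = torch.optim.SGD([head.residual], lr=2e-3)
```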
Related papers
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning [70.21358720599821]
Large language models (LLMs) hold the promise of solving diverse tasks when provided with appropriate natural language prompts.
We propose SELF-GUIDE, a multi-stage mechanism in which we synthesize task-specific input-output pairs from the student LLM.
We report an absolute improvement of approximately 15% for classification tasks and 18% for generation tasks in the benchmark's metrics.
arXiv Detail & Related papers (2024-07-16T04:41:58Z)
- Fully Fine-tuned CLIP Models are Efficient Few-Shot Learners [8.707819647492467]
We explore capturing task-specific information via meticulous refinement of entire Vision-Language Models (VLMs).
To mitigate these issues, we propose a framework named CLIP-CITE by designing a discriminative visual-text task.
arXiv Detail & Related papers (2024-07-04T15:22:54Z)
- Conditional Prototype Rectification Prompt Learning [32.533844163120875]
We propose a Conditional Prototype Rectification Prompt Learning (CPR) method to correct the bias of base examples and augment limited data in an effective way.
CPR achieves state-of-the-art performance on both few-shot classification and base-to-new generalization tasks.
arXiv Detail & Related papers (2024-04-15T15:43:52Z)
- Exploring Efficient Few-shot Adaptation for Vision Transformers [70.91692521825405]
We propose a novel efficient Transformer Tuning (eTT) method that facilitates finetuning ViTs for few-shot learning tasks.
Key novelties come from the newly presented Attentive Prefix Tuning (APT) and Domain Residual Adapter (DRA).
We conduct extensive experiments to show the efficacy of our model.
arXiv Detail & Related papers (2023-01-06T08:42:05Z)
- ConStruct-VL: Data-Free Continual Structured VL Concepts Learning [57.86651057895222]
We introduce the first Continual Data-Free Structured VL Concepts Learning (ConStruct-VL) benchmark.
We propose a data-free method comprised of a new approach, Adversarial Pseudo-Replay (APR), which generates adversarial reminders of past tasks from past task models.
We show this approach outperforms all data-free methods by as much as 7% while even matching some levels of experience-replay.
arXiv Detail & Related papers (2022-11-17T18:57:03Z)
- Instance-wise Prompt Tuning for Pretrained Language Models [72.74916121511662]
Instance-wise Prompt Tuning (IPT) is the first prompt learning paradigm that injects knowledge from the input data instances into the prompts.
IPT significantly outperforms task-based prompt learning methods, and achieves comparable performance to conventional finetuning with only 0.5% - 1.5% of tuned parameters.
arXiv Detail & Related papers (2022-06-04T10:08:50Z)
- Pre-Train Your Loss: Easy Bayesian Transfer Learning with Informative Priors [59.93972277761501]
We show that we can learn highly informative posteriors from the source task, through supervised or self-supervised approaches.
This simple modular approach enables significant performance gains and more data-efficient learning on a variety of downstream classification and segmentation tasks.
arXiv Detail & Related papers (2022-05-20T16:19:30Z)
- Continual Class Incremental Learning for CT Thoracic Segmentation [36.45569352490318]
Deep learning organ segmentation approaches require large amounts of annotated training data, which is limited in supply due to reasons of confidentiality and the time required for expert manual annotation.
Being able to train models incrementally without having access to previously used data is desirable.
In this setting, a model learns a new task effectively, but loses performance on previously learned tasks.
The Learning without Forgetting (LwF) approach addresses this issue by replaying the model's own predictions for past tasks during training (a minimal sketch of this replay idea appears below the list).
We show that LwF can successfully retain knowledge on previous segmentations; however, its ability to learn a new class decreases with the ...
arXiv Detail & Related papers (2020-08-12T20:08:39Z)
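For context on the LwF mechanism mentioned in the last entry, here is a minimal, hypothetical sketch (assumed for illustration, not the cited paper's code): the new-task head is trained with the usual supervised loss, while the old-task head is pulled toward the frozen old model's soft predictions via a temperature-scaled distillation term. Names, the temperature, and the weighting factor are assumptions.

```python
import torch.nn.functional as F

def lwf_loss(new_task_logits, labels,
             old_task_logits_new_model, old_task_logits_old_model,
             temperature: float = 2.0, lambda_old: float = 1.0):
    """Illustrative Learning-without-Forgetting objective (hedged sketch).

    new_task_logits:            current model's logits for the new task
    labels:                     ground-truth labels for the new task
    old_task_logits_new_model:  current model's logits for the old task(s)
    old_task_logits_old_model:  frozen old model's logits ("replayed" predictions)
    """
    # Standard supervised loss on the new task.
    ce = F.cross_entropy(new_task_logits, labels)
    # Distill the frozen old model's soft predictions into the current model.
    log_p_new = F.log_softmax(old_task_logits_new_model / temperature, dim=1)
    p_old = F.softmax(old_task_logits_old_model / temperature, dim=1)
    distill = F.kl_div(log_p_new, p_old, reduction="batchmean") * temperature ** 2
    return ce + lambda_old * distill
```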