Parameter-efficient Model Adaptation for Vision Transformers
- URL: http://arxiv.org/abs/2203.16329v3
- Date: Thu, 13 Jul 2023 22:12:10 GMT
- Title: Parameter-efficient Model Adaptation for Vision Transformers
- Authors: Xuehai He, Chunyuan Li, Pengchuan Zhang, Jianwei Yang, Xin Eric Wang
- Abstract summary: We study parameter-efficient model adaptation strategies for vision transformers on the image classification task.
We propose a parameter-efficient model adaptation framework, which first selects submodules by measuring local intrinsic dimensions.
Our method performs the best in terms of the tradeoff between accuracy and parameter efficiency across 20 image classification datasets.
- Score: 45.3460867776953
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In computer vision, strong transfer learning performance has been
achieved by adapting large-scale pretrained vision models (e.g., vision transformers) to
downstream tasks. Common approaches for model adaptation either update all
model parameters or leverage linear probes. In this paper, we aim to study
parameter-efficient model adaptation strategies for vision transformers on the
image classification task. We formulate efficient model adaptation as a
subspace training problem and perform a comprehensive benchmarking over
different efficient adaptation methods. We conduct an empirical study on each
efficient model adaptation method focusing on its performance alongside
parameter cost. Furthermore, we propose a parameter-efficient model adaptation
framework, which first selects submodules by measuring local intrinsic
dimensions and then projects them into a subspace for further decomposition via a
novel Kronecker Adaptation (KAdaptation) method. We analyze and compare our
method with a diverse set of baseline model adaptation methods (including
state-of-the-art methods for pretrained language models). Our method performs
the best in terms of the tradeoff between accuracy and parameter efficiency
across 20 image classification datasets under the few-shot setting and 7 image
classification datasets under the full-shot setting.
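The core of the proposed framework is the Kronecker Adaptation (KAdaptation) step: the weight update of a selected submodule is expressed as a sum of Kronecker products between small matrices and low-rank factors, so only a tiny fraction of parameters is trained while the pretrained weights stay frozen. The sketch below is a minimal, hedged illustration of such an update for a single linear layer; the factor shapes, initialization, and placement inside the vision transformer are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a Kronecker-product low-rank update in the spirit of
# KAdaptation. Shapes, rank, and initialization are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class KroneckerAdapter(nn.Module):
    """Wraps a frozen nn.Linear and adds a trainable update
    dW = sum_i A_i kron (u_i v_i), with A_i small and u_i v_i low-rank."""

    def __init__(self, base: nn.Linear, n: int = 4, rank: int = 1):
        super().__init__()
        self.base = base
        for p in self.base.parameters():   # the pretrained weights stay frozen
            p.requires_grad_(False)
        d_out, d_in = base.weight.shape
        assert d_out % n == 0 and d_in % n == 0, "n must divide both weight dims"
        self.a = nn.Parameter(torch.randn(n, n, n) / n)                # n small factors A_i
        self.u = nn.Parameter(torch.randn(n, d_out // n, rank) * 0.01)
        self.v = nn.Parameter(torch.zeros(n, rank, d_in // n))         # zero init => dW = 0 at step 0

    def delta_weight(self) -> torch.Tensor:
        # Each term is a Kronecker product of a small matrix with a low-rank
        # matrix, so the number of trainable parameters is very small.
        terms = [torch.kron(self.a[i], self.u[i] @ self.v[i]) for i in range(self.a.shape[0])]
        return torch.stack(terms).sum(dim=0)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + F.linear(x, self.delta_weight())

# Usage: wrap a projection of a ViT block (768-dim here, purely illustrative).
layer = nn.Linear(768, 768)
adapted = KroneckerAdapter(layer, n=4, rank=1)
out = adapted(torch.randn(2, 197, 768))   # (batch, tokens, dim)
```

In the proposed framework, such updates are applied only to the submodules selected by the local-intrinsic-dimension measurement.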
Related papers
- Efficient Source-Free Time-Series Adaptation via Parameter Subspace Disentanglement [0.7558576228782637]
We propose a framework for efficient Source-Free Domain Adaptation (SFDA).
Our approach introduces an improved paradigm for source-model preparation and target-side adaptation.
We demonstrate that our framework is compatible with various SFDA methods and achieves significant computational efficiency.
arXiv Detail & Related papers (2024-10-03T02:12:03Z) - SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation [52.6922833948127]
In this work, we investigate the importance of parameters in pre-trained diffusion models and find that a subset of them is ineffective.
We propose a novel model fine-tuning method that makes full use of these ineffective parameters.
Our method enhances the generative capabilities of pre-trained models in downstream applications.
arXiv Detail & Related papers (2024-09-10T16:44:47Z) - Parameter-Efficient Fine-Tuning With Adapters [5.948206235442328]
This research introduces a novel adaptation method utilizing the UniPELT framework as a base.
Our method employs adapters, which enable efficient transfer of pretrained models to new tasks with minimal retraining of the base model parameters (a generic bottleneck-adapter sketch appears after this list).
arXiv Detail & Related papers (2024-05-09T01:40:38Z) - Prompt-Guided Adaptive Model Transformation for Whole Slide Image Classification [27.21493446754789]
Multiple instance learning (MIL) has emerged as a popular method for classifying histopathology whole slide images (WSIs).
We propose Prompt-guided Adaptive Model Transformation framework that seamlessly adapts pre-trained models to the specific characteristics of histopathology data.
We rigorously evaluate our approach on two datasets, Camelyon16 and TCGA-NSCLC, showcasing substantial improvements across various MIL models.
arXiv Detail & Related papers (2024-03-19T08:23:12Z) - Efficient Adapter Tuning of Pre-trained Speech Models for Automatic
Speaker Verification [38.20393847192532]
Self-supervised speech models have shown impressive performance on various downstream speech tasks.
However, full fine-tuning becomes practically infeasible due to heavy computation and storage overhead.
We propose an effective adapter framework designed for adapting self-supervised speech models to the speaker verification task.
arXiv Detail & Related papers (2024-03-01T05:32:14Z) - Class Incremental Learning with Pre-trained Vision-Language Models [59.15538370859431]
We propose an approach to exploiting pre-trained vision-language models (e.g. CLIP) that enables further adaptation.
Experiments on several conventional benchmarks consistently show a significant margin of improvement over the current state-of-the-art.
arXiv Detail & Related papers (2023-10-31T10:45:03Z) - Efficient Adaptation of Large Vision Transformer via Adapter
Re-Composing [8.88477151877883]
High-capacity pre-trained models have revolutionized problem-solving in computer vision.
We propose a novel Adapter Re-Composing (ARC) strategy that addresses efficient pre-trained model adaptation.
Our approach considers the reusability of adaptation parameters and introduces a parameter-sharing scheme.
arXiv Detail & Related papers (2023-10-10T01:04:15Z) - E^2VPT: An Effective and Efficient Approach for Visual Prompt Tuning [55.50908600818483]
Fine-tuning large-scale pretrained vision models for new tasks has become increasingly parameter-intensive.
We propose an Effective and Efficient Visual Prompt Tuning (E2VPT) approach for large-scale transformer-based model adaptation.
Our approach outperforms several state-of-the-art baselines on two benchmarks.
arXiv Detail & Related papers (2023-07-25T19:03:21Z) - Scaling Pre-trained Language Models to Deeper via Parameter-efficient
Architecture [68.13678918660872]
We design a more capable parameter-sharing architecture based on the matrix product operator (MPO).
MPO decomposition can reorganize and factorize the information of a parameter matrix into two parts.
Our architecture shares the central tensor across all layers for reducing the model size.
arXiv Detail & Related papers (2023-03-27T02:34:09Z) - Towards a Unified View of Parameter-Efficient Transfer Learning [108.94786930869473]
Fine-tuning large pre-trained language models on downstream tasks has become the de-facto learning paradigm in NLP.
Recent work has proposed a variety of parameter-efficient transfer learning methods that only fine-tune a small number of (extra) parameters to attain strong performance.
We break down the design of state-of-the-art parameter-efficient transfer learning methods and present a unified framework that establishes connections between them.
arXiv Detail & Related papers (2021-10-08T20:22:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.