Consolidator: Mergeable Adapter with Grouped Connections for Visual
Adaptation
- URL: http://arxiv.org/abs/2305.00603v1
- Date: Sun, 30 Apr 2023 23:59:02 GMT
- Title: Consolidator: Mergeable Adapter with Grouped Connections for Visual
Adaptation
- Authors: Tianxiang Hao, Hui Chen, Yuchen Guo and Guiguang Ding
- Abstract summary: We show how to efficiently and effectively transfer knowledge in a vision transformer.
We propose consolidator to modify the pre-trained model with the addition of a small set of tunable parameters.
Our consolidator can reach up to 7.56 points higher accuracy than full fine-tuning while tuning merely 0.35% of the parameters.
- Score: 53.835365470800916
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, transformers have shown strong ability as visual feature
extractors, surpassing traditional convolution-based models in various
scenarios. However, the success of vision transformers is largely due to their
capacity to accommodate numerous parameters. As a result, new challenges for
adapting large models to downstream tasks arise. On the one hand, classic
fine-tuning tunes all parameters in a huge model for every task and thus easily
falls into overfitting, leading to inferior performance. On the other hand, on
resource-limited devices, fine-tuning stores a full copy of the parameters and is
thus usually impracticable due to the shortage of storage space. Yet few works
have focused on how to efficiently and effectively transfer knowledge in a
vision transformer. Existing methods do not delve into the properties of visual
features, leading to inferior performance. Moreover, some of them incur heavy
inference costs even though they save storage. To tackle these problems, we propose
consolidator to modify the pre-trained model with the addition of a small set
of tunable parameters to temporarily store the task-specific knowledge while
freezing the backbone model. Motivated by the success of group-wise
convolution, we adopt grouped connections across the features extracted by
fully connected layers to construct tunable parts in a consolidator. To further
enhance the model's capacity to transfer knowledge under a constrained storage
budget and keep inference efficient, we consolidate the parameters in two
stages: 1. between adaptation and storage, and 2. between loading and
inference. On a series of downstream visual tasks, our consolidator can reach
up to 7.56 points higher accuracy than full fine-tuning while tuning merely 0.35% of the parameters,
and outperform state-of-the-art parameter-efficient tuning methods by a clear
margin. Code is available at https://github.com/beyondhtx/Consolidator.
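As an illustration of the mechanism described in the abstract, the sketch below shows one way a mergeable grouped adapter could be implemented in PyTorch: a frozen fully connected layer gains a tunable block-diagonal (grouped) side path whose weights can be folded back into the frozen weight before inference. The class name GroupedAdapterLinear, the groups argument, and the single-step consolidate() merge are illustrative assumptions rather than the authors' implementation; the paper's consolidator merges parameters in two stages (adaptation-to-storage and loading-to-inference) and may structure its grouped connections differently.

```python
# A minimal, hypothetical sketch of a mergeable grouped adapter (PyTorch).
# Assumptions: the tunable part is a block-diagonal (grouped) linear side path
# added in parallel to a frozen fully connected layer, and the paper's two-stage
# consolidation is collapsed into a single additive merge for brevity.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GroupedAdapterLinear(nn.Module):
    """Frozen nn.Linear plus a tunable grouped (block-diagonal) side path."""

    def __init__(self, base: nn.Linear, groups: int = 16):
        super().__init__()
        assert base.in_features % groups == 0 and base.out_features % groups == 0
        self.base = base
        for p in self.base.parameters():      # freeze the backbone layer
            p.requires_grad_(False)
        # One small weight block per group: groups * (out/g) * (in/g) parameters,
        # i.e. 1/groups of a full (out x in) matrix.
        self.blocks = nn.Parameter(
            torch.zeros(groups, base.out_features // groups, base.in_features // groups)
        )
        self.merged = False

    def _block_diag(self) -> torch.Tensor:
        # Assemble the per-group blocks into a block-diagonal matrix with the
        # same shape as the frozen weight, so the two can be summed.
        return torch.block_diag(*self.blocks)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.merged:
            return self.base(x)               # adapter already consolidated
        return self.base(x) + F.linear(x, self._block_diag())

    @torch.no_grad()
    def consolidate(self) -> None:
        """Fold the grouped side path into the frozen weight before inference."""
        if not self.merged:
            self.base.weight += self._block_diag()
            self.merged = True


if __name__ == "__main__":
    layer = GroupedAdapterLinear(nn.Linear(768, 768), groups=16)
    x = torch.randn(4, 768)
    y_train = layer(x)        # training-time path: frozen base + grouped adapter
    layer.consolidate()       # merge before deployment
    y_infer = layer(x)        # same mapping, no extra inference cost
    assert torch.allclose(y_train, y_infer, atol=1e-5)
    tunable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
    print(f"tunable parameters: {tunable}")   # 36,864 vs. 589,824 for a full 768x768 matrix
```

Because both paths are linear in the input, the merged layer computes exactly the same mapping as the unmerged one, which is why the tunable parameters add cost only during adaptation and storage, not at inference time.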
Related papers
- Parameter-Efficient and Memory-Efficient Tuning for Vision Transformer: A Disentangled Approach [87.8330887605381]
We show how to adapt a pre-trained Vision Transformer to downstream recognition tasks with only a few learnable parameters.
We synthesize a task-specific query with a learnable and lightweight module, which is independent of the pre-trained model.
Our method achieves state-of-the-art performance under memory constraints, showcasing its applicability in real-world situations.
arXiv Detail & Related papers (2024-07-09T15:45:04Z) - Adapter-X: A Novel General Parameter-Efficient Fine-Tuning Framework for Vision [52.80792724919329]
We introduce a novel framework named Adapter-X to improve fine-tuning in 2D image and 3D point cloud modalities.
It is the first to outperform full fine-tuning in both 2D image and 3D point cloud modalities with significantly fewer parameters, i.e., only 0.20% and 1.88% of the original trainable parameters for 2D and 3D classification tasks, respectively.
arXiv Detail & Related papers (2024-06-05T08:26:44Z) - Dynamic Adapter Meets Prompt Tuning: Parameter-Efficient Transfer Learning for Point Cloud Analysis [51.14136878142034]
Point cloud analysis has achieved outstanding performance by transferring point cloud pre-trained models.
Existing methods for model adaptation usually update all model parameters, which is inefficient because it incurs high computational costs.
In this paper, we aim to study parameter-efficient transfer learning for point cloud analysis with an ideal trade-off between task performance and parameter efficiency.
arXiv Detail & Related papers (2024-03-03T08:25:04Z) - Prompt Guided Transformer for Multi-Task Dense Prediction [14.815576352301322]
We introduce a lightweight task-conditional model called Prompt Guided Transformer to optimize performance and model parameters.
Our approach achieves state-of-the-art results among task-conditional methods while using fewer parameters, striking a favorable balance between performance and parameter count.
arXiv Detail & Related papers (2023-07-28T07:25:57Z) - Rethinking Efficient Tuning Methods from a Unified Perspective [34.67645496324432]
We revisit the design paradigm of PETL and derive a unified framework U-Tuning for parameter-efficient transfer learning.
The U-Tuning framework can simultaneously encompass existing methods and derive new approaches for parameter-efficient transfer learning.
arXiv Detail & Related papers (2023-03-01T17:38:03Z) - Polyhistor: Parameter-Efficient Multi-Task Adaptation for Dense Vision
Tasks [36.34331439747556]
We propose Polyhistor and Polyhistor-Lite to share information across different tasks with a few trainable parameters.
Specifically, Polyhistor achieves accuracy competitive with the state-of-the-art while using only 10% of their trainable parameters.
arXiv Detail & Related papers (2022-10-07T00:25:02Z) - AdaMix: Mixture-of-Adapter for Parameter-efficient Tuning of Large
Language Models [119.7093605087114]
Fine-tuning large-scale pre-trained language models for downstream tasks requires updating hundreds of millions of parameters.
This not only increases the serving cost of storing a separate copy of the model weights for every task, but also causes instability during few-shot task adaptation.
We introduce a new mechanism that improves adapter capacity through two key techniques without increasing parameters or computational cost.
arXiv Detail & Related papers (2022-05-24T23:41:22Z) - AdapterBias: Parameter-efficient Token-dependent Representation Shift
for Adapters in NLP Tasks [55.705355299065474]
Transformer-based pre-trained models with millions of parameters require large storage.
Recent approaches tackle this shortcoming by training adapters, but these approaches still require a relatively large number of parameters.
In this study, AdapterBias, a surprisingly simple yet effective adapter architecture, is proposed.
arXiv Detail & Related papers (2022-04-30T16:49:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.