Attention Prompt Tuning: Parameter-efficient Adaptation of Pre-trained
Models for Spatiotemporal Modeling
- URL: http://arxiv.org/abs/2403.06978v1
- Date: Mon, 11 Mar 2024 17:59:41 GMT
- Authors: Wele Gedara Chaminda Bandara and Vishal M. Patel
- Abstract summary: We introduce Attention Prompt Tuning (APT) for video-based applications such as action recognition.
APT involves injecting a set of learnable prompts along with data tokens during fine-tuning while keeping the backbone frozen.
The proposed approach greatly reduces the number of FLOPs and latency while achieving a significant performance boost.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we introduce Attention Prompt Tuning (APT) - a computationally
efficient variant of prompt tuning for video-based applications such as action
recognition. Prompt tuning approaches involve injecting a set of learnable
prompts along with data tokens during fine-tuning while keeping the backbone
frozen. This approach greatly reduces the number of learnable parameters
compared to full tuning. For image-based downstream tasks, normally a couple of
learnable prompts achieve results close to those of full tuning. However,
videos, which contain more complex spatiotemporal information, require hundreds
of tunable prompts to achieve reasonably good results. This reduces the
parameter efficiency observed in images and significantly increases latency and
the number of floating-point operations (FLOPs) during inference. To tackle
these issues, we directly inject the prompts into the keys and values of the
non-local attention mechanism within the transformer block. Additionally, we
introduce a novel prompt reparameterization technique to make APT more robust
against hyperparameter selection. The proposed APT approach greatly reduces the
number of FLOPs and latency while achieving a significant performance boost
over the existing parameter-efficient tuning methods on UCF101, HMDB51, and
SSv2 datasets for action recognition. The code and pre-trained models are
available at https://github.com/wgcban/apt
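The key/value prompt injection the abstract describes can be illustrated with a minimal single-head sketch. This is a hedged, prefix-style approximation assuming standard scaled dot-product attention; all names and shapes are illustrative, not the authors' implementation. The point it shows: because prompts are concatenated only to the keys and values, the number of query tokens (and hence the output length) is unchanged, which is why FLOPs grow far more slowly than when prompts are prepended to the input sequence.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_prompt_tuning(x, Wq, Wk, Wv, pk, pv):
    """Single-head attention with learnable prompts injected into keys
    and values only (prefix style). The frozen projections Wq, Wk, Wv
    stand in for the pre-trained backbone; only pk and pv are tuned."""
    q = x @ Wq                                  # (n, d): queries from data only
    k = np.concatenate([pk, x @ Wk], axis=0)    # (m + n, d): prompts + data keys
    v = np.concatenate([pv, x @ Wv], axis=0)    # (m + n, d): prompts + data values
    scores = q @ k.T / np.sqrt(q.shape[-1])     # (n, m + n)
    return softmax(scores) @ v                  # (n, d): output length unchanged

rng = np.random.default_rng(0)
n, m, d = 8, 4, 16                              # data tokens, prompts, embed dim
x = rng.standard_normal((n, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))
pk, pv = rng.standard_normal((m, d)), rng.standard_normal((m, d))
out = attention_prompt_tuning(x, Wq, Wk, Wv, pk, pv)
print(out.shape)  # (8, 16)
```

The prompts reweight attention over the data tokens without adding tokens to the residual stream, so downstream layers see the same sequence length as the frozen backbone expects.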
Related papers
- Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation [67.13876021157887]
Dynamic Tuning (DyT) is a novel approach to improve both parameter and inference efficiency for ViT adaptation.
DyT achieves superior performance compared to existing PEFT methods while evoking only 71% of their FLOPs on the VTAB-1K benchmark.
arXiv Detail & Related papers (2024-03-18T14:05:52Z)
- E^2VPT: An Effective and Efficient Approach for Visual Prompt Tuning [55.50908600818483]
Fine-tuning large-scale pretrained vision models for new tasks has become increasingly parameter-intensive.
We propose an Effective and Efficient Visual Prompt Tuning (E2VPT) approach for large-scale transformer-based model adaptation.
Our approach outperforms several state-of-the-art baselines on two benchmarks.
arXiv Detail & Related papers (2023-07-25T19:03:21Z) - Approximated Prompt Tuning for Vision-Language Pre-trained Models [54.326232586461614]
In vision-language pre-trained models, prompt tuning often requires a large number of learnable tokens to bridge the gap between the pre-training and downstream tasks.
We propose a novel Approximated Prompt Tuning (APT) approach towards efficient VL transfer learning.
arXiv Detail & Related papers (2023-06-27T05:43:47Z) - Do We Really Need a Large Number of Visual Prompts? [23.85637456240694]
We analyze the impact of the number of prompts on fine-tuning performance and self-attention operation in a vision transformer architecture.
We propose a Prompt Condensation (PC) technique that aims to prevent performance degradation from using a small number of prompts.
arXiv Detail & Related papers (2023-05-26T19:31:57Z) - Residual Prompt Tuning: Improving Prompt Tuning with Residual
Reparameterization [57.379285443780894]
Residual Prompt Tuning is a simple and efficient method that significantly improves the performance and stability of prompt tuning.
We show that our method achieves a +7 point improvement over prompt tuning with T5-Base and reduces the prompt length by 10x without hurting performance.
arXiv Detail & Related papers (2023-05-06T05:35:14Z)
- Sensitivity-Aware Visual Parameter-Efficient Fine-Tuning [91.5113227694443]
We propose a novel Sensitivity-aware visual Parameter-efficient fine-Tuning (SPT) scheme.
SPT allocates trainable parameters to task-specific important positions.
Experiments on a wide range of downstream recognition tasks show that our SPT is complementary to the existing PEFT methods.
arXiv Detail & Related papers (2023-03-15T12:34:24Z)
- Late Prompt Tuning: A Late Prompt Could Be Better Than Many Prompts [97.20933523766182]
Prompt tuning is a parameter-efficient tuning (PETuning) method for utilizing pre-trained models (PTMs).
We present Late Prompt Tuning (LPT), which inserts a late prompt into an intermediate layer of the PTM instead of the input layer or all layers.
We show that LPT can achieve competitive performance to full model tuning and other PETuning methods under both full-data and few-shot scenarios.
arXiv Detail & Related papers (2022-10-20T14:23:52Z)
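The prompt reparameterization idea that appears both in the APT abstract and in Residual Prompt Tuning above can be sketched as follows. This is an illustrative, hedged approximation of residual-style reparameterization (not either paper's exact formulation): each prompt embedding is passed through a small bottleneck MLP whose output is added back via a skip connection, which tends to stabilize training with respect to hyperparameter choices.

```python
import numpy as np

def residual_reparam(p, W1, b1, W2, b2):
    """Residual reparameterization of prompt embeddings: a bottleneck
    MLP plus a skip connection. After tuning, the MLP can be discarded
    by storing the final reparameterized prompts directly."""
    h = np.maximum(p @ W1 + b1, 0.0)   # ReLU bottleneck, (m, r)
    return p + h @ W2 + b2             # skip connection, (m, d)

rng = np.random.default_rng(1)
m, d, r = 4, 16, 8                     # prompts, embed dim, bottleneck width
p = rng.standard_normal((m, d))
W1, b1 = rng.standard_normal((d, r)) * 0.1, np.zeros(r)
W2, b2 = rng.standard_normal((r, d)) * 0.1, np.zeros(d)
out = residual_reparam(p, W1, b1, W2, b2)
print(out.shape)  # (4, 16)
```

Because the output has the same shape as the raw prompts, the reparameterization adds no inference cost once the tuned prompts are frozen and stored.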
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.