Attention Prompt Tuning: Parameter-efficient Adaptation of Pre-trained
Models for Spatiotemporal Modeling
- URL: http://arxiv.org/abs/2403.06978v1
- Date: Mon, 11 Mar 2024 17:59:41 GMT
- Authors: Wele Gedara Chaminda Bandara and Vishal M. Patel
- Abstract summary: We introduce Attention Prompt Tuning (APT) for video-based applications such as action recognition.
APT involves injecting a set of learnable prompts along with data tokens during fine-tuning while keeping the backbone frozen.
The proposed approach greatly reduces the number of FLOPs and latency while achieving a significant performance boost.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we introduce Attention Prompt Tuning (APT) - a computationally
efficient variant of prompt tuning for video-based applications such as action
recognition. Prompt tuning approaches involve injecting a set of learnable
prompts along with data tokens during fine-tuning while keeping the backbone
frozen. This approach greatly reduces the number of learnable parameters
compared to full tuning. For image-based downstream tasks, normally a couple of
learnable prompts achieve results close to those of full tuning. However,
videos, which contain more complex spatiotemporal information, require hundreds
of tunable prompts to achieve reasonably good results. This reduces the
parameter efficiency observed in images and significantly increases latency and
the number of floating-point operations (FLOPs) during inference. To tackle
these issues, we directly inject the prompts into the keys and values of the
non-local attention mechanism within the transformer block. Additionally, we
introduce a novel prompt reparameterization technique to make APT more robust
against hyperparameter selection. The proposed APT approach greatly reduces the
number of FLOPs and latency while achieving a significant performance boost
over the existing parameter-efficient tuning methods on UCF101, HMDB51, and
SSv2 datasets for action recognition. The code and pre-trained models are
available at https://github.com/wgcban/apt
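The key/value prompt injection the abstract describes can be illustrated with a minimal single-head sketch. This is a hedged, prefix-style approximation assuming standard scaled dot-product attention; all names and shapes are illustrative, not the authors' implementation. The point it shows: because prompts are concatenated only to the keys and values, the number of query tokens (and hence the output length) is unchanged, which is why FLOPs grow far more slowly than when prompts are prepended to the input sequence.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_prompt_tuning(x, Wq, Wk, Wv, pk, pv):
    """Single-head attention with learnable prompts injected into keys
    and values only (prefix style). The frozen projections Wq, Wk, Wv
    stand in for the pre-trained backbone; only pk and pv are tuned."""
    q = x @ Wq                                  # (n, d): queries from data only
    k = np.concatenate([pk, x @ Wk], axis=0)    # (m + n, d): prompts + data keys
    v = np.concatenate([pv, x @ Wv], axis=0)    # (m + n, d): prompts + data values
    scores = q @ k.T / np.sqrt(q.shape[-1])     # (n, m + n)
    return softmax(scores) @ v                  # (n, d): output length unchanged

rng = np.random.default_rng(0)
n, m, d = 8, 4, 16                              # data tokens, prompts, embed dim
x = rng.standard_normal((n, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))
pk, pv = rng.standard_normal((m, d)), rng.standard_normal((m, d))
out = attention_prompt_tuning(x, Wq, Wk, Wv, pk, pv)
print(out.shape)  # (8, 16)
```

The prompts reweight attention over the data tokens without adding tokens to the residual stream, so downstream layers see the same sequence length as the frozen backbone expects.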
Related papers
- Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation [67.13876021157887]
Dynamic Tuning (DyT) is a novel approach to improve both parameter and inference efficiency for ViT adaptation.
DyT achieves superior performance compared to existing PEFT methods while evoking only 71% of their FLOPs on the VTAB-1K benchmark.
arXiv Detail & Related papers (2024-03-18T14:05:52Z)
- E^2VPT: An Effective and Efficient Approach for Visual Prompt Tuning [55.50908600818483]
Fine-tuning large-scale pretrained vision models for new tasks has become increasingly parameter-intensive.
We propose an Effective and Efficient Visual Prompt Tuning (E2VPT) approach for large-scale transformer-based model adaptation.
Our approach outperforms several state-of-the-art baselines on two benchmarks.
arXiv Detail & Related papers (2023-07-25T19:03:21Z) - Approximated Prompt Tuning for Vision-Language Pre-trained Models [54.326232586461614]
In vision-language pre-trained models, prompt tuning often requires a large number of learnable tokens to bridge the gap between the pre-training and downstream tasks.
We propose a novel Approximated Prompt Tuning (APT) approach towards efficient VL transfer learning.
arXiv Detail & Related papers (2023-06-27T05:43:47Z) - Do We Really Need a Large Number of Visual Prompts? [23.85637456240694]
We analyze the impact of the number of prompts on fine-tuning performance and self-attention operation in a vision transformer architecture.
We propose a Prompt Condensation (PC) technique that aims to prevent performance degradation from using a small number of prompts.
arXiv Detail & Related papers (2023-05-26T19:31:57Z) - Residual Prompt Tuning: Improving Prompt Tuning with Residual
Reparameterization [57.379285443780894]
Residual Prompt Tuning is a simple and efficient method that significantly improves the performance and stability of prompt tuning.
We show that our method achieves a +7 point improvement over prompt tuning with T5-Base and reduces the prompt length by 10x without hurting performance.
arXiv Detail & Related papers (2023-05-06T05:35:14Z)
- Sensitivity-Aware Visual Parameter-Efficient Fine-Tuning [91.5113227694443]
We propose a novel Sensitivity-aware visual Parameter-efficient fine-Tuning (SPT) scheme.
SPT allocates trainable parameters to task-specific important positions.
Experiments on a wide range of downstream recognition tasks show that our SPT is complementary to the existing PEFT methods.
arXiv Detail & Related papers (2023-03-15T12:34:24Z)
- Late Prompt Tuning: A Late Prompt Could Be Better Than Many Prompts [97.20933523766182]
Prompt tuning is a parameter-efficient tuning (PETuning) method for utilizing pre-trained models (PTMs).
We present Late Prompt Tuning (LPT), which inserts a late prompt into an intermediate layer of the PTM instead of the input layer or all layers.
We show that LPT can achieve competitive performance to full model tuning and other PETuning methods under both full-data and few-shot scenarios.
arXiv Detail & Related papers (2022-10-20T14:23:52Z)
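The prompt reparameterization idea that appears both in the APT abstract and in Residual Prompt Tuning above can be sketched as follows. This is an illustrative, hedged approximation of residual-style reparameterization (not either paper's exact formulation): each prompt embedding is passed through a small bottleneck MLP whose output is added back via a skip connection, which tends to stabilize training with respect to hyperparameter choices.

```python
import numpy as np

def residual_reparam(p, W1, b1, W2, b2):
    """Residual reparameterization of prompt embeddings: a bottleneck
    MLP plus a skip connection. After tuning, the MLP can be discarded
    by storing the final reparameterized prompts directly."""
    h = np.maximum(p @ W1 + b1, 0.0)   # ReLU bottleneck, (m, r)
    return p + h @ W2 + b2             # skip connection, (m, d)

rng = np.random.default_rng(1)
m, d, r = 4, 16, 8                     # prompts, embed dim, bottleneck width
p = rng.standard_normal((m, d))
W1, b1 = rng.standard_normal((d, r)) * 0.1, np.zeros(r)
W2, b2 = rng.standard_normal((r, d)) * 0.1, np.zeros(d)
out = residual_reparam(p, W1, b1, W2, b2)
print(out.shape)  # (4, 16)
```

Because the output has the same shape as the raw prompts, the reparameterization adds no inference cost once the tuned prompts are frozen and stored.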
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.