AAT: Adapting Audio Transformer for Various Acoustics Recognition Tasks
- URL: http://arxiv.org/abs/2401.10544v1
- Date: Fri, 19 Jan 2024 08:07:59 GMT
- Title: AAT: Adapting Audio Transformer for Various Acoustics Recognition Tasks
- Authors: Yun Liang, Hai Lin, Shaojian Qiu, Yihang Zhang
- Abstract summary: We propose an efficient fine-tuning approach based on Adapter tuning, namely AAT.
Our method achieves performance comparable to or even superior to full fine-tuning while optimizing only 7.118% of the parameters.
- Score: 4.789838330230841
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, Transformers have been introduced into the field of acoustics
recognition. They are pre-trained on large-scale datasets using methods such as
supervised learning and semi-supervised learning, demonstrating strong
generality: they fine-tune easily to downstream tasks and show robust
performance. However, the predominant fine-tuning method currently used is
still full fine-tuning, which involves updating all parameters during training.
This not only incurs significant memory usage and time costs but also
compromises the model's generality. Other fine-tuning methods either struggle
to address this issue or fail to achieve matching performance. Therefore, we
conducted a comprehensive analysis of existing fine-tuning methods and proposed
an efficient fine-tuning approach based on Adapter tuning, namely AAT. The core
idea is to freeze the audio Transformer model and insert extra learnable
Adapters, efficiently acquiring downstream task knowledge without compromising
the model's original generality. Extensive experiments have shown that our
method achieves performance comparable to or even superior to full fine-tuning
while optimizing only 7.118% of the parameters. It also demonstrates
superiority over other fine-tuning methods.
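The core idea stated in the abstract (freeze the audio Transformer and insert small learnable Adapter modules) can be illustrated with a minimal bottleneck adapter: a down-projection, a nonlinearity, an up-projection, and a residual connection. The sketch below is pure Python with hypothetical sizes and is not AAT's actual architecture; real implementations insert such modules after the attention/FFN sublayers of a frozen backbone and train only the adapter weights.

```python
# Minimal bottleneck-adapter sketch (illustrative only; sizes are hypothetical).
import math
import random

def matvec(w, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

class BottleneckAdapter:
    """Down-project -> ReLU -> up-project, plus a residual connection,
    so the adapter starts out close to the identity mapping."""
    def __init__(self, dim, bottleneck, seed=0):
        rng = random.Random(seed)
        scale = 1.0 / math.sqrt(dim)
        self.down = [[rng.uniform(-scale, scale) for _ in range(dim)]
                     for _ in range(bottleneck)]
        # Up-projection initialised to zero: at the start of training the
        # adapter passes the frozen model's features through unchanged.
        self.up = [[0.0] * bottleneck for _ in range(dim)]

    def forward(self, x):
        h = [max(0.0, v) for v in matvec(self.down, x)]  # ReLU bottleneck
        return [xi + ui for xi, ui in zip(x, matvec(self.up, h))]

adapter = BottleneckAdapter(dim=8, bottleneck=2)
features = [0.5] * 8  # stand-in for one frozen layer's output
out = adapter.forward(features)
```

The zero-initialised up-projection is a common design choice in adapter methods: training begins from exactly the frozen model's behaviour and only gradually learns a task-specific residual.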
Related papers
- iDAT: inverse Distillation Adapter-Tuning [15.485126287621439]
The Adapter-Tuning (AT) method involves freezing a pre-trained model and introducing trainable adapter modules to acquire downstream knowledge.
This paper proposes a distillation framework for the AT method instead of crafting a carefully designed adapter module.
arXiv Detail & Related papers (2024-03-23T07:36:58Z)
- Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation [67.13876021157887]
Dynamic Tuning (DyT) is a novel approach to improve both parameter and inference efficiency for ViT adaptation.
DyT achieves superior performance compared to existing PEFT methods while incurring only 71% of their FLOPs on the VTAB-1K benchmark.
arXiv Detail & Related papers (2024-03-18T14:05:52Z)
- Efficient Adapter Tuning of Pre-trained Speech Models for Automatic Speaker Verification [38.20393847192532]
Self-supervised speech models have shown impressive performance on various downstream speech tasks, but full fine-tuning is practically infeasible due to heavy computation and storage overhead.
We propose an effective adapter framework designed for adapting self-supervised speech models to the speaker verification task.
arXiv Detail & Related papers (2024-03-01T05:32:14Z)
- Approximated Prompt Tuning for Vision-Language Pre-trained Models [54.326232586461614]
In vision-language pre-trained models, prompt tuning often requires a large number of learnable tokens to bridge the gap between the pre-training and downstream tasks.
We propose a novel Approximated Prompt Tuning (APT) approach towards efficient VL transfer learning.
arXiv Detail & Related papers (2023-06-27T05:43:47Z)
- An Empirical Analysis of Parameter-Efficient Methods for Debiasing Pre-Trained Language Models [55.14405248920852]
We conduct experiments with prefix tuning, prompt tuning, and adapter tuning on different language models and bias types to evaluate their debiasing performance.
We find that the parameter-efficient methods are effective in mitigating gender bias, where adapter tuning is consistently the most effective.
We also find that prompt tuning is more suitable for GPT-2 than BERT, and that the methods are less effective at mitigating racial and religious bias.
arXiv Detail & Related papers (2023-06-06T23:56:18Z)
- Evaluating Parameter-Efficient Transfer Learning Approaches on SURE Benchmark for Speech Understanding [40.27182770995891]
Fine-tuning is widely used as the default algorithm for transfer learning from pre-trained models.
We introduce the Speech UndeRstanding Evaluation (SURE) benchmark for parameter-efficient learning for various speech-processing tasks.
arXiv Detail & Related papers (2023-03-02T08:57:33Z)
- Scaling & Shifting Your Features: A New Baseline for Efficient Model Tuning [126.84770886628833]
Existing finetuning methods either tune all parameters of the pretrained model (full finetuning) or only tune the last linear layer (linear probing).
We propose a new parameter-efficient finetuning method termed SSF, in which one only needs to Scale and Shift the deep Features extracted by a pre-trained model to match the performance of full finetuning.
arXiv Detail & Related papers (2022-10-17T08:14:49Z)
- Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning [81.3514358542452]
Few-shot in-context learning (ICL) incurs substantial computational, memory, and storage costs because it involves processing all of the training examples every time a prediction is made.
Parameter-efficient fine-tuning offers an alternative paradigm where a small set of parameters is trained to enable a model to perform the new task.
In this paper, we rigorously compare few-shot ICL and parameter-efficient fine-tuning and demonstrate that the latter offers better accuracy as well as dramatically lower computational costs.
arXiv Detail & Related papers (2022-05-11T17:10:41Z)
- Visual Prompt Tuning [74.5309408185523]
This paper introduces Visual Prompt Tuning (VPT) as an efficient and effective alternative to full fine-tuning for large-scale Transformer models in vision.
VPT introduces only a small amount (less than 1% of model parameters) of trainable parameters in the input space while keeping the model backbone frozen.
arXiv Detail & Related papers (2022-03-23T01:17:16Z)
- On the Effectiveness of Adapter-based Tuning for Pretrained Language Model Adaptation [36.37565646597464]
Adapter-based tuning works by adding lightweight adapter modules to a pretrained language model (PrLM).
It adds only a few trainable parameters per new task, allowing a high degree of parameter sharing.
We demonstrate that adapter-based tuning outperforms fine-tuning on low-resource and cross-lingual tasks.
arXiv Detail & Related papers (2021-06-06T16:10:12Z)
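A recurring metric across these papers is the fraction of parameters that are actually trained, e.g. AAT's 7.118% or VPT's "less than 1%". The arithmetic behind such figures is simply adapter parameters divided by total parameters. The sketch below uses hypothetical layer sizes chosen only to illustrate the calculation; it is not taken from any of the papers above.

```python
# Back-of-the-envelope trainable-parameter fraction for a frozen backbone
# with one bottleneck adapter per layer (all sizes are hypothetical).
def adapter_params(dim, bottleneck, layers):
    # two projection matrices (down and up) per adapter, one adapter per layer
    return layers * 2 * dim * bottleneck

backbone = 86_000_000  # roughly a ViT-Base-sized frozen backbone
trainable = adapter_params(dim=768, bottleneck=64, layers=12)
fraction = trainable / (backbone + trainable)
print(f"trainable fraction: {fraction:.3%}")
```

Shrinking the bottleneck width trades task capacity for a smaller trainable fraction, which is the central knob most adapter methods tune.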
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.