To Adapt or to Fine-tune: A Case Study on Abstractive Summarization
- URL: http://arxiv.org/abs/2208.14559v1
- Date: Tue, 30 Aug 2022 22:48:28 GMT
- Title: To Adapt or to Fine-tune: A Case Study on Abstractive Summarization
- Authors: Zheng Zhao and Pinzhen Chen
- Abstract summary: Recent advances in the field of abstractive summarization leverage pre-trained language models rather than train a model from scratch.
Such models are sluggish to train and accompanied by a massive overhead.
It remains uncertain whether using adapters benefits the task of summarization, in terms of improved efficiency without an unpleasant sacrifice in performance.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in the field of abstractive summarization leverage
pre-trained language models rather than train a model from scratch. However,
such models are sluggish to train and accompanied by a massive overhead.
Researchers have proposed a few lightweight alternatives such as smaller
adapters to mitigate the drawbacks. Nonetheless, it remains uncertain whether
using adapters benefits the task of summarization, in terms of improved
efficiency without an unpleasant sacrifice in performance. In this work, we
carry out multifaceted investigations on fine-tuning and adapters for
summarization tasks with varying complexity: language, domain, and task
transfer. In our experiments, fine-tuning a pre-trained language model
generally attains a better performance than using adapters; the performance gap
positively correlates with the amount of training data used. Notably, adapters
exceed fine-tuning under extremely low-resource conditions. We further provide
insights on multilinguality, model convergence, and robustness, hoping to shed
light on the pragmatic choice of fine-tuning or adapters in abstractive
summarization.
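For readers unfamiliar with the adapter approach compared here, the sketch below illustrates the general idea in PyTorch: a small bottleneck module inserted into an otherwise frozen pre-trained model, versus full fine-tuning where every weight is updated. This is a minimal sketch of the standard bottleneck-adapter recipe, not the authors' exact configuration; the class name, dimensions, and the `mark_trainable` helper are illustrative assumptions.
```python
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """A standard bottleneck adapter: down-project, non-linearity,
    up-project, with a residual connection around the block."""

    def __init__(self, hidden_size: int, bottleneck_size: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck_size)
        self.up = nn.Linear(bottleneck_size, hidden_size)
        self.act = nn.GELU()

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # The residual path keeps the pre-trained representation intact at initialization.
        return hidden_states + self.up(self.act(self.down(hidden_states)))

def mark_trainable(model: nn.Module, use_adapters: bool) -> None:
    """Full fine-tuning updates every parameter; adapter tuning freezes the
    pre-trained weights and updates only parameters whose name contains
    'adapter' (a naming convention assumed here for illustration)."""
    for name, param in model.named_parameters():
        param.requires_grad = (not use_adapters) or ("adapter" in name)
```
With adapters, the optimizer touches only a small fraction of the parameters (roughly 2 x hidden_size x bottleneck_size per inserted module), which is the efficiency argument the paper weighs against the accuracy of full fine-tuning.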
Related papers
- Adaptive Adapter Routing for Long-Tailed Class-Incremental Learning [55.384428765798496]
Newly arriving data, such as e-commerce platform reviews, often exhibits a long-tailed distribution.
This necessitates continually learning from imbalanced data without forgetting.
We introduce AdaPtive Adapter RouTing (APART) as an exemplar-free solution for LTCIL.
arXiv Detail & Related papers (2024-09-11T17:52:00Z)
- Efficient Adapter Tuning of Pre-trained Speech Models for Automatic Speaker Verification [38.20393847192532]
Self-supervised speech models have shown impressive performance on various downstream speech tasks.
However, fine-tuning becomes practically infeasible due to heavy computation and storage overhead.
We propose an effective adapter framework designed for adapting self-supervised speech models to the speaker verification task.
arXiv Detail & Related papers (2024-03-01T05:32:14Z)
- Efficient Adaptation of Large Vision Transformer via Adapter Re-Composing [8.88477151877883]
High-capacity pre-trained models have revolutionized problem-solving in computer vision.
We propose a novel Adapter Re-Composing (ARC) strategy that addresses efficient pre-trained model adaptation.
Our approach considers the reusability of adaptation parameters and introduces a parameter-sharing scheme.
arXiv Detail & Related papers (2023-10-10T01:04:15Z)
- A Comprehensive Analysis of Adapter Efficiency [20.63580880344425]
We show that for Natural Language Understanding (NLU) tasks, the parameter efficiency in adapters does not translate to efficiency gains compared to full fine-tuning of models.
We recommend that for moderately sized models for NLU tasks, practitioners should rely on full fine-tuning or multi-task training rather than using adapters.
arXiv Detail & Related papers (2023-05-12T14:05:45Z)
- eP-ALM: Efficient Perceptual Augmentation of Language Models [70.47962271121389]
We propose to direct effort toward efficient adaptation of existing models, and to augment Language Models with perception.
Existing approaches for adapting pretrained models for vision-language tasks still rely on several key components that hinder their efficiency.
We show that by freezing more than 99% of total parameters, training only one linear projection layer, and prepending only one trainable token, our approach (dubbed eP-ALM) significantly outperforms other baselines on VQA and Captioning; a rough sketch of this recipe follows this list.
arXiv Detail & Related papers (2023-03-20T19:20:34Z)
- Towards Efficient Visual Adaption via Structural Re-parameterization [76.57083043547296]
We propose RepAdapter, a parameter-efficient and computationally friendly adapter for giant vision models.
RepAdapter outperforms full tuning by +7.2% on average and saves up to 25% training time, 20% GPU memory, and 94.6% of the storage cost of ViT-B/16 on VTAB-1k.
arXiv Detail & Related papers (2023-02-16T06:14:15Z)
- AdapterBias: Parameter-efficient Token-dependent Representation Shift for Adapters in NLP Tasks [55.705355299065474]
Transformer-based pre-trained models with millions of parameters require large storage.
Recent approaches tackle this shortcoming by training adapters, but these approaches still require a relatively large number of parameters.
This study proposes AdapterBias, a surprisingly simple yet effective adapter architecture.
arXiv Detail & Related papers (2022-04-30T16:49:41Z)
- Efficient Test Time Adapter Ensembling for Low-resource Language Varieties [115.12997212870962]
Specialized language and task adapters have been proposed to facilitate cross-lingual transfer of multilingual pretrained models.
An intuitive solution is to use a related language adapter for the new language variety, but we observe that this solution can lead to sub-optimal performance.
In this paper, we aim to improve the robustness of language adapters to uncovered languages without training new adapters.
arXiv Detail & Related papers (2021-09-10T13:44:46Z)
- AdapterDrop: On the Efficiency of Adapters in Transformers [53.845909603631945]
Massively pre-trained transformer models are computationally expensive to fine-tune, slow for inference, and have large storage requirements.
Recent approaches tackle these shortcomings by training smaller models, by dynamically reducing model size, and by training light-weight adapters.
arXiv Detail & Related papers (2020-10-22T17:49:42Z)
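The eP-ALM entry above gives the most concrete recipe in this list: freeze almost everything, train one linear projection, and prepend one trainable token. The sketch below shows that general pattern in PyTorch. It is a heavily simplified illustration under stated assumptions, not the eP-ALM architecture: the encoder and language-model interfaces, dimensions, and the way the visual feature is injected are all hypothetical.
```python
import torch
import torch.nn as nn

class FrozenBackboneAdapter(nn.Module):
    """Illustrative frozen-backbone setup: a pre-trained language model and a
    pre-trained visual encoder stay frozen; only a linear projection and one
    prepended learnable token are trained (loosely following the recipe
    summarized for eP-ALM above, not its actual implementation)."""

    def __init__(self, language_model: nn.Module, visual_encoder: nn.Module,
                 vision_dim: int, lm_dim: int):
        super().__init__()
        self.language_model = language_model
        self.visual_encoder = visual_encoder
        for param in self.language_model.parameters():
            param.requires_grad = False          # freeze the language model
        for param in self.visual_encoder.parameters():
            param.requires_grad = False          # freeze the visual encoder
        # The only trainable pieces: a projection into the language model's
        # embedding space and a single learnable "soft" token.
        self.projection = nn.Linear(vision_dim, lm_dim)
        self.soft_token = nn.Parameter(torch.zeros(1, 1, lm_dim))

    def forward(self, image: torch.Tensor, text_embeds: torch.Tensor) -> torch.Tensor:
        with torch.no_grad():
            visual_feats = self.visual_encoder(image)               # (batch, vision_dim)
        visual_embed = self.projection(visual_feats).unsqueeze(1)   # (batch, 1, lm_dim)
        prefix = self.soft_token.expand(text_embeds.size(0), -1, -1)
        # Prepend the trainable token and the projected visual feature to the
        # text embeddings, then run the frozen language model on the sequence.
        inputs = torch.cat([prefix, visual_embed, text_embeds], dim=1)
        return self.language_model(inputs)
```
Only `projection` and `soft_token` receive gradients, so the number of trainable parameters stays a tiny fraction of the backbone, which is the common thread running through the adapter-style papers listed above.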