A Comprehensive Evaluation of Parameter-Efficient Fine-Tuning on
Software Engineering Tasks
- URL: http://arxiv.org/abs/2312.15614v1
- Date: Mon, 25 Dec 2023 05:25:39 GMT
- Title: A Comprehensive Evaluation of Parameter-Efficient Fine-Tuning on
Software Engineering Tasks
- Authors: Wentao Zou and Qi Li and Jidong Ge and Chuanyi Li and Xiaoyu Shen and
Liguo Huang and Bin Luo
- Abstract summary: Pre-trained models (PTMs) have achieved great success in various Software Engineering (SE) downstream tasks.
A widely used solution is parameter-efficient fine-tuning (PEFT), which freezes PTMs while introducing extra parameters.
This paper aims to evaluate the effectiveness of five PEFT methods on eight PTMs and four SE downstream tasks.
- Score: 29.88525311985907
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Pre-trained models (PTMs) have achieved great success in various Software
Engineering (SE) downstream tasks following the ``pre-train then fine-tune''
paradigm. As fully fine-tuning all parameters of PTMs can be computationally
expensive, a widely used solution is parameter-efficient fine-tuning (PEFT),
which freezes PTMs while introducing extra parameters. Though work has been
done to test PEFT methods in the SE field, a comprehensive evaluation is still
lacking. This paper aims to fill in this gap by evaluating the effectiveness of
five PEFT methods on eight PTMs and four SE downstream tasks. For different
tasks and PEFT methods, we seek answers to the following research questions: 1)
Is it more effective to use PTMs trained specifically on source code, or is it
sufficient to use PTMs trained on natural language text? 2) What is the impact
of varying model sizes? 3) How does the model architecture affect the
performance? Besides effectiveness, we also discuss the efficiency of PEFT
methods, concerning the costs of required training time and GPU resource
consumption. We hope that our findings can provide a deeper understanding of
PEFT methods on various PTMs and SE downstream tasks. All the codes and data
are available at \url{https://github.com/zwtnju/PEFT.git}.
Related papers
- Parameter-Efficient Fine-Tuning of Large Language Models for Unit Test Generation: An Empirical Study [3.5189934649278922]
Large language models (LLMs) like GitHub Copilot struggle with real-world tasks without fine-tuning.
This paper investigates full fine-tuning and various PEFT methods, including LoRA, (IA)3, and prompt tuning.
Our findings show that PEFT methods can deliver performance comparable to full fine-tuning for unit test generation.
arXiv Detail & Related papers (2024-11-04T09:03:18Z) - BIPEFT: Budget-Guided Iterative Search for Parameter Efficient Fine-Tuning of Large Pretrained Language Models [63.52035708182815]
We introduce a novel Budget-guided Iterative search strategy for automatic PEFT (BIPEFT)
BIPEFT employs a new iterative search strategy to disentangle the binary module and rank dimension search spaces.
Extensive experiments on public benchmarks demonstrate the superior performance of BIPEFT for downstream tasks with a low parameter budget.
arXiv Detail & Related papers (2024-10-04T18:50:46Z) - Exploring Parameter-Efficient Fine-Tuning of Large Language Model on Automated Program Repair [5.6679735367798925]
"Pre-training and fine-tuning" paradigm enables Large Language Models (LLMs) improve fixing capabilities on Automated Program Repair (APR)
We employ prompt engineering to create an instruction dataset, APR-INSTRUCTION, at first to fill this gap.
The best fine-tuned model fixes 58% more bugs than the state-of-the-art LLM-based APR techniques.
arXiv Detail & Related papers (2024-06-09T04:42:19Z) - Light-PEFT: Lightening Parameter-Efficient Fine-Tuning via Early Pruning [17.032155725171958]
We propose the Light-PEFT framework, which includes two methods: Masked Early Pruning of the Foundation Model and Multi-Granularity Early Pruning of PEFT.
Compared to utilizing the PEFT method directly, Light-PEFT achieves training and inference speedup, reduces memory usage, and maintains comparable performance.
arXiv Detail & Related papers (2024-06-06T07:03:29Z) - From PEFT to DEFT: Parameter Efficient Finetuning for Reducing Activation Density in Transformers [52.199303258423306]
We propose a novel density loss that encourages higher activation sparsity in pre-trained models.
Our proposed method, textbfDEFT, can consistently reduce activation density by up to textbf44.94% on RoBERTa$_mathrmLarge$ and by textbf53.19% (encoder density) and textbf90.60% (decoder density) on Flan-T5$_mathrmXXL$.
arXiv Detail & Related papers (2024-02-02T21:25:46Z) - Parameter-Efficient Fine-Tuning Methods for Pretrained Language Models:
A Critical Review and Assessment [12.674032145667763]
We present a comprehensive and systematic review of Efficient Fine-Tuning (PEFT) methods for pretrained language models (PLMs)
PEFT offers an effective solution by reducing the number of fine-tuning parameters and memory usage while achieving comparable performance to full fine-tuning.
We conduct experiments using several representative PEFT methods to better understand their effectiveness in parameter efficiency and memory efficiency.
arXiv Detail & Related papers (2023-12-19T13:31:24Z) - ComPEFT: Compression for Communicating Parameter Efficient Updates via
Sparsification and Quantization [100.90624220423634]
We present ComPEFT, a novel method for compressing fine-tuning residuals (task vectors) of PEFT based models.
In extensive evaluation across T5, T0, and LLaMA-based models with 200M - 65B parameters, ComPEFT achieves compression ratios of 8x - 50x.
arXiv Detail & Related papers (2023-11-22T05:28:59Z) - Strong Baselines for Parameter Efficient Few-Shot Fine-tuning [50.83426196335385]
Few-shot classification (FSC) entails learning novel classes given only a few examples per class after a pre-training (or meta-training) phase.
Recent works have shown that simply fine-tuning a pre-trained Vision Transformer (ViT) on new test classes is a strong approach for FSC.
Fine-tuning ViTs, however, is expensive in time, compute and storage.
This has motivated the design of parameter efficient fine-tuning (PEFT) methods which fine-tune only a fraction of the Transformer's parameters.
arXiv Detail & Related papers (2023-04-04T16:14:39Z) - When does Parameter-Efficient Transfer Learning Work for Machine
Translation? [8.862707047517913]
Prior work indicates that PEFTs may not work as well for machine translation (MT)
We conduct a comprehensive empirical study of PEFTs for MT, considering (1) various parameter budgets, (2) a diverse set of language-pairs, and (3) different pre-trained models.
We find that using PEFTs with a larger pre-trained model outperforms full fine-tuning with a smaller model, and for smaller training data sizes, PEFTs outperform full fine-tuning for the same pre-trained model.
arXiv Detail & Related papers (2022-05-23T12:49:46Z) - Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than
In-Context Learning [81.3514358542452]
Few-shot in-context learning (ICL) incurs substantial computational, memory, and storage costs because it involves processing all of the training examples every time a prediction is made.
parameter-efficient fine-tuning offers an alternative paradigm where a small set of parameters are trained to enable a model to perform the new task.
In this paper, we rigorously compare few-shot ICL and parameter-efficient fine-tuning and demonstrate that the latter offers better accuracy as well as dramatically lower computational costs.
arXiv Detail & Related papers (2022-05-11T17:10:41Z) - CPM-2: Large-scale Cost-effective Pre-trained Language Models [71.59893315671997]
We present a suite of cost-effective techniques for the use of PLMs to deal with the efficiency issues of pre-training, fine-tuning, and inference.
We introduce knowledge inheritance to accelerate the pre-training process by exploiting existing PLMs instead of training models from scratch.
We implement a new inference toolkit, namely InfMoE, for using large-scale PLMs with limited computational resources.
arXiv Detail & Related papers (2021-06-20T15:43:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.