Delving into Parameter-Efficient Fine-Tuning in Code Change Learning: An Empirical Study
- URL: http://arxiv.org/abs/2402.06247v1
- Date: Fri, 9 Feb 2024 08:40:41 GMT
- Title: Delving into Parameter-Efficient Fine-Tuning in Code Change Learning: An Empirical Study
- Authors: Shuo Liu, Jacky Keung, Zhen Yang, Fang Liu, Qilin Zhou, Yihan Liao
- Abstract summary: PEFT has demonstrated superior performance and lower computational overhead in several code understanding tasks.
It harnesses the pre-trained general-purpose knowledge for downstream tasks.
It remains unclear whether PEFT outperforms FMFT in task-specific adaptation for code-change-related tasks.
- Score: 10.052053069122652
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Compared to Full-Model Fine-Tuning (FMFT), Parameter Efficient Fine-Tuning
(PEFT) has demonstrated superior performance and lower computational overhead
in several code understanding tasks, such as code summarization and code
search. This advantage can be attributed to PEFT's ability to alleviate the
catastrophic forgetting issue of Pre-trained Language Models (PLMs) by updating
only a small number of parameters. As a result, PEFT effectively harnesses the
pre-trained general-purpose knowledge for downstream tasks. However, existing
studies primarily involve static code comprehension, aligning with the
pre-training paradigm of recent PLMs and facilitating knowledge transfer, but
they do not account for dynamic code changes. Thus, it remains unclear whether
PEFT outperforms FMFT in task-specific adaptation for code-change-related
tasks. To address this question, we examine two prevalent PEFT methods, namely
Adapter Tuning (AT) and Low-Rank Adaptation (LoRA), and compare their
performance with FMFT on five popular PLMs. Specifically, we evaluate their
performance on two widely-studied code-change-related tasks: Just-In-Time
Defect Prediction (JIT-DP) and Commit Message Generation (CMG). The results
demonstrate that both AT and LoRA achieve state-of-the-art (SOTA) results in
JIT-DP and exhibit comparable performance in CMG when compared to FMFT and
other SOTA approaches. Furthermore, AT and LoRA exhibit superiority in
cross-lingual and low-resource scenarios. We also conduct three probing tasks
to explain the efficacy of PEFT techniques on JIT-DP and CMG tasks from both
static and dynamic perspectives. The study indicates that PEFT, particularly
through the use of AT and LoRA, offers promising advantages in
code-change-related tasks, surpassing FMFT in certain aspects.
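As a concrete illustration of the kind of PEFT setup the abstract compares, the sketch below applies LoRA to a code PLM with a classification head, as one would for a JIT-DP-style task. This is a minimal sketch assuming the HuggingFace transformers and peft libraries; the backbone (microsoft/codebert-base) and all hyperparameters are illustrative assumptions, not the configurations evaluated in the paper. Adapter Tuning (AT) follows the same freeze-the-backbone pattern but inserts small bottleneck modules instead of learning low-rank updates.

```python
# Minimal LoRA sketch for a JIT-DP-style binary classifier (illustrative only).
# Assumptions: HuggingFace transformers + peft; microsoft/codebert-base as a
# stand-in code PLM; hyperparameters are not those reported in the paper.
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model

model_name = "microsoft/codebert-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# LoRA freezes the backbone and learns low-rank updates to selected projections;
# only these small matrices (plus the classification head) are trained.
lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    r=8,                                # rank of the low-rank update
    lora_alpha=16,                      # scaling factor
    lora_dropout=0.1,
    target_modules=["query", "value"],  # RoBERTa-style attention projections
)
peft_model = get_peft_model(model, lora_config)
peft_model.print_trainable_parameters()  # typically around 1% of all weights

# A code change (e.g., a diff hunk) is tokenized and classified as
# defect-inducing or clean, exactly as under full-model fine-tuning.
inputs = tokenizer("- old line\n+ new line", return_tensors="pt", truncation=True)
logits = peft_model(**inputs).logits
```

The same recipe carries over to a CMG-style setup by swapping in a sequence-to-sequence backbone and TaskType.SEQ_2_SEQ_LM.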
Related papers
- Parameter-Efficient Fine-Tuning of Large Language Models for Unit Test Generation: An Empirical Study [3.5189934649278922]
Large language models (LLMs) like GitHub Copilot struggle with real-world tasks without fine-tuning.
This paper investigates full fine-tuning and various PEFT methods, including LoRA, (IA)3, and prompt tuning; a configuration sketch for these PEFT variants appears after this list.
Our findings show that PEFT methods can deliver performance comparable to full fine-tuning for unit test generation.
arXiv Detail & Related papers (2024-11-04T09:03:18Z)
- Preserving Pre-trained Representation Space: On Effectiveness of Prefix-tuning for Large Multi-modal Models [24.62337386603331]
Large Multi-modal Models (LMMs) are revolutionizing the way machines interact with the world.
To adapt LMMs for downstream tasks, parameter-efficient fine-tuning (PEFT) has gained popularity.
This paper examines the strengths and weaknesses of each tuning strategy, shifting attention away from the efficiency gains typically associated with these approaches.
arXiv Detail & Related papers (2024-10-29T07:55:50Z)
- Layer-wise Importance Matters: Less Memory for Better Performance in Parameter-efficient Fine-tuning of Large Language Models [19.163639128631534]
Importance-aware Sparse Tuning (IST) is a plug-and-play technique that operates on a per-layer basis and is compatible with various PEFT methods.
IST dynamically updates selected layers in PEFT modules, leading to reduced memory demands.
arXiv Detail & Related papers (2024-10-15T16:53:26Z)
- GIFT-SW: Gaussian noise Injected Fine-Tuning of Salient Weights for LLMs [51.02233412547456]
We introduce a novel PEFT method called Gaussian noise Injected Fine-Tuning of Salient Weights (GIFT-SW).
Our method updates only salient columns, while injecting Gaussian noise into non-salient ones.
Experiments with LLaMA models demonstrate that GIFT-SW outperforms full fine-tuning and modern PEFT methods under the same computational budget.
arXiv Detail & Related papers (2024-08-27T14:41:14Z)
- FactorLLM: Factorizing Knowledge via Mixture of Experts for Large Language Models [50.331708897857574]
We introduce FactorLLM, a novel approach that decomposes well-trained dense FFNs into sparse sub-networks without requiring any further modifications.
FactorLLM achieves performance comparable to the source model, retaining up to 85% of its performance while obtaining over a 30% increase in inference speed.
arXiv Detail & Related papers (2024-08-15T16:45:16Z)
- Exploring Parameter-Efficient Fine-Tuning of Large Language Model on Automated Program Repair [5.6679735367798925]
"Pre-training and fine-tuning" paradigm enables Large Language Models (LLMs) improve fixing capabilities on Automated Program Repair (APR)
We employ prompt engineering to create an instruction dataset, APR-INSTRUCTION, at first to fill this gap.
The best fine-tuned model fixes 58% more bugs than the state-of-the-art LLM-based APR techniques.
arXiv Detail & Related papers (2024-06-09T04:42:19Z)
- FeDeRA: Efficient Fine-tuning of Language Models in Federated Learning Leveraging Weight Decomposition [7.229494183462913]
Despite exceptional performance after fine-tuning, pre-trained language models (PLMs) face significant challenges due to privacy concerns.
We consider federated learning (FL) to fine-tune PLMs in this paper.
One promising solution is to integrate parameter-efficient fine-tuning (PEFT) into FL, which trains a much smaller set of parameters than full parameter fine-tuning (FFT).
arXiv Detail & Related papers (2024-04-29T16:42:26Z)
- PYRA: Parallel Yielding Re-Activation for Training-Inference Efficient Task Adaptation [61.57833648734164]
We propose a novel Parallel Yielding Re-Activation (PYRA) method for training-inference efficient task adaptation.
PYRA outperforms all competing methods under both low compression rate and high compression rate.
arXiv Detail & Related papers (2024-03-14T09:06:49Z)
- Strong Baselines for Parameter Efficient Few-Shot Fine-tuning [50.83426196335385]
Few-shot classification (FSC) entails learning novel classes given only a few examples per class after a pre-training (or meta-training) phase.
Recent works have shown that simply fine-tuning a pre-trained Vision Transformer (ViT) on new test classes is a strong approach for FSC.
Fine-tuning ViTs, however, is expensive in time, compute and storage.
This has motivated the design of parameter efficient fine-tuning (PEFT) methods which fine-tune only a fraction of the Transformer's parameters.
arXiv Detail & Related papers (2023-04-04T16:14:39Z)
- Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning [81.3514358542452]
Few-shot in-context learning (ICL) incurs substantial computational, memory, and storage costs because it involves processing all of the training examples every time a prediction is made.
Parameter-efficient fine-tuning offers an alternative paradigm where a small set of parameters is trained to enable a model to perform the new task.
In this paper, we rigorously compare few-shot ICL and parameter-efficient fine-tuning and demonstrate that the latter offers better accuracy as well as dramatically lower computational costs.
arXiv Detail & Related papers (2022-05-11T17:10:41Z)
- CPM-2: Large-scale Cost-effective Pre-trained Language Models [71.59893315671997]
We present a suite of cost-effective techniques for the use of PLMs to deal with the efficiency issues of pre-training, fine-tuning, and inference.
We introduce knowledge inheritance to accelerate the pre-training process by exploiting existing PLMs instead of training models from scratch.
We implement a new inference toolkit, namely InfMoE, for using large-scale PLMs with limited computational resources.
arXiv Detail & Related papers (2021-06-20T15:43:54Z)
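Several of the related papers above compare PEFT families beyond LoRA, notably (IA)3 and prompt tuning. The sketch below builds one configuration of each with the HuggingFace peft library to show what each variant actually trains; GPT-2 serves purely as a small stand-in backbone, and the module names and hyperparameters are illustrative assumptions rather than settings from any of the papers listed.

```python
# Minimal sketch of three PEFT variants mentioned in the related work:
# LoRA, (IA)3, and prompt tuning, configured with the HuggingFace peft library.
# GPT-2 is only a small stand-in backbone; all settings are illustrative.
from transformers import AutoModelForCausalLM
from peft import (LoraConfig, IA3Config, PromptTuningConfig,
                  TaskType, get_peft_model)

configs = {
    # LoRA: low-rank updates to the fused attention projection of GPT-2 blocks.
    "lora": LoraConfig(task_type=TaskType.CAUSAL_LM, r=8, lora_alpha=16,
                       target_modules=["c_attn"]),
    # (IA)3: learned rescaling vectors on attention and feed-forward activations.
    "ia3": IA3Config(task_type=TaskType.CAUSAL_LM,
                     target_modules=["c_attn", "mlp.c_proj"],
                     feedforward_modules=["mlp.c_proj"]),
    # Prompt tuning: a short sequence of trainable virtual tokens prepended to the input.
    "prompt": PromptTuningConfig(task_type=TaskType.CAUSAL_LM, num_virtual_tokens=20),
}

for name, cfg in configs.items():
    backbone = AutoModelForCausalLM.from_pretrained("gpt2")  # fresh copy per variant
    peft_model = get_peft_model(backbone, cfg)
    print(name)
    peft_model.print_trainable_parameters()  # each trains well under 1% of the weights
```

Printing the trainable-parameter counts makes the trade-off concrete: LoRA trains small low-rank matrices, (IA)3 trains only per-activation scaling vectors, and prompt tuning trains a handful of virtual token embeddings.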