Delving into Parameter-Efficient Fine-Tuning in Code Change Learning: An
Empirical Study
- URL: http://arxiv.org/abs/2402.06247v1
- Date: Fri, 9 Feb 2024 08:40:41 GMT
- Authors: Shuo Liu, Jacky Keung, Zhen Yang, Fang Liu, Qilin Zhou, Yihan Liao
- Abstract summary: PEFT has demonstrated superior performance and lower computational overhead in several code understanding tasks.
It harnesses the pre-trained general-purpose knowledge for downstream tasks.
It remains unclear whether PEFT outperforms FMFT in task-specific adaptation for code-change-related tasks.
- Score: 10.052053069122652
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Compared to Full-Model Fine-Tuning (FMFT), Parameter Efficient Fine-Tuning
(PEFT) has demonstrated superior performance and lower computational overhead
in several code understanding tasks, such as code summarization and code
search. This advantage can be attributed to PEFT's ability to alleviate the
catastrophic forgetting issue of Pre-trained Language Models (PLMs) by updating
only a small number of parameters. As a result, PEFT effectively harnesses the
pre-trained general-purpose knowledge for downstream tasks. However, existing
studies primarily involve static code comprehension, aligning with the
pre-training paradigm of recent PLMs and facilitating knowledge transfer, but
they do not account for dynamic code changes. Thus, it remains unclear whether
PEFT outperforms FMFT in task-specific adaptation for code-change-related
tasks. To address this question, we examine two prevalent PEFT methods, namely
Adapter Tuning (AT) and Low-Rank Adaptation (LoRA), and compare their
performance with FMFT on five popular PLMs. Specifically, we evaluate their
performance on two widely-studied code-change-related tasks: Just-In-Time
Defect Prediction (JIT-DP) and Commit Message Generation (CMG). The results
demonstrate that both AT and LoRA achieve state-of-the-art (SOTA) results in
JIT-DP and exhibit comparable performance in CMG when compared to FMFT and
other SOTA approaches. Furthermore, AT and LoRA exhibit superiority in
cross-lingual and low-resource scenarios. We also conduct three probing tasks
to explain the efficacy of PEFT techniques on JIT-DP and CMG tasks from both
static and dynamic perspectives. The study indicates that PEFT, particularly
through the use of AT and LoRA, offers promising advantages in
code-change-related tasks, surpassing FMFT in certain aspects.
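Both PEFT methods examined here, Adapter Tuning and LoRA, freeze the pre-trained weights and train only a small add-on. As a rough, dependency-free illustration of why LoRA updates so few parameters (hypothetical layer sizes; not the paper's implementation), a rank-r update replaces the full weight delta with the product of two thin matrices:

```python
# Minimal sketch of Low-Rank Adaptation (LoRA). W stays frozen; only the
# low-rank factors A and B are trained. Shapes here are illustrative.

def lora_param_counts(d_out, d_in, r):
    """Trainable parameters: full update of W vs. a rank-r LoRA update."""
    full = d_out * d_in          # FMFT updates every entry of W
    lora = d_out * r + r * d_in  # LoRA trains B (d_out x r) and A (r x d_in)
    return full, lora

def matmul(X, Y):
    """Plain-Python matrix product (no external dependencies)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def lora_forward(W, A, B, x, alpha, r):
    """Compute y = (W + (alpha/r) * B @ A) @ x with W frozen."""
    scale = alpha / r
    delta = [[scale * v for v in row] for row in matmul(B, A)]
    W_eff = [[w + d for w, d in zip(wr, dr)] for wr, dr in zip(W, delta)]
    return [sum(w * xi for w, xi in zip(row, x)) for row in W_eff]

# For a 768x768 projection at rank 8, the adapter trains roughly 2% of
# the parameters that full fine-tuning would update.
full, lora = lora_param_counts(768, 768, 8)
```

At inference time the product `B @ A` can be merged into `W`, so LoRA adds no latency, which is one reason it is attractive for the code-change tasks studied here.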
Related papers
- A Comprehensive Evaluation of Parameter-Efficient Fine-Tuning on Automated Program Repair [5.6679735367798925]
The "pre-training and fine-tuning" paradigm enables Large Language Models (LLMs) to improve their bug-fixing capabilities on Automated Program Repair (APR).
To fill this gap, we first employ prompt engineering to create an instruction dataset, APR-INSTRUCTION.
The best fine-tuned model fixes 58% more bugs than the state-of-the-art LLM-based APR techniques.
arXiv Detail & Related papers (2024-06-09T04:42:19Z) - CorDA: Context-Oriented Decomposition Adaptation of Large Language Models [101.81127587760831]
Current parameter-efficient fine-tuning methods build adapters without considering the context of downstream task to learn, or the context of important knowledge to maintain.
We propose CorDA, a Context-oriented Decomposition Adaptation method that builds learnable adapters from weight decomposition oriented by the context of downstream task or world knowledge.
Our knowledge-preserved adaptation not only achieves better performance than LoRA on fine-tuning tasks, but also mitigates the forgetting of world knowledge.
arXiv Detail & Related papers (2024-06-07T19:10:35Z) - FeDeRA:Efficient Fine-tuning of Language Models in Federated Learning Leveraging Weight Decomposition [7.229494183462913]
Despite exceptional performance after fine-tuning, pre-trained language models (PLMs) face significant challenges due to privacy concerns.
We consider federated learning (FL) to fine-tune PLMs in this paper.
One promising solution is to integrate parameter-efficient fine-tuning (PEFT) into FL, which trains a much smaller set of parameters than full-parameter fine-tuning (FFT).
arXiv Detail & Related papers (2024-04-29T16:42:26Z) - PYRA: Parallel Yielding Re-Activation for Training-Inference Efficient Task Adaptation [61.57833648734164]
We propose a novel Parallel Yielding Re-Activation (PYRA) method for training-inference efficient task adaptation.
PYRA outperforms all competing methods under both low compression rate and high compression rate.
arXiv Detail & Related papers (2024-03-14T09:06:49Z) - Sparse is Enough in Fine-tuning Pre-trained Large Language Models [98.46493578509039]
We propose a gradient-based sparse fine-tuning algorithm, named Sparse Increment Fine-Tuning (SIFT)
We validate its effectiveness on a range of tasks including the GLUE Benchmark and Instruction-tuning.
arXiv Detail & Related papers (2023-12-19T06:06:30Z) - Context-PEFT: Efficient Multi-Modal, Multi-Task Fine-Tuning [12.648711621637663]
This paper introduces a novel Parameter-Efficient Fine-Tuning (PEFT) framework for multi-modal, multi-task transfer learning with pre-trained language models.
We propose Context-PEFT, which learns different groups of adaptor parameters based on the token's domain.
Our method is evaluated on the captioning task, where it outperforms full fine-tuning under similar data constraints.
arXiv Detail & Related papers (2023-12-14T13:00:24Z) - MFTCoder: Boosting Code LLMs with Multitask Fine-Tuning [28.12788291168137]
We present a multi-task fine-tuning framework, MFTCoder, that enables simultaneous and parallel fine-tuning on multiple tasks.
Experiments have conclusively demonstrated that our multi-task fine-tuning approach outperforms both individual fine-tuning on single tasks and fine-tuning on a mixed ensemble of tasks.
arXiv Detail & Related papers (2023-11-04T02:22:40Z) - LLaMA-Reviewer: Advancing Code Review Automation with Large Language
Models through Parameter-Efficient Fine-Tuning [13.616908697637665]
We present LLaMA-Reviewer, an innovative framework that leverages the capabilities of LLaMA, a popular LLM, in the realm of code review.
This framework employs parameter-efficient fine-tuning (PEFT) methods, delivering high performance while using less than 1% of trainable parameters.
To foster continuous progress in this field, the code and all PEFT-weight plugins have been made open-source.
arXiv Detail & Related papers (2023-08-22T03:10:40Z) - Strong Baselines for Parameter Efficient Few-Shot Fine-tuning [50.83426196335385]
Few-shot classification (FSC) entails learning novel classes given only a few examples per class after a pre-training (or meta-training) phase.
Recent works have shown that simply fine-tuning a pre-trained Vision Transformer (ViT) on new test classes is a strong approach for FSC.
Fine-tuning ViTs, however, is expensive in time, compute and storage.
This has motivated the design of parameter efficient fine-tuning (PEFT) methods which fine-tune only a fraction of the Transformer's parameters.
arXiv Detail & Related papers (2023-04-04T16:14:39Z) - Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than
In-Context Learning [81.3514358542452]
Few-shot in-context learning (ICL) incurs substantial computational, memory, and storage costs because it involves processing all of the training examples every time a prediction is made.
Parameter-efficient fine-tuning offers an alternative paradigm in which a small set of parameters is trained to enable a model to perform the new task.
In this paper, we rigorously compare few-shot ICL and parameter-efficient fine-tuning and demonstrate that the latter offers better accuracy as well as dramatically lower computational costs.
arXiv Detail & Related papers (2022-05-11T17:10:41Z) - CPM-2: Large-scale Cost-effective Pre-trained Language Models [71.59893315671997]
We present a suite of cost-effective techniques for the use of PLMs to deal with the efficiency issues of pre-training, fine-tuning, and inference.
We introduce knowledge inheritance to accelerate the pre-training process by exploiting existing PLMs instead of training models from scratch.
We implement a new inference toolkit, namely InfMoE, for using large-scale PLMs with limited computational resources.
arXiv Detail & Related papers (2021-06-20T15:43:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.