Related papers: Parameter-Efficient Fine-Tuning of Large Language Models for Unit Test Generation: An Empirical Study

Parameter-Efficient Fine-Tuning of Large Language Models for Unit Test Generation: An Empirical Study

URL: http://arxiv.org/abs/2411.02462v1
Date: Mon, 04 Nov 2024 09:03:18 GMT
Title: Parameter-Efficient Fine-Tuning of Large Language Models for Unit Test Generation: An Empirical Study
Authors: André Storhaug, Jingyue Li,
Abstract summary: Large language models (LLMs) like GitHub Copilot struggle with real-world tasks without fine-tuning. This paper investigates full fine-tuning and various PEFT methods, including LoRA, (IA)3, and prompt tuning. Our findings show that PEFT methods can deliver performance comparable to full fine-tuning for unit test generation.
Score: 3.5189934649278922
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The advent of large language models (LLMs) like GitHub Copilot has significantly enhanced programmers' productivity, particularly in code generation. However, these models often struggle with real-world tasks without fine-tuning. As LLMs grow larger and more performant, fine-tuning for specialized tasks becomes increasingly expensive. Parameter-efficient fine-tuning (PEFT) methods, which fine-tune only a subset of model parameters, offer a promising solution by reducing the computational costs of tuning LLMs while maintaining their performance. Existing studies have explored using PEFT and LLMs for various code-related tasks and found that the effectiveness of PEFT techniques is task-dependent. The application of PEFT techniques in unit test generation remains underexplored. The state-of-the-art is limited to using LLMs with full fine-tuning to generate unit tests. This paper investigates both full fine-tuning and various PEFT methods, including LoRA, (IA)^3, and prompt tuning, across different model architectures and sizes. We use well-established benchmark datasets to evaluate their effectiveness in unit test generation. Our findings show that PEFT methods can deliver performance comparable to full fine-tuning for unit test generation, making specialized fine-tuning more accessible and cost-effective. Notably, prompt tuning is the most effective in terms of cost and resource utilization, while LoRA approaches the effectiveness of full fine-tuning in several cases.

Related papers

From LLMs to Edge: Parameter-Efficient Fine-Tuning on Edge Devices [3.4233698915405544]
This paper benchmarks and analyzes popular PEFT methods on convolutional architectures typically deployed in resource-constrained edge environments.<n>We find that the evaluated PEFT methods are only half as memory-efficient when applied to depthwise-separable convolution architectures.
arXiv Detail & Related papers (2025-07-31T13:23:21Z)
The Impact of Fine-tuning Large Language Models on Automated Program Repair [5.868532677577195]
Automated Program Repair (APR) uses various tools and techniques to help developers achieve functional and error-free code faster.<n>Large Language Models (LLMs) have gained popularity as components in APR tool chains because of their performance and flexibility.<n>Fine-tuning techniques have been developed to adapt pre-trained LLMs to specific tasks, such as APR, and enhance their performance at far lower computational costs than training from scratch.
arXiv Detail & Related papers (2025-07-26T10:42:08Z)
CoLA: Collaborative Low-Rank Adaptation [3.421904493396495]
Fine-tuning a pre-trained model for specific tasks achieves strong performance; however, it is computationally expensive and inefficient.<n>LoRA, in particular, has proven effective, but its application to multi-task scenarios is limited by interference between tasks.<n>We propose CoLA, a more flexible LoRA architecture and three collaborative strategies to enhance performance by better utilizing the quantitative relationships between $A$ and $B$.
arXiv Detail & Related papers (2025-05-21T12:46:42Z)
A Comprehensive Evaluation of Parameter-Efficient Fine-Tuning on Method-Level Code Smell Detection [11.9757082688031]
Existing detection methods, relying on Codes or Machine Learning (ML) and Deep Learning (DL) techniques, often face limitations such as unsatisfactory performance. This study evaluates state-of-the-art PEFT methods on both small and large Language Models for detecting two types of method-level code smells: Complex Conditional and Complex Method. Results show that PEFT methods achieve comparable or better performance than full fine-tuning while consuming less GPU memory.
arXiv Detail & Related papers (2024-12-18T12:48:36Z)
Layer-wise Importance Matters: Less Memory for Better Performance in Parameter-efficient Fine-tuning of Large Language Models [19.163639128631534]
Importance-aware Sparse Tuning (IST) is a plug-and-play technique compatible with various PEFT methods that operate on a per-layer basis. IST dynamically updates selected layers in PEFT modules, leading to reduced memory demands.
arXiv Detail & Related papers (2024-10-15T16:53:26Z)
Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models [79.41139393080736]
Large language models (LLMs) have rapidly advanced and demonstrated impressive capabilities. In-Context Learning (ICL) and. Efficient Fine-Tuning (PEFT) are currently two mainstream methods for augmenting. LLMs to downstream tasks. We propose Reference Trustable Decoding (RTD), a paradigm that allows models to quickly adapt to new tasks without fine-tuning.
arXiv Detail & Related papers (2024-09-30T10:48:20Z)
Exploring Parameter-Efficient Fine-Tuning of Large Language Model on Automated Program Repair [5.6679735367798925]
"Pre-training and fine-tuning" paradigm enables Large Language Models (LLMs) improve fixing capabilities on Automated Program Repair (APR) We employ prompt engineering to create an instruction dataset, APR-INSTRUCTION, at first to fill this gap. The best fine-tuned model fixes 58% more bugs than the state-of-the-art LLM-based APR techniques.
arXiv Detail & Related papers (2024-06-09T04:42:19Z)
Empirical Studies of Parameter Efficient Methods for Large Language Models of Code and Knowledge Transfer to R [1.9799527196428242]
We evaluate PEFT methods, LoRA, Compacter, and IA3 on Large Language Models for code summarization and generation. Our experiments reveal that LoRA consistently outperforms Compacter and IA3 in all settings. Our study can direct future research in developing code intelligent tasks for unseen languages including R.
arXiv Detail & Related papers (2024-03-16T03:12:45Z)
Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models [90.14693869269519]
MoE LLMs can achieve higher performance with fewer parameters, but it is still hard to deploy them due to their immense parameter sizes. This paper mainly aims to enhance the deployment efficiency of MoE LLMs by introducing plug-and-play expert-level sparsification techniques.
arXiv Detail & Related papers (2024-02-22T18:56:07Z)
LoRETTA: Low-Rank Economic Tensor-Train Adaptation for Ultra-Low-Parameter Fine-Tuning of Large Language Models [20.5908375260123]
Various parameter-efficient fine-tuning (PEFT) techniques have been proposed to enable computationally efficient fine-tuning while maintaining model performance. We present LoRETTA, a framework that significantly reduces trainable parameters through tensor-train decomposition. LoRETTA achieves comparable or better performance than most widely used PEFT methods with up to $100times$ fewer parameters on the LLaMA-2-7B models.
arXiv Detail & Related papers (2024-02-18T01:20:00Z)
CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets [75.64181719386497]
We present CRAFT, a tool creation and retrieval framework for large language models (LLMs) It creates toolsets specifically curated for the tasks and equips LLMs with a component that retrieves tools from these sets to enhance their capability to solve complex tasks. Our method is designed to be flexible and offers a plug-and-play approach to adapt off-the-shelf LLMs to unseen domains and modalities, without any finetuning.
arXiv Detail & Related papers (2023-09-29T17:40:26Z)
LLaMA-Reviewer: Advancing Code Review Automation with Large Language Models through Parameter-Efficient Fine-Tuning [13.616908697637665]
We present LLaMA-Reviewer, an innovative framework that leverages the capabilities of LLaMA, a popular LLM, in the realm of code review. This framework employs parameter-efficient fine-tuning (PEFT) methods, delivering high performance while using less than 1% of trainable parameters. To foster continuous progress in this field, the code and all PEFT-weight plugins have been made open-source.
arXiv Detail & Related papers (2023-08-22T03:10:40Z)
Exploring Parameter-Efficient Fine-Tuning Techniques for Code Generation with Large Language Models [12.708117108874083]
Large Language Models (LLMs) generate code snippets given natural language intents in zero-shot, i.e., without the need for specific fine-tuning. Previous research explored In-Context Learning (ICL) as a strategy to guide the LLM generative process with task-specific prompt examples. In this paper, we deliver a comprehensive study of. PEFT techniques for LLMs under the automated code generation scenario.
arXiv Detail & Related papers (2023-08-21T04:31:06Z)
Cheaply Evaluating Inference Efficiency Metrics for Autoregressive Transformer APIs [66.30706841821123]
Large language models (LLMs) power many state-of-the-art systems in natural language processing. LLMs are extremely computationally expensive, even at inference time. We propose a new metric for comparing inference efficiency across models.
arXiv Detail & Related papers (2023-05-03T21:51:42Z)
UniPELT: A Unified Framework for Parameter-Efficient Language Model Tuning [64.638804236566]
We propose a unified framework, UniPELT, which incorporates different PELT methods as submodules and learns to activate the ones that best suit the current data or task setup. Remarkably, on the GLUE benchmark, UniPELT consistently achieves 13pt gains compared to the best individual PELT method that it incorporates and even outperforms fine-tuning under different setups.
arXiv Detail & Related papers (2021-10-14T17:40:08Z)
CPM-2: Large-scale Cost-effective Pre-trained Language Models [71.59893315671997]
We present a suite of cost-effective techniques for the use of PLMs to deal with the efficiency issues of pre-training, fine-tuning, and inference. We introduce knowledge inheritance to accelerate the pre-training process by exploiting existing PLMs instead of training models from scratch. We implement a new inference toolkit, namely InfMoE, for using large-scale PLMs with limited computational resources.
arXiv Detail & Related papers (2021-06-20T15:43:54Z)

This list is automatically generated from the titles and abstracts of the papers in this site.