Exploring Parameter-Efficient Fine-Tuning Techniques for Code Generation
with Large Language Models
- URL: http://arxiv.org/abs/2308.10462v2
- Date: Thu, 18 Jan 2024 15:37:33 GMT
- Title: Exploring Parameter-Efficient Fine-Tuning Techniques for Code Generation
with Large Language Models
- Authors: Martin Weyssow, Xin Zhou, Kisub Kim, David Lo and Houari Sahraoui
- Abstract summary: Large Language Models (LLMs) generate code snippets given natural language intents in zero-shot, i.e., without the need for specific fine-tuning.
Previous research explored In-Context Learning (ICL) as a strategy to guide the LLM generative process with task-specific prompt examples.
In this paper, we deliver a comprehensive study of PEFT techniques for LLMs under the automated code generation scenario.
- Score: 12.708117108874083
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) demonstrate impressive capabilities to generate
accurate code snippets given natural language intents in zero-shot, i.e.,
without the need for specific fine-tuning. While prior studies have highlighted
the advantages of fine-tuning LLMs, this process incurs high computational
costs, making it impractical in resource-scarce environments, particularly for
models with billions of parameters. To address these challenges, previous
research explored In-Context Learning (ICL) as a strategy to guide the LLM
generative process with task-specific prompt examples. However, ICL introduces
inconveniences, such as the need for designing contextually relevant prompts
and the absence of learning task-specific parameters, thereby limiting
downstream task performance. In this context, we foresee Parameter-Efficient
Fine-Tuning (PEFT) techniques as a promising approach to efficiently specialize
LLMs to task-specific data while maintaining reasonable resource consumption.
In this paper, we deliver a comprehensive study of PEFT techniques for LLMs
under the automated code generation scenario. Our comprehensive investigation
of PEFT techniques for LLMs reveals their superiority and potential over ICL
across a diverse set of LLMs. Additionally, we demonstrate the extended
capabilities of PEFT, showcasing its ability to learn from two distinct
datasets jointly without compromising performance. Furthermore, our study
highlights the potential for tuning larger LLMs and significant reductions in
memory usage by combining PEFT with quantization. Therefore, this study opens
opportunities for broader applications of PEFT in software engineering
scenarios. Our code is available at
https://github.com/martin-wey/peft-llm-code/.
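To make the PEFT-plus-quantization recipe described in the abstract concrete, the sketch below shows one common way to attach LoRA adapters to a 4-bit-quantized code LLM using the Hugging Face transformers and peft libraries. The model name, LoRA hyperparameters, and target modules are illustrative assumptions rather than the configuration used in the paper; the authors' exact setup is available in the linked repository.

```python
# Minimal sketch: LoRA fine-tuning of a causal code LLM loaded in 4-bit
# precision (QLoRA-style). All names and hyperparameters below are
# illustrative placeholders, not the paper's configuration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_name = "Salesforce/codegen-350M-mono"  # placeholder code LLM

# Load the base model with 4-bit quantization to reduce memory usage.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
)

# Prepare the quantized model for training and attach LoRA adapters,
# so only a small fraction of parameters is updated during fine-tuning.
model = prepare_model_for_kbit_training(model)
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
    target_modules=["qkv_proj"],  # architecture-specific; adjust per model
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # reports the trainable parameter fraction

# From here, the PEFT-wrapped model can be fine-tuned on natural-language-to-code
# pairs with a standard training loop or the Hugging Face Trainer.
```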
Related papers
- Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models [79.41139393080736]
Large language models (LLMs) have rapidly advanced and demonstrated impressive capabilities.
In-Context Learning (ICL) and Parameter-Efficient Fine-Tuning (PEFT) are currently two mainstream methods for augmenting LLMs to downstream tasks.
We propose Reference Trustable Decoding (RTD), a paradigm that allows models to quickly adapt to new tasks without fine-tuning.
arXiv Detail & Related papers (2024-09-30T10:48:20Z) - zsLLMCode: An Effective Approach for Functional Code Embedding via LLM with Zero-Shot Learning [6.976968804436321]
Large language models (LLMs) have the capability of zero-shot learning, which does not require training or fine-tuning.
We propose zsLLMCode, a novel approach that generates functional code embeddings using LLMs.
arXiv Detail & Related papers (2024-09-23T01:03:15Z) - SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning [70.21358720599821]
Large language models (LLMs) hold the promise of solving diverse tasks when provided with appropriate natural language prompts.
We propose SELF-GUIDE, a multi-stage mechanism in which we synthesize task-specific input-output pairs from the student LLM.
We report an absolute improvement of approximately 15% for classification tasks and 18% for generation tasks in the benchmark's metrics.
arXiv Detail & Related papers (2024-07-16T04:41:58Z) - Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning [53.6472920229013]
Large Language Models (LLMs) have demonstrated impressive capability in many natural language tasks.
However, LLMs are prone to producing errors, hallucinations, and inconsistent statements when performing multi-step reasoning.
We introduce Q*, a framework for guiding the decoding process of LLMs with deliberative planning.
arXiv Detail & Related papers (2024-06-20T13:08:09Z) - Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More? [54.667202878390526]
Long-context language models (LCLMs) have the potential to revolutionize our approach to tasks traditionally reliant on external tools like retrieval systems or databases.
We introduce LOFT, a benchmark of real-world tasks requiring context up to millions of tokens designed to evaluate LCLMs' performance on in-context retrieval and reasoning.
Our findings reveal LCLMs' surprising ability to rival state-of-the-art retrieval and RAG systems, despite never having been explicitly trained for these tasks.
arXiv Detail & Related papers (2024-06-19T00:28:58Z) - New Solutions on LLM Acceleration, Optimization, and Application [14.995654657013741]
Large Language Models (LLMs) have become extremely potent instruments with exceptional capacities for comprehending and producing human-like text in a range of applications.
However, the increasing size and complexity of LLMs present significant challenges in both training and deployment.
We provide a review of recent advancements and research directions aimed at addressing these challenges.
arXiv Detail & Related papers (2024-06-16T11:56:50Z) - Efficient Prompting for LLM-based Generative Internet of Things [88.84327500311464]
Large language models (LLMs) have demonstrated remarkable capacities on various tasks, and integrating the capacities of LLMs into the Internet of Things (IoT) applications has drawn much research attention recently.
Due to security concerns, many institutions avoid accessing state-of-the-art commercial LLM services, requiring the deployment and utilization of open-source LLMs in a local network setting.
In this study, we propose an LLM-based Generative IoT (GIoT) system deployed in a local network setting.
arXiv Detail & Related papers (2024-06-14T19:24:00Z) - Empirical Guidelines for Deploying LLMs onto Resource-constrained Edge Devices [32.61693246340064]
We study how a resource-constrained computing environment would affect the design choices for a personalized LLM.
We consider the tradeoffs among a number of key design factors and their intertwined impacts on learning efficiency and accuracy.
arXiv Detail & Related papers (2024-06-06T06:41:53Z) - Large Language Models Can Automatically Engineer Features for Few-Shot Tabular Learning [35.03338699349037]
We propose a novel in-context learning framework, FeatLLM, which employs Large Language Models as feature engineers.
FeatLLM generates high-quality rules, significantly (10% on average) outperforming alternatives such as TabLLM and STUNT.
arXiv Detail & Related papers (2024-04-15T06:26:08Z) - Towards Better Parameter-Efficient Fine-Tuning for Large Language
Models: A Position Paper [14.081178100662163]
This paper delves into the pressing need for Parameter-Efficient Fine-Tuning (PEFT) for Large Language Models (LLMs).
Our position paper highlights the current state of the field and the necessity of further study on the topic.
arXiv Detail & Related papers (2023-11-22T03:28:34Z) - FederatedScope-LLM: A Comprehensive Package for Fine-tuning Large
Language Models in Federated Learning [70.38817963253034]
This paper first discusses the challenges of federated fine-tuning of LLMs, and introduces our package FS-LLM as a main contribution.
We provide comprehensive federated parameter-efficient fine-tuning algorithm implementations and versatile programming interfaces for future extension in FL scenarios.
We conduct extensive experiments to validate the effectiveness of FS-LLM and benchmark advanced LLMs with state-of-the-art parameter-efficient fine-tuning algorithms in FL settings.
arXiv Detail & Related papers (2023-09-01T09:40:36Z)