A Prompt Learning Framework for Source Code Summarization
- URL: http://arxiv.org/abs/2312.16066v1
- Date: Tue, 26 Dec 2023 14:37:55 GMT
- Title: A Prompt Learning Framework for Source Code Summarization
- Authors: Weisong Sun, Chunrong Fang, Yudu You, Yuchen Chen, Yi Liu,
Chong Wang, Jian Zhang, Quanjun Zhang, Hanwei Qian, Wei Zhao,
Yang Liu, and Zhenyu Chen
- Abstract summary: We propose a novel prompt learning framework for code summarization called PromptCS.
PromptCS trains a prompt agent that can generate continuous prompts to unleash the potential of LLMs in code summarization.
We evaluate PromptCS on the CodeSearchNet dataset involving multiple programming languages.
- Score: 24.33455799484519
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: (Source) code summarization is the task of automatically generating natural
language summaries for given code snippets. Such summaries play a key role in
helping developers understand and maintain source code. Recently, with the
successful application of large language models (LLMs) in numerous fields,
software engineering researchers have also attempted to adapt LLMs to solve
code summarization tasks. The main adaptation schemes include instruction
prompting and task-oriented fine-tuning. However, instruction prompting
involves designing crafted prompts for zero-shot learning or selecting
appropriate samples for few-shot learning and requires users to have
professional domain knowledge, while task-oriented fine-tuning incurs high
training costs. In this paper, we propose a novel prompt learning framework for
code summarization called PromptCS. PromptCS trains a prompt agent that can
generate continuous prompts to unleash the potential of LLMs in code
summarization. Compared to human-written discrete prompts, the continuous
prompts are produced under the guidance of LLMs and are therefore easier for
LLMs to understand. PromptCS freezes the parameters of the LLM when training the
prompt agent, which greatly reduces the training resources required.
We evaluate PromptCS on the CodeSearchNet dataset involving multiple
programming languages. The results show that PromptCS significantly outperforms
instruction prompting schemes on all four widely used metrics. On some base
LLMs, e.g., CodeGen-Multi-2B and StarCoderBase-1B and -3B, PromptCS even
outperforms the task-oriented fine-tuning scheme. More importantly, PromptCS
trains faster than the task-oriented fine-tuning scheme, with a more
pronounced advantage on larger LLMs. The results of the human evaluation
demonstrate that PromptCS generates more good summaries than the
baselines.
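The core idea in the abstract, training only a continuous prompt while the model's parameters stay frozen, can be illustrated with a toy sketch. This is not the PromptCS architecture: the "frozen model" here is a hypothetical fixed linear map over mean-pooled vectors, the prompt is a plain list of trainable vectors rather than the output of a prompt agent, and all names, sizes, and the squared-error loss are assumptions chosen to keep the example dependency-free.

```python
import random

random.seed(0)
DIM = 4          # embedding width (hypothetical)
PROMPT_LEN = 3   # number of continuous prompt vectors (hypothetical)

# Stand-in for a frozen LLM: a fixed linear map that is never updated.
FROZEN_W = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(DIM)]


def frozen_model(sequence):
    """Mean-pool the input vectors, then apply the frozen weights."""
    n = len(sequence)
    pooled = [sum(vec[j] for vec in sequence) / n for j in range(DIM)]
    return [sum(FROZEN_W[i][j] * pooled[j] for j in range(DIM))
            for i in range(DIM)]


def loss(prompt, code_embeds, target):
    """Squared error between the model output and a target vector."""
    out = frozen_model(prompt + code_embeds)
    return sum((o - t) ** 2 for o, t in zip(out, target))


def train_prompt(code_embeds, target, steps=200, lr=0.5):
    """Gradient descent on the prompt vectors only; FROZEN_W is untouched."""
    prompt = [[0.0] * DIM for _ in range(PROMPT_LEN)]
    n = PROMPT_LEN + len(code_embeds)
    for _ in range(steps):
        out = frozen_model(prompt + code_embeds)
        err = [o - t for o, t in zip(out, target)]
        # Analytic gradient of the loss w.r.t. each prompt vector:
        # (2 / n) * W^T err, identical for every prompt position.
        grad = [2.0 / n * sum(err[i] * FROZEN_W[i][j] for i in range(DIM))
                for j in range(DIM)]
        for vec in prompt:
            for j in range(DIM):
                vec[j] -= lr * grad[j]
    return prompt


if __name__ == "__main__":
    code_embeds = [[1.0, 0.0, 0.5, -0.5], [0.2, 0.3, -0.1, 0.4]]
    target = [0.5, -0.2, 0.1, 0.3]
    start = [[0.0] * DIM for _ in range(PROMPT_LEN)]
    learned = train_prompt(code_embeds, target)
    print("loss before:", loss(start, code_embeds, target))
    print("loss after: ", loss(learned, code_embeds, target))
```

Because only the prompt vectors receive updates, the number of trained values is PROMPT_LEN * DIM regardless of the size of the frozen model, which is the intuition behind the reduced training cost the abstract describes.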
Related papers
- zsLLMCode: An Effective Approach for Functional Code Embedding via LLM with Zero-Shot Learning [6.976968804436321]
Large language models (LLMs) have the capability of zero-shot learning, which does not require training or fine-tuning.
We propose zsLLMCode, a novel approach that generates functional code embeddings using LLMs.
arXiv Detail & Related papers (2024-09-23T01:03:15Z)
- What Should We Engineer in Prompts? Training Humans in Requirement-Driven LLM Use [30.933375576806156]
Existing prompt engineering instructions often lack focused training on requirement articulation.
We introduce Requirement-Oriented Prompt Engineering (ROPE), a paradigm that focuses human attention on generating clear, complete requirements.
In a randomized controlled experiment with 30 novices, ROPE significantly outperforms conventional prompt engineering training.
arXiv Detail & Related papers (2024-09-13T12:34:14Z)
- Efficient Prompting for LLM-based Generative Internet of Things [88.84327500311464]
Large language models (LLMs) have demonstrated remarkable capacities on various tasks, and integrating the capacities of LLMs into the Internet of Things (IoT) applications has drawn much research attention recently.
Due to security concerns, many institutions avoid accessing state-of-the-art commercial LLM services, requiring the deployment and utilization of open-source LLMs in a local network setting.
In this study, we propose an LLM-based Generative IoT (GIoT) system deployed in a local network setting.
arXiv Detail & Related papers (2024-06-14T19:24:00Z)
- If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents [81.60906807941188]
Large language models (LLMs) are trained on a combination of natural language and formal language (code).
Code translates high-level goals into executable steps, featuring standard syntax, logical consistency, abstraction, and modularity.
arXiv Detail & Related papers (2024-01-01T16:51:20Z)
- kNN-ICL: Compositional Task-Oriented Parsing Generalization with Nearest Neighbor In-Context Learning [50.40636157214161]
Task-Oriented Parsing (TOP) enables conversational assistants to interpret user commands expressed in natural language.
LLMs have achieved impressive performance in synthesizing computer programs based on a natural language prompt.
This paper focuses on harnessing the capabilities of LLMs for semantic parsing tasks.
arXiv Detail & Related papers (2023-12-17T17:26:50Z)
- Exploring Parameter-Efficient Fine-Tuning Techniques for Code Generation with Large Language Models [11.845239346943067]
Parameter-efficient fine-tuning (PEFT) is a promising approach to efficiently specialize large language models (LLMs) to task-specific data.
Our study highlights the potential for tuning larger LLMs and significant reductions in memory usage by combining PEFT with quantization.
arXiv Detail & Related papers (2023-08-21T04:31:06Z)
- SatLM: Satisfiability-Aided Language Models Using Declarative Prompting [68.40726892904286]
We propose a new satisfiability-aided language modeling (SatLM) approach for improving the reasoning capabilities of large language models (LLMs).
We use an LLM to generate a declarative task specification rather than an imperative program and leverage an off-the-shelf automated theorem prover to derive the final answer.
We evaluate SatLM on 8 different datasets and show that it consistently outperforms program-aided LMs in the imperative paradigm.
arXiv Detail & Related papers (2023-05-16T17:55:51Z)
- Low-code LLM: Graphical User Interface over Large Language Models [115.08718239772107]
This paper introduces a novel human-LLM interaction framework, Low-code LLM.
It incorporates six types of simple low-code visual programming interactions to achieve more controllable and stable responses.
We highlight three advantages of the low-code LLM: user-friendly interaction, controllable generation, and wide applicability.
arXiv Detail & Related papers (2023-04-17T09:27:40Z)
- RLPrompt: Optimizing Discrete Text Prompts With Reinforcement Learning [84.75064077323098]
This paper proposes RLPrompt, an efficient discrete prompt optimization approach with reinforcement learning (RL).
RLPrompt is flexibly applicable to different types of LMs, such as masked LMs (e.g., BERT) and left-to-right models (e.g., GPTs).
Experiments on few-shot classification and unsupervised text style transfer show superior performance over a wide range of existing finetuning or prompting methods.
arXiv Detail & Related papers (2022-05-25T07:50:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.