The Right Prompts for the Job: Repair Code-Review Defects with Large
Language Model
- URL: http://arxiv.org/abs/2312.17485v1
- Date: Fri, 29 Dec 2023 06:12:15 GMT
- Title: The Right Prompts for the Job: Repair Code-Review Defects with Large
Language Model
- Authors: Zelin Zhao, Zhaogui Xu, Jialong Zhu, Peng Di, Yuan Yao, Xiaoxing Ma
- Score: 15.885824575879763
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Automatic program repair (APR) techniques have the potential to reduce manual
efforts in uncovering and repairing program defects during the code review (CR)
process. However, the limited accuracy and considerable time costs associated
with existing APR approaches hinder their adoption in industrial practice. One
key factor is the under-utilization of review comments, which provide valuable
insights into defects and potential fixes. Recent advancements in Large
Language Models (LLMs) have enhanced their ability to comprehend natural and
programming languages, enabling them to generate patches based on review
comments. This paper conducts a comprehensive investigation into the effective
utilization of LLMs for repairing CR defects. In this study, various prompts
are designed and compared across mainstream LLMs using two distinct datasets
from human reviewers and automated checkers. Experimental results demonstrate a
remarkable repair rate of 72.97% with the best prompt, highlighting a
substantial improvement in the effectiveness and practicality of automatic
repair techniques.
Related papers
- On The Effectiveness of Dynamic Reduction Techniques in Automated Program Repair [1.7767466724342067]
We describe a program repair framework that effectively handles large-scale buggy programs of industrial complexity.
The framework exploits program reduction in the form of program slicing to eliminate parts of the code irrelevant to the bug being repaired.
Our empirical results on the widely used Defects4J dataset reveal that a substantial improvement in performance can be obtained without any degradation in repair quality.
arXiv Detail & Related papers (2024-06-23T21:35:07Z)
- CREF: An LLM-based Conversational Software Repair Framework for Programming Tutors [8.415004837059863]
It is crucial to recognize that existing repair benchmarks may have influenced LLM training data, potentially causing data leakage.
Our work assesses the repair performance of 12 LLMs on TutorCode, measuring repair correctness (TOP-5 and AVG-5) and patch precision (RPSR).
To fully harness LLMs' conversational capabilities and the benefits of augmented information, we introduce CREF, a novel conversational semi-automatic repair framework that assists human tutors.
arXiv Detail & Related papers (2024-06-20T03:36:34Z)
- A Case Study of LLM for Automated Vulnerability Repair: Assessing Impact of Reasoning and Patch Validation Feedback [7.742213291781287]
We present VRpilot, a vulnerability repair technique based on reasoning and patch validation feedback.
Our results show that VRpilot generates, on average, 14% and 7.6% more correct patches than the baseline techniques on C and Java.
arXiv Detail & Related papers (2024-05-24T16:29:48Z)
- How Far Can We Go with Practical Function-Level Program Repair? [12.195137917098041]
This paper investigates the effect of the few-shot learning mechanism and of auxiliary repair-relevant information on function-level APR.
We propose an LLM-based function-level APR technique, namely SRepair, which adopts a dual-LLM framework to leverage the power of the auxiliary repair-relevant information.
arXiv Detail & Related papers (2024-04-19T12:14:09Z)
- Towards Coarse-to-Fine Evaluation of Inference Efficiency for Large Language Models [95.96734086126469]
Large language models (LLMs) can serve as assistants that help users accomplish their jobs, and can also support the development of advanced applications.
For the wide application of LLMs, the inference efficiency is an essential concern, which has been widely studied in existing work.
We perform a detailed coarse-to-fine analysis of the inference performance of various code libraries.
arXiv Detail & Related papers (2024-04-17T15:57:50Z)
- An Empirical Evaluation of Pre-trained Large Language Models for Repairing Declarative Formal Specifications [5.395614997568524]
This paper presents a systematic investigation into the capacity of Large Language Models (LLMs) for repairing declarative specifications in Alloy.
We propose a novel repair pipeline that integrates a dual-agent LLM framework, comprising a Repair Agent and a Prompt Agent.
Our study reveals that LLMs, particularly GPT-4 variants, outperform existing techniques in terms of repair efficacy, albeit with a marginal increase in runtime and token usage.
arXiv Detail & Related papers (2024-04-17T03:46:38Z)
- A Novel Approach for Automatic Program Repair using Round-Trip Translation with Large Language Models [50.86686630756207]
Research shows that grammatical mistakes in a sentence can be corrected by translating it to another language and back.
Current generative models for Automatic Program Repair (APR) are pre-trained on source code and fine-tuned for repair.
This paper proposes bypassing the fine-tuning step and using Round-Trip Translation (RTT): translation of code from one programming language to another programming or natural language, and back.
arXiv Detail & Related papers (2024-01-15T22:36:31Z)
- Cross-modal Active Complementary Learning with Self-refining Correspondence [54.61307946222386]
We propose a Cross-modal Robust Complementary Learning framework (CRCL) to improve the robustness of existing methods.
ACL exploits active and complementary learning losses to reduce the risk of providing erroneous supervision.
SCC utilizes multiple self-refining processes with momentum correction to enlarge the receptive field for correcting correspondences.
arXiv Detail & Related papers (2023-10-26T15:15:11Z)
- Large Language Models Cannot Self-Correct Reasoning Yet [78.16697476530994]
Large Language Models (LLMs) have emerged as a groundbreaking technology with their unparalleled text generation capabilities.
Concerns persist regarding the accuracy and appropriateness of their generated content.
A contemporary methodology, self-correction, has been proposed as a remedy to these issues.
arXiv Detail & Related papers (2023-10-03T04:56:12Z)
- Automatically Correcting Large Language Models: Surveying the landscape of diverse self-correction strategies [104.32199881187607]
Large language models (LLMs) have demonstrated remarkable performance across a wide array of NLP tasks.
A promising approach to rectify these flaws is self-correction, where the LLM itself is prompted or guided to fix problems in its own output.
This paper presents a comprehensive review of this emerging class of techniques.
arXiv Detail & Related papers (2023-08-06T18:38:52Z)
- Editing Large Language Models: Problems, Methods, and Opportunities [51.903537096207]
This paper embarks on a deep exploration of the problems, methods, and opportunities related to model editing for LLMs.
We provide an exhaustive overview of the task definition and challenges associated with model editing, along with an in-depth empirical analysis of the most progressive methods currently at our disposal.
Our objective is to provide valuable insights into the effectiveness and feasibility of each editing technique, thereby assisting the community in making informed decisions on the selection of the most appropriate method for a specific task or context.
arXiv Detail & Related papers (2023-05-22T16:00:00Z)