Conversational Automated Program Repair
- URL: http://arxiv.org/abs/2301.13246v1
- Date: Mon, 30 Jan 2023 19:22:36 GMT
- Title: Conversational Automated Program Repair
- Authors: Chunqiu Steven Xia, Lingming Zhang
- Abstract summary: We propose a new paradigm for program repair that alternates between patch generation and validation in a conversational manner.
We leverage the long-term context window of Large Pre-Trained Language Models to not only avoid generating previously incorrect patches but also incorporate validation feedback to help the model understand the semantic meaning of the program under test.
- Score: 10.071615423169902
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automated Program Repair (APR) can help developers automatically generate
patches for bugs. Due to the impressive performance obtained using Large
Pre-Trained Language Models (LLMs) on many code related tasks, researchers have
started to directly use LLMs for APR. However, prior approaches simply
repeatedly sample the LLM given the same constructed input/prompt created from
the original buggy code, which not only leads to generating the same incorrect
patches repeatedly but also miss the critical information in testcases. To
address these limitations, we propose conversational APR, a new paradigm for
program repair that alternates between patch generation and validation in a
conversational manner. In conversational APR, we iteratively build the input to
the model by combining previously generated patches with validation feedback.
As such, we leverage the long-term context window of LLMs to not only avoid
generating previously incorrect patches but also incorporate validation
feedback to help the model understand the semantic meaning of the program under
test. We evaluate 10 different LLM including the newly developed ChatGPT model
to demonstrate the improvement of conversational APR over the prior LLM for APR
approach.
Related papers
- Revisiting Unnaturalness for Automated Program Repair in the Era of Large Language Models [9.454475517867817]
We propose a patch-naturalness measurement, entropy-delta, to improve the efficiency of template-based repair techniques.
Our proposed method can rank correct patches more effectively than state-of-the-art machine learning tools.
arXiv Detail & Related papers (2024-04-23T17:12:45Z) - Aligning LLMs for FL-free Program Repair [14.935596175148586]
This paper investigates a new approach to adapt large language models (LLMs) to program repair.
Our core insight is that LLM's APR capability can be greatly improved by simply aligning the output to their training objective.
Based on this insight, we designed D4C, a straightforward prompting framework for APR.
arXiv Detail & Related papers (2024-04-13T02:36:40Z) - ContrastRepair: Enhancing Conversation-Based Automated Program Repair
via Contrastive Test Case Pairs [23.419180504723546]
ContrastRepair is a novel APR approach that augments conversation-driven APR by providing contrastive test pairs.
We evaluate ContrastRepair on multiple benchmark datasets, including Defects4j, QuixBugs, and HumanEval-Java.
arXiv Detail & Related papers (2024-03-04T12:15:28Z) - A Novel Approach for Automatic Program Repair using Round-Trip
Translation with Large Language Models [50.86686630756207]
Research shows that grammatical mistakes in a sentence can be corrected by translating it to another language and back.
Current generative models for Automatic Program Repair (APR) are pre-trained on source code and fine-tuned for repair.
This paper proposes bypassing the fine-tuning step and using Round-Trip Translation (RTT): translation of code from one programming language to another programming or natural language, and back.
arXiv Detail & Related papers (2024-01-15T22:36:31Z) - Which Syntactic Capabilities Are Statistically Learned by Masked
Language Models for Code? [51.29970742152668]
We highlight relying on accuracy-based measurements may lead to an overestimation of models' capabilities.
To address these issues, we introduce a technique called SyntaxEval in Syntactic Capabilities.
arXiv Detail & Related papers (2024-01-03T02:44:02Z) - RAP-Gen: Retrieval-Augmented Patch Generation with CodeT5 for Automatic
Program Repair [75.40584530380589]
We propose a novel Retrieval-Augmented Patch Generation framework (RAP-Gen)
RAP-Gen explicitly leveraging relevant fix patterns retrieved from a list of previous bug-fix pairs.
We evaluate RAP-Gen on three benchmarks in two programming languages, including the TFix benchmark in JavaScript, and Code Refinement and Defects4J benchmarks in Java.
arXiv Detail & Related papers (2023-09-12T08:52:56Z) - Keep the Conversation Going: Fixing 162 out of 337 bugs for $0.42 each
using ChatGPT [10.071615423169902]
Automated Program Repair (APR) aims to automatically generate patches for buggy programs.
Recent APR work has been focused on leveraging modern Large Language Models (LLMs) to directly generate patches for APR.
We propose ChatRepair, the first fully automated conversation-driven APR approach.
arXiv Detail & Related papers (2023-04-01T20:57:33Z) - Check Your Facts and Try Again: Improving Large Language Models with
External Knowledge and Automated Feedback [127.75419038610455]
Large language models (LLMs) are able to generate human-like, fluent responses for many downstream tasks.
This paper proposes a LLM-Augmenter system, which augments a black-box LLM with a set of plug-and-play modules.
arXiv Detail & Related papers (2023-02-24T18:48:43Z) - Guiding Large Language Models via Directional Stimulus Prompting [114.84930073977672]
We introduce Directional Stimulus Prompting, a novel framework for guiding black-box large language models (LLMs) toward specific desired outputs.
Instead of directly adjusting LLMs, our method employs a small tunable policy model to generate an auxiliary directional stimulus prompt for each input instance.
arXiv Detail & Related papers (2023-02-22T17:44:15Z) - CodeRL: Mastering Code Generation through Pretrained Models and Deep
Reinforcement Learning [92.36705236706678]
"CodeRL" is a new framework for program synthesis tasks through pretrained LMs and deep reinforcement learning.
During inference, we introduce a new generation procedure with a critical sampling strategy.
For the model backbones, we extended the encoder-decoder architecture of CodeT5 with enhanced learning objectives.
arXiv Detail & Related papers (2022-07-05T02:42:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.