GAMMA: Revisiting Template-based Automated Program Repair via Mask Prediction
- URL: http://arxiv.org/abs/2309.09308v1
- Date: Sun, 17 Sep 2023 15:49:40 GMT
- Title: GAMMA: Revisiting Template-based Automated Program Repair via Mask Prediction
- Authors: Quanjun Zhang, Chunrong Fang, Tongke Zhang, Bowen Yu, Weisong Sun,
Zhenyu Chen
- Abstract summary: Inappropriate donor code may cause plausible but incorrect patch generation even with correct fix patterns.
In this paper, we propose GAMMA, to directly leverage large pre-trained language models for donor code generation.
Results demonstrate that GAMMA correctly repairs 82 bugs on Defects4J-v1.2, a 20.59% (14 bugs) and 26.15% (17 bugs) improvement over the previous state-of-the-art template-based approach TBar and the learning-based approach Recoder, respectively.
- Score: 14.741742268621403
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automated program repair (APR) aims to fix software bugs without
human intervention, and template-based APR has been widely investigated with
promising results. However, it is challenging for template-based APR to select
the appropriate donor code, which is an important repair ingredient for
generating candidate patches. Inappropriate donor code may cause plausible but
incorrect patch generation even with correct fix patterns, limiting the repair
performance.
In this paper, we revisit template-based APR and propose GAMMA, which directly
leverages large pre-trained language models for donor code generation. Our
main insight is that instead of retrieving donor code from the local buggy
file, we can directly predict the correct code tokens from the context code
snippets and repair patterns via a cloze task. Specifically, (1) GAMMA revises
a variety of fix templates from state-of-the-art template-based APR techniques
(i.e., TBar) and transforms them into mask patterns; (2) GAMMA adopts a
pre-trained language model to predict the correct code for the masked
positions as a fill-in-the-blank task. The experimental results demonstrate
that GAMMA correctly repairs 82 bugs on Defects4J-v1.2, a 20.59% (14 bugs) and
26.15% (17 bugs) improvement over the previous state-of-the-art template-based
approach TBar and the learning-based approach Recoder, respectively.
Furthermore, GAMMA repairs 45 and 22 bugs on the additional Defects4J-v2.0 and
QuixBugs benchmarks, indicating its generalizability and resistance to the
dataset-overfitting issue. We also show that adopting other pre-trained
language models provides substantial benefits, e.g., CodeBERT-based and
ChatGPT-based GAMMA fix 80 and 67 bugs on Defects4J-v1.2, respectively,
indicating the scalability of GAMMA. Overall, our study highlights the
promising future of adopting pre-trained models to generate correct patches on
top of fix patterns.
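As a sketch of this cloze-style repair idea: the checkpoint name
(microsoft/codebert-base-mlm), the masked Java condition, and the use of
HuggingFace's fill-mask pipeline below are illustrative assumptions, not
GAMMA's actual implementation.

```python
# Minimal sketch of template-based repair via mask prediction: a fix template's
# donor-code slot becomes the model's mask token, and the PLM fills the blank.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="microsoft/codebert-base-mlm")

# E.g., a "mutate conditional operator" template applied to a buggy guard:
masked_stmt = "if (index <mask> data.length) { return data[index]; }"

# Top-k predictions instantiate the template into concrete candidate patches,
# which would then be validated against the test suite.
for candidate in fill_mask(masked_stmt, top_k=5):
    print(candidate["token_str"].strip(), round(candidate["score"], 4))
```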
Related papers
- A Novel Approach for Automatic Program Repair using Round-Trip
Translation with Large Language Models [50.86686630756207]
Research shows that grammatical mistakes in a sentence can be corrected by translating it to another language and back.
Current generative models for Automatic Program Repair (APR) are pre-trained on source code and fine-tuned for repair.
This paper proposes bypassing the fine-tuning step and using Round-Trip Translation (RTT): translation of code from one programming language to another programming or natural language, and back.
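A minimal sketch of the round-trip loop follows; the translator is abstracted
as a callable (any code LLM could back it), so nothing here is from the
paper's implementation.

```python
# Sketch of round-trip translation (RTT) for repair. `translate` is a
# placeholder for an LLM-backed translator, not an API from the paper.
from typing import Callable

# (code, source_language, target_language) -> translated code
Translator = Callable[[str, str, str], str]

def round_trip_repair(buggy_code: str, translate: Translator,
                      pivot: str = "Python") -> str:
    """Translate Java -> pivot -> Java; regenerating the code through a pivot
    language tends to normalize away the bug, analogous to fixing grammar by
    round-trip translating a sentence."""
    pivot_code = translate(buggy_code, "Java", pivot)
    return translate(pivot_code, pivot, "Java")
```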
arXiv Detail & Related papers (2024-01-15T22:36:31Z)
- Automated Bug Generation in the era of Large Language Models [6.0770779409377775]
BugFarm transforms arbitrary code into multiple complex bugs.
The authors report a comprehensive evaluation of 435k+ bugs from over 1.9M mutants generated by BugFarm.
arXiv Detail & Related papers (2023-10-03T20:01:51Z)
- RAP-Gen: Retrieval-Augmented Patch Generation with CodeT5 for Automatic
Program Repair [75.40584530380589]
We propose a novel Retrieval-Augmented Patch Generation framework (RAP-Gen), which explicitly leverages relevant fix patterns retrieved from a list of previous bug-fix pairs.
We evaluate RAP-Gen on three benchmarks in two programming languages, including the TFix benchmark in JavaScript, and Code Refinement and Defects4J benchmarks in Java.
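The retrieval-augmented idea can be sketched as below; RAP-Gen itself pairs a
learned retriever with CodeT5, so the lexical similarity measure and prompt
layout here are stand-ins.

```python
# Toy sketch: fetch the most similar previous bug-fix pair and prepend it as a
# fix-pattern guide for the patch generator (stand-in for RAP-Gen's retriever).
import difflib

def retrieve_guide(buggy: str, history: list[tuple[str, str]]) -> tuple[str, str]:
    # Rank past (bug, fix) pairs by lexical similarity to the new buggy code.
    return max(history, key=lambda pair:
               difflib.SequenceMatcher(None, buggy, pair[0]).ratio())

def build_generator_input(buggy: str, history: list[tuple[str, str]]) -> str:
    prev_bug, prev_fix = retrieve_guide(buggy, history)
    # The retrieved fix pattern conditions patch generation alongside the input.
    return f"fix pattern: {prev_bug} => {prev_fix} | buggy code: {buggy}"
```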
arXiv Detail & Related papers (2023-09-12T08:52:56Z)
- Domain Knowledge Matters: Improving Prompts with Fix Templates for
Repairing Python Type Errors [41.87781274165405]
There exist rule-based approaches for automatically repairing Python type errors.
These approaches can generate accurate patches, but they require domain experts to design patch synthesis rules.
In this paper, we present TypeFix, a novel prompt-based approach with fix templates incorporated for repairing Python type errors.
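To picture the template-as-prompt idea: the template and prompt format below
are invented for illustration only; TypeFix mines its fix templates
automatically rather than using hand-written ones like this.

```python
# Invented example: a fix template for a Python type error becomes a masked
# prompt that a code model completes (not TypeFix's real template set).
buggy_line = "total = total + count_str"          # TypeError: int + str
template = "total = total + <mask>(count_str)"    # template: insert a cast

prompt = ("Fix the Python type error by filling <mask>:\n"
          f"buggy: {buggy_line}\n"
          f"fixed: {template}")
# A fill-in model would be expected to propose `int`, yielding
# `total = total + int(count_str)`.
print(prompt)
```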
arXiv Detail & Related papers (2023-06-02T09:42:16Z)
- Teaching Large Language Models to Self-Debug [62.424077000154945]
Large language models (LLMs) have achieved impressive performance on code generation.
We propose Self-Debugging, which teaches a large language model to debug its predicted program via few-shot demonstrations.
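A rough sketch of such a loop follows; the `generate` callable stands in for
an LLM prompted with few-shot debugging demonstrations, and the execution
harness is deliberately simplified.

```python
# Sketch of a self-debugging loop: run the generated program, feed any error
# message back to the model, and ask for a revision.
import subprocess
import sys
import tempfile
from typing import Callable

def self_debug(generate: Callable[[str], str], task: str,
               max_rounds: int = 3) -> str:
    program = generate(task)
    for _ in range(max_rounds):
        with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
            f.write(program)
        run = subprocess.run([sys.executable, f.name],
                             capture_output=True, text=True, timeout=30)
        if run.returncode == 0:
            return program  # program executed cleanly; accept it
        # Feed the failure back so the model can explain and revise its code.
        program = generate(f"{task}\nPrevious attempt:\n{program}\n"
                           f"Error:\n{run.stderr}")
    return program
```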
arXiv Detail & Related papers (2023-04-11T10:43:43Z)
- BigIssue: A Realistic Bug Localization Benchmark [89.8240118116093]
BigIssue is a benchmark for realistic bug localization.
We provide a general benchmark with a diversity of real and synthetic Java bugs.
We hope to advance the state of the art in bug localization, in turn improving APR performance and increasing its applicability to the modern development cycle.
arXiv Detail & Related papers (2022-07-21T20:17:53Z)
- Memory-Based Model Editing at Scale [102.28475739907498]
Existing model editors struggle to accurately model an edit's intended scope.
We propose Semi-Parametric Editing with a Retrieval-Augmented Counterfactual Model (SERAC).
SERAC stores edits in an explicit memory and learns to reason over them to modulate the base model's predictions as needed.
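In outline, the mechanism looks like the sketch below; the threshold-based
scope check is a toy stand-in for the learned scope classifier in the paper.

```python
# Minimal sketch of memory-based editing: edits live in an explicit memory; a
# scope check routes in-scope queries to a counterfactual model and everything
# else to the frozen base model.
import difflib
from typing import Callable

class MemoryEditor:
    def __init__(self, base: Callable[[str], str],
                 counterfactual: Callable[[str, str, str], str],
                 threshold: float = 0.8):
        self.base = base                       # frozen base model
        self.counterfactual = counterfactual   # reasons over (edit, label, query)
        self.memory: list[tuple[str, str]] = []
        self.threshold = threshold

    def add_edit(self, edit_input: str, edit_label: str) -> None:
        self.memory.append((edit_input, edit_label))  # no base-model update

    def __call__(self, query: str) -> str:
        for edit_input, edit_label in self.memory:
            sim = difflib.SequenceMatcher(None, query, edit_input).ratio()
            if sim >= self.threshold:          # query falls in the edit's scope
                return self.counterfactual(edit_input, edit_label, query)
        return self.base(query)                # out of scope: base model answers
```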
arXiv Detail & Related papers (2022-06-13T23:40:34Z)
- DapStep: Deep Assignee Prediction for Stack Trace Error rePresentation [61.99379022383108]
We propose new deep learning models to solve the bug triage problem.
The models are based on a bidirectional recurrent neural network with attention and on a convolutional neural network.
To improve the quality of ranking, we propose using additional information from version control system annotations.
arXiv Detail & Related papers (2022-01-14T00:16:57Z)
- Generating Bug-Fixes Using Pretrained Transformers [11.012132897417592]
We introduce a data-driven program repair approach which learns to detect and fix bugs in Java methods mined from real-world GitHub repositories.
We show that pretraining on source code programs improves the number of patches found by 33% as compared to supervised training from scratch.
We refine the standard accuracy evaluation metric into non-deletion and deletion-only fixes, and show that our best model generates 75% more non-deletion fixes than the previous state of the art.
arXiv Detail & Related papers (2021-04-16T05:27:04Z)
- CURE: Code-Aware Neural Machine Translation for Automatic Program Repair [11.556110575946631]
We propose CURE, a new NMT-based APR technique with three major novelties.
CURE pre-trains a programming language (PL) model on a large software codebase to learn developer-like source code before the APR task.
Second, CURE designs a new code-aware search strategy that finds more correct fixes by focusing on compilable patches and patches that are close in length to the buggy code.
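That search strategy can be sketched as a candidate filter; CURE targets Java,
so Python's built-in compile() is used below purely as a stand-in validity
check.

```python
# Sketch of a code-aware filter: keep beam candidates that compile and whose
# length stays close to the buggy code's length.
def code_aware_filter(candidates: list[str], buggy: str,
                      max_len_ratio: float = 1.5) -> list[str]:
    kept = []
    for patch in candidates:
        try:
            compile(patch, "<patch>", "exec")  # drop uncompilable candidates
        except SyntaxError:
            continue
        ratio = len(patch) / max(len(buggy), 1)
        if 1 / max_len_ratio <= ratio <= max_len_ratio:
            kept.append(patch)                 # keep length-similar patches
    return kept
```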
arXiv Detail & Related papers (2021-02-26T22:30:28Z)