Adversarial Patch Generation for Automated Program Repair
- URL: http://arxiv.org/abs/2012.11060v4
- Date: Sun, 3 Sep 2023 23:33:52 GMT
- Title: Adversarial Patch Generation for Automated Program Repair
- Authors: Abdulaziz Alhefdhi (1 and 2), Hoa Khanh Dam (1), Thanh Le-Cong (3),
Bach Le (3), Aditya Ghose (1) ((1) University of Wollongong, (2) Prince
Sattam bin Abdulaziz University, (3) The University of Melbourne)
- Abstract summary: NEVERMORE is a novel learning-based mechanism inspired by the adversarial nature of bugs and fixes.
NEVERMORE is built upon the Generative Adrial Networks architecture and trained on historical bug fixes to generate repairs that closely mimic human-produced fixes.
Our empirical evaluation on 500 real-world bugs demonstrates the effectiveness of NEVERMORE in bug-fixing, generating repairs that match human fixes for 21.2% of the examined bugs.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automated Program Repair has attracted significant research in recent years,
leading to diverse techniques that focus on two main directions: search-based
and semantic-based program repair. The former techniques often face challenges
due to the vast search space, resulting in difficulties in identifying correct
solutions, while the latter approaches are constrained by the capabilities of
the underlying semantic analyser, limiting their scalability. In this paper, we
propose NEVERMORE, a novel learning-based mechanism inspired by the adversarial
nature of bugs and fixes. NEVERMORE is built upon the Generative Adversarial
Networks architecture and trained on historical bug fixes to generate repairs
that closely mimic human-produced fixes. Our empirical evaluation on 500
real-world bugs demonstrates the effectiveness of NEVERMORE in bug-fixing,
generating repairs that match human fixes for 21.2% of the examined bugs.
Moreover, we evaluate NEVERMORE on the Defects4J dataset, where our approach
generates repairs for 4 bugs that remained unresolved by state-of-the-art
baselines. NEVERMORE also fixes another 8 bugs which were only resolved by a
subset of these baselines. Finally, we conduct an in-depth analysis of the
impact of input and training styles on NEVERMORE's performance, revealing where
the chosen style influences the model's bug-fixing capabilities.
Related papers
- ThinkRepair: Self-Directed Automated Program Repair [11.598008952093487]
Large language models (LLMs) instructed by prompt engineering have attracted much attention for their powerful ability to address many kinds of tasks including bug-fixing.
We propose a self-directed LLM-based automated program repair, ThinkRepair, with two main phases: collection phase and fixing phase.
Evaluations on two widely studied datasets (Defects4J and QuixBugs) by comparing ThinkRepair with 12 SOTA APRs indicate the priority of ThinkRepair in fixing bugs.
arXiv Detail & Related papers (2024-07-30T15:17:07Z) - Leveraging Large Language Models for Efficient Failure Analysis in Game Development [47.618236610219554]
This paper proposes a new approach to automatically identify which change in the code caused a test to fail.
The method leverages Large Language Models (LLMs) to associate error messages with the corresponding code changes causing the failure.
Our approach reaches an accuracy of 71% in our newly created dataset, which comprises issues reported by developers at EA over a period of one year.
arXiv Detail & Related papers (2024-06-11T09:21:50Z) - Hybrid Automated Program Repair by Combining Large Language Models and Program Analysis [12.7034916462208]
Automated Program Repair (APR) has garnered significant attention due to its potential to streamline the bug repair process for human developers.
This paper introduces an innovative APR approach called GIANTREPAIR.
Based on this insight, GIANTREPAIR first constructs patch skeletons from LLM-generated patches to confine the patch space, and then generates high-quality patches tailored to specific programs.
arXiv Detail & Related papers (2024-06-03T05:05:12Z) - A Deep Dive into Large Language Models for Automated Bug Localization and Repair [12.756202755547024]
Large language models (LLMs) have shown impressive effectiveness in various software engineering tasks, including automated program repair (APR)
In this study, we take a deep dive into automated bug fixing utilizing LLMs.
This methodological separation of bug localization and fixing using different LLMs enables effective integration of diverse contextual information.
Toggle achieves the new state-of-the-art (SOTA) performance on the CodeXGLUE code refinement benchmark.
arXiv Detail & Related papers (2024-04-17T17:48:18Z) - Enhancing Redundancy-based Automated Program Repair by Fine-grained
Pattern Mining [18.3896381051331]
We propose a new repair technique named Repatt, which incorporates a two-level pattern mining process for guiding effective patch generation.
We have conducted an experiment on the widely-used Defects4J benchmark and compared Repatt with eight state-of-the-art APR approaches.
arXiv Detail & Related papers (2023-12-26T08:42:32Z) - Automated Bug Generation in the era of Large Language Models [6.0770779409377775]
BugFarm transforms arbitrary code into multiple complex bugs.
A comprehensive evaluation of 435k+ bugs from over 1.9M mutants generated by BUGFARM.
arXiv Detail & Related papers (2023-10-03T20:01:51Z) - RAP-Gen: Retrieval-Augmented Patch Generation with CodeT5 for Automatic
Program Repair [75.40584530380589]
We propose a novel Retrieval-Augmented Patch Generation framework (RAP-Gen)
RAP-Gen explicitly leveraging relevant fix patterns retrieved from a list of previous bug-fix pairs.
We evaluate RAP-Gen on three benchmarks in two programming languages, including the TFix benchmark in JavaScript, and Code Refinement and Defects4J benchmarks in Java.
arXiv Detail & Related papers (2023-09-12T08:52:56Z) - Is Self-Repair a Silver Bullet for Code Generation? [68.02601393906083]
Large language models have shown remarkable aptitude in code generation, but still struggle to perform complex tasks.
Self-repair -- in which the model debugs and repairs its own code -- has recently become a popular way to boost performance.
We analyze Code Llama, GPT-3.5 and GPT-4's ability to perform self-repair on problems taken from HumanEval and APPS.
arXiv Detail & Related papers (2023-06-16T15:13:17Z) - BigIssue: A Realistic Bug Localization Benchmark [89.8240118116093]
BigIssue is a benchmark for realistic bug localization.
We provide a general benchmark with a diversity of real and synthetic Java bugs.
We hope to advance the state of the art in bug localization, in turn improving APR performance and increasing its applicability to the modern development cycle.
arXiv Detail & Related papers (2022-07-21T20:17:53Z) - DapStep: Deep Assignee Prediction for Stack Trace Error rePresentation [61.99379022383108]
We propose new deep learning models to solve the bug triage problem.
The models are based on a bidirectional recurrent neural network with attention and on a convolutional neural network.
To improve the quality of ranking, we propose using additional information from version control system annotations.
arXiv Detail & Related papers (2022-01-14T00:16:57Z) - Graph-based, Self-Supervised Program Repair from Diagnostic Feedback [108.48853808418725]
We introduce a program-feedback graph, which connects symbols relevant to program repair in source code and diagnostic feedback.
We then apply a graph neural network on top to model the reasoning process.
We present a self-supervised learning paradigm for program repair that leverages unlabeled programs available online.
arXiv Detail & Related papers (2020-05-20T07:24:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.