Towards Practical and Useful Automated Program Repair for Debugging
- URL: http://arxiv.org/abs/2407.08958v1
- Date: Fri, 12 Jul 2024 03:19:54 GMT
- Title: Towards Practical and Useful Automated Program Repair for Debugging
- Authors: Qi Xin, Haojun Wu, Steven P. Reiss, Jifeng Xuan,
- Abstract summary: PracAPR is an interactive repair system that works in an Integrated Development Environment (IDE)
PracAPR does not require a test suite or program re-execution.
- Score: 4.216808129651161
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Current automated program repair (APR) techniques are far from being practical and useful enough to be considered for realistic debugging. They rely on unrealistic assumptions including the requirement of a comprehensive suite of test cases as the correctness criterion and frequent program re-execution for patch validation; they are not fast; and their ability of repairing the commonly arising complex bugs by fixing multiple locations of the program is very limited. We hope to substantially improve APR's practicality, effectiveness, and usefulness to help people debug. Towards this goal, we envision PracAPR, an interactive repair system that works in an Integrated Development Environment (IDE) to provide effective repair suggestions for debugging. PracAPR does not require a test suite or program re-execution. It assumes that the developer uses an IDE debugger and the program has suspended at a location where a problem is observed. It interacts with the developer to obtain a problem specification. Based on the specification, it performs test-free, flow-analysis-based fault localization, patch generation that combines large language model-based local repair and tailored strategy-driven global repair, and program re-execution-free patch validation based on simulated trace comparison to suggest repairs. By having PracAPR, we hope to take a significant step towards making APR useful and an everyday part of debugging.
Related papers
- The Impact of Program Reduction on Automated Program Repair [0.3277163122167433]
We describe a program repair approach that aims to improve the scalability of modern APR tools.
We investigate slicing's impact on all three phases of the repair process: fault localization, patch generation, and patch validation.
We conclude that program reduction can improve the performance of APR without degrading repair quality, but this improvement is not universal.
arXiv Detail & Related papers (2024-08-02T09:23:45Z) - On The Effectiveness of Dynamic Reduction Techniques in Automated Program Repair [1.7767466724342067]
We describe a program repair framework that effectively handles large-scale buggy programs of industrial complexity.
The framework exploits program reduction in the form of program slicing to eliminate parts of the code irrelevant to the bug being repaired.
Our empirical results on the widely used Defects4J dataset reveal that a substantial improvement in performance can be obtained without any degradation in repair quality.
arXiv Detail & Related papers (2024-06-23T21:35:07Z) - Investigating the Transferability of Code Repair for Low-Resource Programming Languages [57.62712191540067]
Large language models (LLMs) have shown remarkable performance on code generation tasks.
Recent works augment the code repair process by integrating modern techniques such as chain-of-thought reasoning or distillation.
We investigate the benefits of distilling code repair for both high and low resource languages.
arXiv Detail & Related papers (2024-06-21T05:05:39Z) - VDebugger: Harnessing Execution Feedback for Debugging Visual Programs [103.61860743476933]
We introduce V Debugger, a critic-refiner framework trained to localize and debug visual programs by tracking execution step by step.
V Debugger identifies and corrects program errors leveraging detailed execution feedback, improving interpretability and accuracy.
Evaluations on six datasets demonstrate V Debugger's effectiveness, showing performance improvements of up to 3.2% in downstream task accuracy.
arXiv Detail & Related papers (2024-06-19T11:09:16Z) - NExT: Teaching Large Language Models to Reason about Code Execution [50.93581376646064]
Large language models (LLMs) of code are typically trained on the surface textual form of programs.
We propose NExT, a method to teach LLMs to inspect the execution traces of programs and reason about their run-time behavior.
arXiv Detail & Related papers (2024-04-23T01:46:32Z) - ContrastRepair: Enhancing Conversation-Based Automated Program Repair
via Contrastive Test Case Pairs [23.419180504723546]
ContrastRepair is a novel APR approach that augments conversation-driven APR by providing contrastive test pairs.
We evaluate ContrastRepair on multiple benchmark datasets, including Defects4j, QuixBugs, and HumanEval-Java.
arXiv Detail & Related papers (2024-03-04T12:15:28Z) - A Novel Approach for Automatic Program Repair using Round-Trip
Translation with Large Language Models [50.86686630756207]
Research shows that grammatical mistakes in a sentence can be corrected by translating it to another language and back.
Current generative models for Automatic Program Repair (APR) are pre-trained on source code and fine-tuned for repair.
This paper proposes bypassing the fine-tuning step and using Round-Trip Translation (RTT): translation of code from one programming language to another programming or natural language, and back.
arXiv Detail & Related papers (2024-01-15T22:36:31Z) - Practical Program Repair via Preference-based Ensemble Strategy [28.176710503313895]
We propose a Preference-based Ensemble Program Repair framework (P-EPR) to rank APR tools for repairing different bugs.
P-EPR is the first non-learning-based APR ensemble method that is novel in its exploitation of repair patterns.
Experimental results show that P-EPR outperforms existing strategies significantly both in flexibility and effectiveness.
arXiv Detail & Related papers (2023-09-15T07:23:04Z) - RAP-Gen: Retrieval-Augmented Patch Generation with CodeT5 for Automatic
Program Repair [75.40584530380589]
We propose a novel Retrieval-Augmented Patch Generation framework (RAP-Gen)
RAP-Gen explicitly leveraging relevant fix patterns retrieved from a list of previous bug-fix pairs.
We evaluate RAP-Gen on three benchmarks in two programming languages, including the TFix benchmark in JavaScript, and Code Refinement and Defects4J benchmarks in Java.
arXiv Detail & Related papers (2023-09-12T08:52:56Z) - BigIssue: A Realistic Bug Localization Benchmark [89.8240118116093]
BigIssue is a benchmark for realistic bug localization.
We provide a general benchmark with a diversity of real and synthetic Java bugs.
We hope to advance the state of the art in bug localization, in turn improving APR performance and increasing its applicability to the modern development cycle.
arXiv Detail & Related papers (2022-07-21T20:17:53Z) - FixEval: Execution-based Evaluation of Program Fixes for Programming
Problems [23.987104440395576]
We introduce FixEval, a benchmark comprising of buggy code submissions to competitive programming problems and their corresponding fixes.
FixEval offers an extensive collection of unit tests to evaluate the correctness of model-generated program fixes.
Our experiments show that match-based metrics do not reflect model-generated program fixes accurately.
arXiv Detail & Related papers (2022-06-15T20:18:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.