Repair Is Nearly Generation: Multilingual Program Repair with LLMs
- URL: http://arxiv.org/abs/2208.11640v1
- Date: Wed, 24 Aug 2022 16:25:58 GMT
- Title: Repair Is Nearly Generation: Multilingual Program Repair with LLMs
- Authors: Harshit Joshi, José Cambronero, Sumit Gulwani, Vu Le, Ivan Radicek,
Gust Verbruggen
- Abstract summary: We introduce RING, a multilingual repair engine powered by a large language model trained on code (LLMC) such as Codex.
Taking inspiration from the way programmers manually fix bugs, we show that a prompt-based strategy that conceptualizes repair as localization, transformation, and candidate ranking can successfully repair programs in multiple domains with minimal effort.
- Score: 9.610685299268825
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Most programmers make mistakes when writing code. Some of these mistakes are
small and require few edits to the original program - a class of errors
recently termed last mile mistakes. These errors break the flow for experienced
developers and can stump novice programmers. Existing automated repair
techniques targeting this class of errors are domain-specific and do not easily
carry over to new domains. Transferring symbolic approaches requires
substantial engineering and neural approaches require data and retraining. We
introduce RING, a multilingual repair engine powered by a large language model
trained on code (LLMC) such as Codex. Such a multilingual engine enables a
flipped model for programming assistance, one where the programmer writes code
and the AI assistance suggests fixes, compared to traditional code suggestion
technology. Taking inspiration from the way programmers manually fix bugs, we
show that a prompt-based strategy that conceptualizes repair as localization,
transformation, and candidate ranking can successfully repair programs in
multiple domains with minimal effort. We present the first results for such a
multilingual repair engine by evaluating on 6 different domains and comparing
performance to domain-specific repair engines. We show that RING can outperform
domain-specific repair engines in 3 of these domains. We also identify
directions for future research using LLMCs for multilingual repair.
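The localization, transformation, and candidate-ranking framing described in the abstract can be illustrated with a minimal sketch. This is not RING's implementation: the function names are hypothetical, the Python parser stands in for whatever error localizer a given domain offers, `propose_fixes` is a hand-written stand-in for the LLMC call, and ranking by textual similarity to the original program is an assumption chosen because last mile mistakes need few edits.

```python
import ast
import difflib
from typing import Callable, List, Optional


def localize(source: str) -> Optional[int]:
    """Return the 1-based line the Python parser blames, or None if it parses."""
    try:
        ast.parse(source)
        return None
    except SyntaxError as err:
        return err.lineno or 1


def propose_fixes(source: str, line_no: int) -> List[str]:
    """Hypothetical stand-in for the model call: try tiny edits on the blamed line."""
    lines = source.split("\n")
    idx = min(line_no, len(lines)) - 1  # clamp in case the parser points past the end
    bad = lines[idx]
    edits = [bad + ":", bad + ")", bad.replace("=", "==", 1)]
    return ["\n".join(lines[:idx] + [edit] + lines[idx + 1:]) for edit in edits]


def rank(original: str, candidates: List[str]) -> List[str]:
    """Keep candidates that parse and order them by similarity to the original."""
    valid = [c for c in candidates if localize(c) is None]
    return sorted(
        valid,
        key=lambda c: difflib.SequenceMatcher(None, original, c).ratio(),
        reverse=True,
    )


def repair(
    source: str,
    generate: Callable[[str, int], List[str]] = propose_fixes,
) -> Optional[str]:
    """Localize the error, generate candidate fixes, and return the top-ranked one."""
    line_no = localize(source)
    if line_no is None:
        return source  # nothing to repair
    ranked = rank(source, generate(source, line_no))
    return ranked[0] if ranked else None


broken = "def add(a, b)\n    return a + b\n"
fixed = repair(broken)
print(fixed)
```

Swapping `generate` for a function that prompts a code model would leave the localization and ranking stages unchanged, which is the sense in which such a pipeline could carry over to other languages with little per-domain engineering.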
Related papers
- DistiLRR: Transferring Code Repair for Low-Resource Programming Languages [57.62712191540067]
Distilling Low-Resource Repairs (DistiLRR) is an approach that transfers the reasoning and code generation ability from a teacher model to a student model.
Our results show that DistiLRR consistently outperforms baselines on low-resource languages, but has similar performance on high-resource languages.
arXiv Detail & Related papers (2024-06-21T05:05:39Z)
- NExT: Teaching Large Language Models to Reason about Code Execution [50.93581376646064]
Large language models (LLMs) of code are typically trained on the surface textual form of programs.
We propose NExT, a method to teach LLMs to inspect the execution traces of programs and reason about their run-time behavior.
arXiv Detail & Related papers (2024-04-23T01:46:32Z)
- A Deep Dive into Large Language Models for Automated Bug Localization and Repair [12.756202755547024]
Large language models (LLMs) have shown impressive effectiveness in various software engineering tasks, including automated program repair (APR).
In this study, we take a deep dive into automated bug fixing utilizing LLMs.
This methodological separation of bug localization and fixing using different LLMs enables effective integration of diverse contextual information.
Toggle achieves the new state-of-the-art (SOTA) performance on the CodeXGLUE code refinement benchmark.
arXiv Detail & Related papers (2024-04-17T17:48:18Z)
- A Novel Approach for Automatic Program Repair using Round-Trip Translation with Large Language Models [50.86686630756207]
Research shows that grammatical mistakes in a sentence can be corrected by translating it to another language and back.
Current generative models for Automatic Program Repair (APR) are pre-trained on source code and fine-tuned for repair.
This paper proposes bypassing the fine-tuning step and using Round-Trip Translation (RTT): translation of code from one programming language to another programming or natural language, and back.
arXiv Detail & Related papers (2024-01-15T22:36:31Z)
- RepairLLaMA: Efficient Representations and Fine-Tuned Adapters for Program Repair [8.321263361036808]
We propose RepairLLaMA, a novel program repair approach that identifies optimal code representations for APR with fine-tuned models.
This results in a highly effective 'program repair adapter' for fixing bugs with AI.
Overall, RepairLLaMA correctly fixes 144 Defects4J v2 and 109 HumanEval-Java bugs, outperforming all baselines.
arXiv Detail & Related papers (2023-12-25T11:39:46Z)
- RAP-Gen: Retrieval-Augmented Patch Generation with CodeT5 for Automatic Program Repair [75.40584530380589]
We propose a novel Retrieval-Augmented Patch Generation framework (RAP-Gen).
RAP-Gen explicitly leverages relevant fix patterns retrieved from a list of previous bug-fix pairs.
We evaluate RAP-Gen on three benchmarks in two programming languages, including the TFix benchmark in JavaScript, and Code Refinement and Defects4J benchmarks in Java.
arXiv Detail & Related papers (2023-09-12T08:52:56Z)
- Repairing Bugs in Python Assignments Using Large Language Models [9.973714032271708]
We propose to use a large language model trained on code to build an APR system for programming assignments.
Our system can fix both syntactic and semantic mistakes by combining multi-modal prompts, iterative querying, test-case-based selection of few-shots, and program chunking.
We evaluate MMAPR on 286 real student programs and compare to a baseline built by combining a state-of-the-art Python syntax repair engine, BIFI, and state-of-the-art Python semantic repair engine for student assignments, Refactory.
arXiv Detail & Related papers (2022-09-29T15:41:17Z)
- Neurosymbolic Repair for Low-Code Formula Languages [12.986749944196402]
Most users of low-code platforms, such as Excel and PowerApps, write programs in domain-specific formula languages.
We develop LaMirage, a LAst-MIle RepAir-engine GEnerator that combines symbolic and neural techniques to perform last-mile repair.
We compare LaMirage to state-of-the-art neural and symbolic approaches on 400 real Excel and PowerFx formulas.
arXiv Detail & Related papers (2022-07-24T15:56:03Z)
- BigIssue: A Realistic Bug Localization Benchmark [89.8240118116093]
BigIssue is a benchmark for realistic bug localization.
We provide a general benchmark with a diversity of real and synthetic Java bugs.
We hope to advance the state of the art in bug localization, in turn improving APR performance and increasing its applicability to the modern development cycle.
arXiv Detail & Related papers (2022-07-21T20:17:53Z)
- Graph-based, Self-Supervised Program Repair from Diagnostic Feedback [108.48853808418725]
We introduce a program-feedback graph, which connects symbols relevant to program repair in source code and diagnostic feedback.
We then apply a graph neural network on top to model the reasoning process.
We present a self-supervised learning paradigm for program repair that leverages unlabeled programs available online.
arXiv Detail & Related papers (2020-05-20T07:24:28Z)