Revisiting Evolutionary Program Repair via Code Language Model
- URL: http://arxiv.org/abs/2408.10486v2
- Date: Mon, 26 Aug 2024 08:01:49 GMT
- Title: Revisiting Evolutionary Program Repair via Code Language Model
- Authors: Yunan Wang, Tingyu Guo, Zilong Huang, Yuan Yuan,
- Abstract summary: This paper introduces ARJA-CLM, which integrates the multiobjective evolutionary algorithm with CLM to fix multilocation bugs in Java projects.
We also propose a context-aware prompt construction stratege, which enriches the prompt with additional information about accessible fields and methods for the CLM generating candidate statements.
- Score: 11.711739409758476
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Software defects are an inherent part of software development and maintenance. To address these defects, Automated Program Repair (APR) has been developed to fix bugs automatically. With the advent of Large Language Models, Code Language Models (CLMs) trained on code corpora excels in code generation, making them suitable for APR applications. Despite this progress, a significant limitation remains: many bugs necessitate multi-point edits for repair, yet current CLM-based APRs are restricted to single-point bug fixes, which severely narrows the scope of repairable bugs. Moreover, these tools typically only consider the direct context of the buggy line when building prompts for the CLM, leading to suboptimal repair outcomes due to the limited information provided. This paper introduces a novel approach, ARJA-CLM, which integrates the multiobjective evolutionary algorithm with CLM to fix multilocation bugs in Java projects. We also propose a context-aware prompt construction stratege, which enriches the prompt with additional information about accessible fields and methods for the CLM generating candidate statements. Our experiments on the Defects4J and APR-2024 competition benchmark demonstrate that ARJA-CLM surpasses many state-of-the-art repair systems, and performs well on multi-point bugs. The results also reveal that CLMs effectively utilize the provided field and method information within context-aware prompts to produce candidate statements.
Related papers
- FastFixer: An Efficient and Effective Approach for Repairing Programming Assignments [21.848112758958543]
We propose FastFixer, an efficient and effective approach for programming assignment repair.
We first propose a novel repair-oriented fine-tuning strategy, aiming to enhance the LLM's attention towards learning how to generate the necessary patch and its associated context.
Considering the repair efficiency, FastFixer achieves a remarkable inference speedup of 16.67 times compared to the autoregressive decoding algorithm.
arXiv Detail & Related papers (2024-10-11T10:17:02Z) - What's Wrong with Your Code Generated by Large Language Models? An Extensive Study [80.18342600996601]
Large language models (LLMs) produce code that is shorter yet more complicated as compared to canonical solutions.
We develop a taxonomy of bugs for incorrect codes that includes three categories and 12 sub-categories, and analyze the root cause for common bug types.
We propose a novel training-free iterative method that introduces self-critique, enabling LLMs to critique and correct their generated code based on bug types and compiler feedback.
arXiv Detail & Related papers (2024-07-08T17:27:17Z) - Agent-Driven Automatic Software Improvement [55.2480439325792]
This research proposal aims to explore innovative solutions by focusing on the deployment of agents powered by Large Language Models (LLMs)
The iterative nature of agents, which allows for continuous learning and adaptation, can help surpass common challenges in code generation.
We aim to use the iterative feedback in these systems to further fine-tune the LLMs underlying the agents, becoming better aligned to the task of automated software improvement.
arXiv Detail & Related papers (2024-06-24T15:45:22Z) - A Deep Dive into Large Language Models for Automated Bug Localization and Repair [12.756202755547024]
Large language models (LLMs) have shown impressive effectiveness in various software engineering tasks, including automated program repair (APR)
In this study, we take a deep dive into automated bug fixing utilizing LLMs.
This methodological separation of bug localization and fixing using different LLMs enables effective integration of diverse contextual information.
Toggle achieves the new state-of-the-art (SOTA) performance on the CodeXGLUE code refinement benchmark.
arXiv Detail & Related papers (2024-04-17T17:48:18Z) - An Empirical Evaluation of Pre-trained Large Language Models for Repairing Declarative Formal Specifications [5.395614997568524]
This paper presents a systematic investigation into the capacity of Large Language Models (LLMs) for repairing declarative specifications in Alloy.
We propose a novel repair pipeline that integrates a dual-agent LLM framework, comprising a Repair Agent and a Prompt Agent.
Our study reveals that LLMs, particularly GPT-4 variants, outperform existing techniques in terms of repair efficacy, albeit with a marginal increase in runtime and token usage.
arXiv Detail & Related papers (2024-04-17T03:46:38Z) - CodeEditorBench: Evaluating Code Editing Capability of Large Language Models [49.387195629660994]
Large Language Models (LLMs) for code are rapidly evolving, with code editing emerging as a critical capability.
We introduce CodeEditorBench, an evaluation framework designed to rigorously assess the performance of LLMs in code editing tasks.
We curate diverse coding challenges and scenarios from five sources, covering various programming languages, complexity levels, and editing tasks.
arXiv Detail & Related papers (2024-04-04T15:49:49Z) - Peer-aided Repairer: Empowering Large Language Models to Repair Advanced Student Assignments [26.236420215606238]
We develop a framework called PaR that is powered by the Large Language Model.
PaR works in three phases: Peer Solution Selection, Multi-Source Prompt Generation, and Program Repair.
The evaluation on Defects4DS and another well-investigated ITSP dataset reveals that PaR achieves a new state-of-the-art performance.
arXiv Detail & Related papers (2024-04-02T09:12:21Z) - A Novel Approach for Automatic Program Repair using Round-Trip
Translation with Large Language Models [50.86686630756207]
Research shows that grammatical mistakes in a sentence can be corrected by translating it to another language and back.
Current generative models for Automatic Program Repair (APR) are pre-trained on source code and fine-tuned for repair.
This paper proposes bypassing the fine-tuning step and using Round-Trip Translation (RTT): translation of code from one programming language to another programming or natural language, and back.
arXiv Detail & Related papers (2024-01-15T22:36:31Z) - Automatically Correcting Large Language Models: Surveying the landscape
of diverse self-correction strategies [104.32199881187607]
Large language models (LLMs) have demonstrated remarkable performance across a wide array of NLP tasks.
A promising approach to rectify these flaws is self-correction, where the LLM itself is prompted or guided to fix problems in its own output.
This paper presents a comprehensive review of this emerging class of techniques.
arXiv Detail & Related papers (2023-08-06T18:38:52Z) - Editing Large Language Models: Problems, Methods, and Opportunities [51.903537096207]
This paper embarks on a deep exploration of the problems, methods, and opportunities related to model editing for LLMs.
We provide an exhaustive overview of the task definition and challenges associated with model editing, along with an in-depth empirical analysis of the most progressive methods currently at our disposal.
Our objective is to provide valuable insights into the effectiveness and feasibility of each editing technique, thereby assisting the community in making informed decisions on the selection of the most appropriate method for a specific task or context.
arXiv Detail & Related papers (2023-05-22T16:00:00Z) - Conversational Automated Program Repair [10.071615423169902]
We propose a new paradigm for program repair that alternates between patch generation and validation in a conversational manner.
We leverage the long-term context window of Large Pre-Trained Language Models to not only avoid generating previously incorrect patches but also incorporate validation feedback to help the model understand the semantic meaning of the program under test.
arXiv Detail & Related papers (2023-01-30T19:22:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.