Related papers: MarsCode Agent: AI-native Automated Bug Fixing

MarsCode Agent: AI-native Automated Bug Fixing

URL: http://arxiv.org/abs/2409.00899v2
Date: Wed, 4 Sep 2024 06:19:08 GMT
Title: MarsCode Agent: AI-native Automated Bug Fixing
Authors: Yizhou Liu, Pengfei Gao, Xinchen Wang, Jie Liu, Yexuan Shi, Zhao Zhang, Chao Peng,
Abstract summary: We introduce MarsCode Agent, a novel framework that leverages large language models to automatically identify and repair bugs in software code. Our approach follows a systematic process of planning, bug reproduction, fault localization, candidate patch generation, and validation to ensure high-quality bug fixes. Our results show that MarsCode Agent achieves a high success rate in bug fixing compared to most of the existing automated approaches.
Score: 7.909344108948294
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Recent advances in large language models (LLMs) have shown significant potential to automate various software development tasks, including code completion, test generation, and bug fixing. However, the application of LLMs for automated bug fixing remains challenging due to the complexity and diversity of real-world software systems. In this paper, we introduce MarsCode Agent, a novel framework that leverages LLMs to automatically identify and repair bugs in software code. MarsCode Agent combines the power of LLMs with advanced code analysis techniques to accurately localize faults and generate patches. Our approach follows a systematic process of planning, bug reproduction, fault localization, candidate patch generation, and validation to ensure high-quality bug fixes. We evaluated MarsCode Agent on SWE-bench, a comprehensive benchmark of real-world software projects, and our results show that MarsCode Agent achieves a high success rate in bug fixing compared to most of the existing automated approaches.

Related papers

Which Agent Causes Task Failures and When? On Automated Failure Attribution of LLM Multi-Agent Systems [50.29939179830491]
Failure attribution in LLM multi-agent systems remains underexplored and labor-intensive. We develop and evaluate three automated failure attribution methods, summarizing their corresponding pros and cons. The best method achieves 53.5% accuracy in identifying failure-responsible agents but only 14.2% in pinpointing failure steps.
arXiv Detail & Related papers (2025-04-30T23:09:44Z)
PATCH: Empowering Large Language Model with Programmer-Intent Guidance and Collaborative-Behavior Simulation for Automatic Bug Fixing [34.768989900184636]
Bug fixing holds significant importance in software development and maintenance. Recent research has made substantial strides in exploring the potential of large language models (LLMs) for automatically resolving software bugs.
arXiv Detail & Related papers (2025-01-27T15:43:04Z)
LLMs as Continuous Learners: Improving the Reproduction of Defective Code in Software Issues [62.12404317786005]
EvoCoder is a continuous learning framework for issue code reproduction. Our results show a 20% improvement in issue reproduction rates over existing SOTA methods.
arXiv Detail & Related papers (2024-11-21T08:49:23Z)
An Empirical Study on LLM-based Agents for Automated Bug Fixing [2.433168823911037]
Large language models (LLMs) and LLM-based Agents have been applied to fix bugs automatically. We examine seven proprietary and open-source systems on the SWE-bench Lite benchmark for automated bug fixing.
arXiv Detail & Related papers (2024-11-15T14:19:15Z)
A Comprehensive Survey of AI-Driven Advancements and Techniques in Automated Program Repair and Code Generation [0.0]
27 recent papers have been reviewed and split into two groups. The first group consists of new methods for bug detection and repair, which include locating semantic errors. The second group dwells on code generation, providing an overview of both general-purpose LLMs fine-tuned for programming and task-specific models. It also presents methods to improve code generation, such as identifier-aware training, fine-tuning at the instruction level, and incorporating semantic code structures.
arXiv Detail & Related papers (2024-11-12T06:47:54Z)
Evaluating Software Development Agents: Patch Patterns, Code Quality, and Issue Complexity in Real-World GitHub Scenarios [13.949319911378826]
This study evaluated 4,892 patches from 10 top-ranked agents on 500 real-world GitHub issues. No single agent dominated, with 170 issues unresolved, indicating room for improvement. Most agents maintained code reliability and security, avoiding new bugs or vulnerabilities. Some agents increased code complexity, many reduced code duplication and minimized code smells.
arXiv Detail & Related papers (2024-10-16T11:33:57Z)
Agent-Driven Automatic Software Improvement [55.2480439325792]
This research proposal aims to explore innovative solutions by focusing on the deployment of agents powered by Large Language Models (LLMs) The iterative nature of agents, which allows for continuous learning and adaptation, can help surpass common challenges in code generation. We aim to use the iterative feedback in these systems to further fine-tune the LLMs underlying the agents, becoming better aligned to the task of automated software improvement.
arXiv Detail & Related papers (2024-06-24T15:45:22Z)
Leveraging Large Language Models for Efficient Failure Analysis in Game Development [47.618236610219554]
This paper proposes a new approach to automatically identify which change in the code caused a test to fail. The method leverages Large Language Models (LLMs) to associate error messages with the corresponding code changes causing the failure. Our approach reaches an accuracy of 71% in our newly created dataset, which comprises issues reported by developers at EA over a period of one year.
arXiv Detail & Related papers (2024-06-11T09:21:50Z)
Automatic Programming: Large Language Models and Beyond [48.34544922560503]
We study concerns around code quality, security and related issues of programmer responsibility. We discuss how advances in software engineering can enable automatic programming. We conclude with a forward looking view, focusing on the programming environment of the near future.
arXiv Detail & Related papers (2024-05-03T16:19:24Z)
A Deep Dive into Large Language Models for Automated Bug Localization and Repair [12.756202755547024]
Large language models (LLMs) have shown impressive effectiveness in various software engineering tasks, including automated program repair (APR) In this study, we take a deep dive into automated bug fixing utilizing LLMs. This methodological separation of bug localization and fixing using different LLMs enables effective integration of diverse contextual information. Toggle achieves the new state-of-the-art (SOTA) performance on the CodeXGLUE code refinement benchmark.
arXiv Detail & Related papers (2024-04-17T17:48:18Z)
A Novel Approach for Automatic Program Repair using Round-Trip Translation with Large Language Models [50.86686630756207]
Research shows that grammatical mistakes in a sentence can be corrected by translating it to another language and back. Current generative models for Automatic Program Repair (APR) are pre-trained on source code and fine-tuned for repair. This paper proposes bypassing the fine-tuning step and using Round-Trip Translation (RTT): translation of code from one programming language to another programming or natural language, and back.
arXiv Detail & Related papers (2024-01-15T22:36:31Z)
BigIssue: A Realistic Bug Localization Benchmark [89.8240118116093]
BigIssue is a benchmark for realistic bug localization. We provide a general benchmark with a diversity of real and synthetic Java bugs. We hope to advance the state of the art in bug localization, in turn improving APR performance and increasing its applicability to the modern development cycle.
arXiv Detail & Related papers (2022-07-21T20:17:53Z)

This list is automatically generated from the titles and abstracts of the papers in this site.