STEAM: Simulating the InTeractive BEhavior of ProgrAMmers for Automatic Bug Fixing
- URL: http://arxiv.org/abs/2308.14460v1
- Date: Mon, 28 Aug 2023 09:56:12 GMT
- Title: STEAM: Simulating the InTeractive BEhavior of ProgrAMmers for Automatic Bug Fixing
- Authors: Yuwei Zhang and Zhi Jin and Ying Xing and Ge Li
- Abstract summary: We introduce a novel stage-wise framework named STEAM to simulate the collaborative nature of bug resolution.
We decompose the bug fixing task into four distinct stages: bug reporting, bug diagnosis, patch generation, and patch verification.
Our evaluation on the widely adopted bug-fixing benchmark demonstrates that STEAM achieves new state-of-the-art bug-fixing performance.
- Score: 37.70518599085676
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bug fixing holds significant importance in software development and
maintenance. Recent research has made notable progress in exploring the
potential of large language models (LLMs) for automatic bug fixing. However,
existing studies often overlook the collaborative nature of bug resolution,
treating it as a single-stage process. To overcome this limitation, we
introduce a novel stage-wise framework named STEAM in this paper. The objective
of STEAM is to simulate the interactive behavior of multiple programmers
involved in various stages across the bug's life cycle. Taking inspiration from
bug management practices, we decompose the bug fixing task into four distinct
stages: bug reporting, bug diagnosis, patch generation, and patch verification.
These stages are performed interactively by LLMs, aiming to imitate the
collaborative abilities of programmers during the resolution of software bugs.
By harnessing the collective contribution, STEAM effectively enhances the
bug-fixing capabilities of LLMs. We implement STEAM by employing the powerful
dialogue-based LLM -- ChatGPT. Our evaluation on the widely adopted bug-fixing
benchmark demonstrates that STEAM achieves new state-of-the-art bug-fixing
performance.
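The four-stage workflow described in the abstract can be sketched as a minimal pipeline in which each simulated programmer role consumes the previous stage's output. This is an illustrative assumption, not the authors' implementation: the `query_llm` stub, the `BugContext` structure, and the role names stand in for real chat-model calls with role-specific prompts.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

@dataclass
class BugContext:
    """Shared state passed between the simulated programmer roles."""
    buggy_code: str
    artifacts: Dict[str, str] = field(default_factory=dict)

def query_llm(role: str, prompt: str) -> str:
    # Stand-in for a dialogue-based LLM call; a real system would send
    # `prompt` to a chat model with a role-specific system message.
    return f"[{role}] {prompt.splitlines()[0]}"

def bug_reporting(ctx: BugContext) -> None:
    ctx.artifacts["report"] = query_llm(
        "reporter", f"Describe the failure observed in:\n{ctx.buggy_code}")

def bug_diagnosis(ctx: BugContext) -> None:
    ctx.artifacts["diagnosis"] = query_llm(
        "developer", f"Locate the fault given this report:\n{ctx.artifacts['report']}")

def patch_generation(ctx: BugContext) -> None:
    ctx.artifacts["patch"] = query_llm(
        "developer", f"Write a fix given this diagnosis:\n{ctx.artifacts['diagnosis']}")

def patch_verification(ctx: BugContext) -> None:
    ctx.artifacts["verdict"] = query_llm(
        "tester", f"Check the patch against the report:\n{ctx.artifacts['patch']}")

# The four stages from the paper's decomposition, run in order.
STAGES: List[Tuple[str, Callable[[BugContext], None]]] = [
    ("bug reporting", bug_reporting),
    ("bug diagnosis", bug_diagnosis),
    ("patch generation", patch_generation),
    ("patch verification", patch_verification),
]

def run_steam(buggy_code: str) -> BugContext:
    ctx = BugContext(buggy_code=buggy_code)
    for _, stage in STAGES:
        stage(ctx)  # each stage builds on the artifacts of the previous one
    return ctx
```

For example, `run_steam("def add(a, b): return a - b")` returns a context whose `artifacts` dictionary holds one entry per stage, mirroring how the interacting roles accumulate a bug report, a diagnosis, a candidate patch, and a verification verdict.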
Related papers
- PATCH: Empowering Large Language Model with Programmer-Intent Guidance and Collaborative-Behavior Simulation for Automatic Bug Fixing [34.768989900184636]
Bug fixing holds significant importance in software development and maintenance.
Recent research has made substantial strides in exploring the potential of large language models (LLMs) for automatically resolving software bugs.
arXiv Detail & Related papers (2025-01-27T15:43:04Z)
- LLMs as Continuous Learners: Improving the Reproduction of Defective Code in Software Issues [62.12404317786005]
EvoCoder is a continuous learning framework for issue code reproduction.
Our results show a 20% improvement in issue reproduction rates over existing SOTA methods.
arXiv Detail & Related papers (2024-11-21T08:49:23Z)
- MarsCode Agent: AI-native Automated Bug Fixing [7.909344108948294]
We introduce MarsCode Agent, a novel framework that leverages large language models to automatically identify and repair bugs in software code.
Our approach follows a systematic process of planning, bug reproduction, fault localization, candidate patch generation, and validation to ensure high-quality bug fixes.
Our results show that MarsCode Agent achieves a high success rate in bug fixing compared to most of the existing automated approaches.
arXiv Detail & Related papers (2024-09-02T02:24:38Z)
- A Deep Dive into Large Language Models for Automated Bug Localization and Repair [12.756202755547024]
Large language models (LLMs) have shown impressive effectiveness in various software engineering tasks, including automated program repair (APR).
In this study, we take a deep dive into automated bug fixing utilizing LLMs.
This methodological separation of bug localization and fixing, using a different LLM for each, enables effective integration of diverse contextual information.
Toggle achieves the new state-of-the-art (SOTA) performance on the CodeXGLUE code refinement benchmark.
arXiv Detail & Related papers (2024-04-17T17:48:18Z)
- DebugBench: Evaluating Debugging Capability of Large Language Models [80.73121177868357]
DebugBench is a benchmark for Large Language Models (LLMs).
It covers four major bug categories and 18 minor types in C++, Java, and Python.
We evaluate two commercial and four open-source models in a zero-shot scenario.
arXiv Detail & Related papers (2024-01-09T15:46:38Z)
- Automated Bug Generation in the era of Large Language Models [6.0770779409377775]
BugFarm transforms arbitrary code into multiple complex bugs.
We conduct a comprehensive evaluation on 435k+ bugs from over 1.9M mutants generated by BugFarm.
arXiv Detail & Related papers (2023-10-03T20:01:51Z)
- Using Developer Discussions to Guide Fixing Bugs in Software [51.00904399653609]
We propose using bug report discussions, which are available before the task is performed and are also naturally occurring, avoiding the need for additional information from developers.
We demonstrate that various forms of natural language context derived from such discussions can aid bug-fixing, even leading to improved performance over using commit messages corresponding to the oracle bug-fixing commits.
arXiv Detail & Related papers (2022-11-11T16:37:33Z)
- ADPTriage: Approximate Dynamic Programming for Bug Triage [0.0]
We develop a Markov decision process (MDP) model for an online bug triage task.
We provide an ADP-based bug triage solution, called ADPTriage, which reflects downstream uncertainty in the bug arrivals and developers' timetables.
Our result shows a significant improvement over the myopic approach in terms of assignment accuracy and fixing time.
arXiv Detail & Related papers (2022-11-02T04:42:21Z)
- Off-Beat Multi-Agent Reinforcement Learning [62.833358249873704]
We investigate model-free multi-agent reinforcement learning (MARL) in environments where off-beat actions are prevalent.
We propose a novel episodic memory, LeGEM, for model-free MARL algorithms.
We evaluate LeGEM on various multi-agent scenarios with off-beat actions, including Stag-Hunter Game, Quarry Game, Afforestation Game, and StarCraft II micromanagement tasks.
arXiv Detail & Related papers (2022-05-27T02:21:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.