Explaining Software Bugs Leveraging Code Structures in Neural Machine
Translation
- URL: http://arxiv.org/abs/2212.04584v4
- Date: Tue, 25 Jul 2023 05:58:33 GMT
- Title: Explaining Software Bugs Leveraging Code Structures in Neural Machine
Translation
- Authors: Parvez Mahbub, Ohiduzzaman Shuvo, Mohammad Masudur Rahman
- Abstract summary: Bugsplainer generates natural language explanations for software bugs by learning from a large corpus of bug-fix commits.
Our evaluation using three performance metrics shows that Bugsplainer can generate understandable and good explanations according to Google's standard.
We also conduct a developer study involving 20 participants where the explanations from Bugsplainer were found to be more accurate, more precise, more concise and more useful than the baselines.
- Score: 5.079750706023254
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Software bugs claim approximately 50% of development time and cost the global
economy billions of dollars. Once a bug is reported, the assigned developer
attempts to identify and understand the source code responsible for the bug and
then corrects the code. Over the last five decades, there has been significant
research on automatically finding or correcting software bugs. However, there
has been little research on automatically explaining the bugs to the
developers, which is an essential but highly challenging task. In this paper, we
propose Bugsplainer, a transformer-based generative model that generates
natural language explanations for software bugs by learning from a large corpus
of bug-fix commits. Bugsplainer can leverage structural information and buggy
patterns from the source code to generate an explanation for a bug. Our
evaluation using three performance metrics shows that Bugsplainer can generate
understandable and good explanations according to Google's standard, and can
outperform multiple baselines from the literature. We also conduct a developer
study involving 20 participants where the explanations from Bugsplainer were
found to be more accurate, more precise, more concise and more useful than the
baselines.
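Since the related-papers entry below notes that Bugsplainer employs a fine-tuned version of CodeT5, the following is a minimal sketch of how a CodeT5-style explanation generator could be queried with Hugging Face Transformers. The checkpoint name, plain-text input format, and generation settings are illustrative assumptions, not the authors' released pipeline (which additionally encodes structural information from the code).

```python
# Minimal sketch: generating a bug explanation with a CodeT5-style model.
# Assumption: the base Salesforce/codet5-base checkpoint stands in for a
# fine-tuned bug-explanation model, and the buggy snippet is passed as plain
# text (structural features used by Bugsplainer are omitted here).
from transformers import RobertaTokenizer, T5ForConditionalGeneration

MODEL_NAME = "Salesforce/codet5-base"  # placeholder for fine-tuned weights
tokenizer = RobertaTokenizer.from_pretrained(MODEL_NAME)
model = T5ForConditionalGeneration.from_pretrained(MODEL_NAME)

buggy_code = """
def get_user(users, name):
    for u in users:
        if u.name == name:
            return u
"""  # silently returns None when no user matches

inputs = tokenizer(buggy_code, return_tensors="pt", truncation=True, max_length=512)
output_ids = model.generate(
    inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_length=64,
    num_beams=4,          # beam search, a common choice for short text generation
    early_stopping=True,
)
explanation = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(explanation)
```

With weights fine-tuned on bug-fix commits, the decoded output would be a short natural-language explanation of the defect; with the base checkpoint, the snippet only demonstrates the call pattern.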
Related papers
- Understanding Code Understandability Improvements in Code Reviews [79.16476505761582]
We analyzed 2,401 code review comments from Java open-source projects on GitHub.
83.9% of suggestions for improvement were accepted and integrated, with fewer than 1% later reverted.
arXiv Detail & Related papers (2024-10-29T12:21:23Z) - What is a "bug"? On subjectivity, epistemic power, and implications for
software research [8.116831482130555]
"Bug" has been a colloquialism for an engineering "defect" at least since the 1870s.
Most modern software-oriented definitions speak to a disconnect between what a developer intended and what a program actually does.
"Finding bugs is easy" begins by saying "bug patterns are code that are often errors"
arXiv Detail & Related papers (2024-02-13T01:52:42Z) - DebugBench: Evaluating Debugging Capability of Large Language Models [80.73121177868357]
DebugBench is a benchmark for evaluating the debugging capability of Large Language Models (LLMs).
It covers four major bug categories and 18 minor types in C++, Java, and Python.
We evaluate two commercial and four open-source models in a zero-shot scenario.
arXiv Detail & Related papers (2024-01-09T15:46:38Z) - Automated Bug Generation in the era of Large Language Models [6.0770779409377775]
BugFarm transforms arbitrary code into multiple complex bugs.
We conduct a comprehensive evaluation of 435k+ bugs from over 1.9M mutants generated by BugFarm.
arXiv Detail & Related papers (2023-10-03T20:01:51Z) - Bugsplainer: Leveraging Code Structures to Explain Software Bugs with
Neural Machine Translation [4.519754139322585]
Bugsplainer generates natural language explanations for software bugs by learning from a large corpus of bug-fix commits.
Bugsplainer leverages code structures to reason about a bug and employs a fine-tuned version of the text generation model CodeT5.
arXiv Detail & Related papers (2023-08-23T17:35:16Z) - Large Language Models of Code Fail at Completing Code with Potential
Bugs [30.80172644795715]
We study the buggy-code completion problem inspired by real-time code suggestion.
We find that the presence of potential bugs significantly degrades the generation performance of the high-performing Code-LLMs.
arXiv Detail & Related papers (2023-06-06T06:35:27Z) - Teaching Large Language Models to Self-Debug [62.424077000154945]
Large language models (LLMs) have achieved impressive performance on code generation.
We propose Self-Debugging, which teaches a large language model to debug its predicted program via few-shot demonstrations.
arXiv Detail & Related papers (2023-04-11T10:43:43Z) - Generation Probabilities Are Not Enough: Uncertainty Highlighting in AI Code Completions [54.55334589363247]
We study whether conveying information about uncertainty enables programmers to more quickly and accurately produce code.
We find that highlighting tokens with the highest predicted likelihood of being edited leads to faster task completion and more targeted edits.
arXiv Detail & Related papers (2023-02-14T18:43:34Z) - Using Developer Discussions to Guide Fixing Bugs in Software [51.00904399653609]
We propose using bug report discussions, which are available before the task is performed and are also naturally occurring, avoiding the need for additional information from developers.
We demonstrate that various forms of natural language context derived from such discussions can aid bug-fixing, even leading to improved performance over using commit messages corresponding to the oracle bug-fixing commits.
arXiv Detail & Related papers (2022-11-11T16:37:33Z) - BigIssue: A Realistic Bug Localization Benchmark [89.8240118116093]
BigIssue is a benchmark for realistic bug localization.
We provide a general benchmark with a diversity of real and synthetic Java bugs.
We hope to advance the state of the art in bug localization, in turn improving automated program repair (APR) performance and increasing its applicability to the modern development cycle.
arXiv Detail & Related papers (2022-07-21T20:17:53Z) - Advaita: Bug Duplicity Detection System [1.9624064951902522]
The duplicate bug rate (percentage of duplicate bugs) ranges from single digits (1 to 9%) to double digits (40%), depending on product maturity, code size, and the number of engineers working on the project.
Detecting duplicity means identifying whether two bug reports convey the same meaning.
The approach considers multiple sets of features, viz. basic text statistical features, semantic features, and contextual features; a minimal sketch of the text-similarity idea appears after this list.
arXiv Detail & Related papers (2020-01-24T04:48:39Z)
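As referenced in the Advaita entry above, here is a minimal sketch of the "basic text statistical features" idea for duplicate bug detection, assuming TF-IDF vectors and cosine similarity via scikit-learn. The example reports and the similarity threshold are hypothetical; the paper's actual feature pipeline is not specified here.

```python
# Minimal sketch: flagging likely duplicate bug reports with TF-IDF similarity.
# This illustrates only one plausible instantiation of "basic text statistical
# features"; semantic and contextual features are not modeled.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

bug_reports = [
    "App crashes when opening the settings page on Android 12",
    "Crash on launch of settings screen after upgrading to Android 12",
    "Login button stays disabled after entering valid credentials",
]

vectorizer = TfidfVectorizer(stop_words="english")
tfidf = vectorizer.fit_transform(bug_reports)   # shape: (n_reports, n_terms)
similarity = cosine_similarity(tfidf)           # pairwise cosine similarities

THRESHOLD = 0.5  # hypothetical cut-off for calling a pair a duplicate
for i in range(len(bug_reports)):
    for j in range(i + 1, len(bug_reports)):
        if similarity[i, j] >= THRESHOLD:
            print(f"Reports {i} and {j} look like duplicates "
                  f"(cosine similarity = {similarity[i, j]:.2f})")
```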
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.