Related papers: The Repeat Offenders: Characterizing and Predicting Extremely Bug-Prone Source Methods

The Repeat Offenders: Characterizing and Predicting Extremely Bug-Prone Source Methods

URL: http://arxiv.org/abs/2511.22726v1
Date: Thu, 27 Nov 2025 19:13:24 GMT
Title: The Repeat Offenders: Characterizing and Predicting Extremely Bug-Prone Source Methods
Authors: Ethan Friesen, Sasha Morton-Salmon, Md Nahidul Islam Opu, Shahidul Islam, Shaiful Chowdhury,
Abstract summary: ExtremelyBuggy methods constitute only a tiny fraction of all methods, yet frequently account for a disproportionately large share of bugs.<n>These methods are significantly larger, more complex, less readable, and less maintainable than both singly-buggy and non-buggy methods.
Score: 0.8481798330936976
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Identifying the small subset of source code that repeatedly attracts bugs is critical for reducing long-term maintenance effort. We define ExtremelyBuggy methods as those involved in more than one bug fix and present the first large-scale study of their prevalence, characteristics, and predictability. Using a dataset of over 1.25 million methods from 98 open-source Java projects, we find that ExtremelyBuggy methods constitute only a tiny fraction of all methods, yet frequently account for a disproportionately large share of bugs. At their inception, these methods are significantly larger, more complex, less readable, and less maintainable than both singly-buggy and non-buggy methods. However, despite these measurable differences, a comprehensive evaluation of five machine learning models shows that early prediction of ExtremelyBuggy methods remains highly unreliable due to data imbalance, project heterogeneity, and the fact that many bugs emerge through subsequent evolution rather than initial implementation. To complement these quantitative findings, we conduct a thematic analysis of 265 ExtremelyBuggy methods, revealing recurring visual issues (e.g., confusing control flow, poor readability), contextual roles (e.g., core logic, data transformation, external resource handling), and common defect patterns (e.g., faulty conditionals, fragile error handling, misuse of variables). These results highlight the need for richer, evolution-aware representations of code and provide actionable insights for practitioners seeking to prioritize high-risk methods early in the development lifecycle.

Related papers

BugPilot: Complex Bug Generation for Efficient Learning of SWE Skills [59.003563837981886]
High quality bugs are key to training the next generation of language model based software engineering (SWE) agents.<n>We introduce a novel method for synthetic generation of difficult and diverse bugs.
arXiv Detail & Related papers (2025-10-22T17:58:56Z)
Graspness Discovery in Clutters for Fast and Accurate Grasp Detection [57.81325062171676]
"graspness" is a quality based on geometry cues that distinguishes graspable areas in cluttered scenes. We develop a neural network named cascaded graspness model to approximate the searching process. Experiments on a large-scale benchmark, GraspNet-1Billion, show that our method outperforms previous arts by a large margin.
arXiv Detail & Related papers (2024-06-17T02:06:47Z)
A Comprehensive Library for Benchmarking Multi-class Visual Anomaly Detection [89.92916473403108]
This paper proposes a comprehensive visual anomaly detection benchmark, ADer, which is a modular framework for new methods.<n>The benchmark includes multiple datasets from industrial and medical domains, implementing fifteen state-of-the-art methods and nine comprehensive metrics.<n>We objectively reveal the strengths and weaknesses of different methods and provide insights into the challenges and future directions of multi-class visual anomaly detection.
arXiv Detail & Related papers (2024-06-05T13:40:07Z)
Masked Thought: Simply Masking Partial Reasoning Steps Can Improve Mathematical Reasoning Learning of Language Models [102.72940700598055]
In reasoning tasks, even a minor error can cascade into inaccurate results. We develop a method that avoids introducing external resources, relying instead on perturbations to the input. Our training approach randomly masks certain tokens within the chain of thought, a technique we found to be particularly effective for reasoning tasks.
arXiv Detail & Related papers (2024-03-04T16:21:54Z)
A Comparative Study of Transformer-based Neural Text Representation Techniques on Bug Triaging [8.831760500324318]
We offer one of the first investigations that fine-tunes transformer-based language models for the task of bug triaging. DeBERTa is the most effective technique across the triaging tasks of developer and component assignment.
arXiv Detail & Related papers (2023-10-10T18:09:32Z)
ADPTriage: Approximate Dynamic Programming for Bug Triage [0.0]
We develop a Markov decision process (MDP) model for an online bug triage task. We provide an ADP-based bug triage solution, called ADPTriage, which reflects downstream uncertainty in the bug arrivals and developers' timetables. Our result shows a significant improvement over the myopic approach in terms of assignment accuracy and fixing time.
arXiv Detail & Related papers (2022-11-02T04:42:21Z)
Infrared: A Meta Bug Detector [10.541969253100815]
We propose a new approach, called meta bug detection, which offers three crucial advantages over existing learning-based bug detectors. Our evaluation shows our meta bug detector (MBD) is effective in catching a variety of bugs including null pointer dereference, array index out-of-bound, file handle leak, and even data races in concurrent programs.
arXiv Detail & Related papers (2022-09-18T09:08:51Z)
An Empirical Study on Bug Severity Estimation using Source Code Metrics and Static Analysis [0.8621608193534838]
We study 3,358 buggy methods with different severity labels from 19 Java open-source projects. Results show that code metrics are useful in predicting buggy code, but they cannot estimate the severity level of the bugs. Our categorization shows that Security bugs have high severity in most cases while Edge/Boundary faults have low severity.
arXiv Detail & Related papers (2022-06-26T17:07:23Z)
Annotation Error Detection: Analyzing the Past and Present for a More Coherent Future [63.99570204416711]
We reimplement 18 methods for detecting potential annotation errors and evaluate them on 9 English datasets. We define a uniform evaluation setup including a new formalization of the annotation error detection task. We release our datasets and implementations in an easy-to-use and open source software package.
arXiv Detail & Related papers (2022-06-05T22:31:45Z)
DapStep: Deep Assignee Prediction for Stack Trace Error rePresentation [61.99379022383108]
We propose new deep learning models to solve the bug triage problem. The models are based on a bidirectional recurrent neural network with attention and on a convolutional neural network. To improve the quality of ranking, we propose using additional information from version control system annotations.
arXiv Detail & Related papers (2022-01-14T00:16:57Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.