WIA-SZZ: Work Item Aware SZZ
- URL: http://arxiv.org/abs/2411.12740v1
- Date: Tue, 19 Nov 2024 18:59:14 GMT
- Title: WIA-SZZ: Work Item Aware SZZ
- Authors: Salomé Perez-Rosero, Robert Dyer, Samuel W. Flint, Shane McIntosh, Witawas Srisa-an
- Abstract summary: Existing SZZ algorithms identify the potential commit that induced a bug when given a fixing commit as input.
We build a new variant of SZZ that leverages our work item detecting heuristic to first suggest bug-inducing commits.
Our evaluation reveals the heuristic is 64% accurate in finding work items, but most importantly it is able to find many bug-inducing commits.
- Score: 3.7232697932311645
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Many software engineering maintenance tasks require linking a commit that induced a bug with the commit that later fixed that bug. Several existing SZZ algorithms provide a way to identify the potential commit that induced a bug when given a fixing commit as input. Prior work introduced the notion of a "work item", a logical grouping of commits that could be a single unit of work. Our key insight in this work is to recognize that a bug-inducing commit and the fix(es) for that bug together represent a "work item." It is not currently understood how these work items, which are logical groups of revisions addressing a single issue or feature, could impact the performance of algorithms such as SZZ. In this paper, we propose a heuristic that, given an input commit, uses information about changed methods to identify related commits that form a work item with the input commit. We hypothesize that given such a work item identifying heuristic, we can identify bug-inducing commits more accurately than existing SZZ approaches. We then build a new variant of SZZ that we call Work Item Aware SZZ (WIA-SZZ), that leverages our work item detecting heuristic to first suggest bug-inducing commits. If our heuristic fails to find any candidates, we then fall back to baseline variants of SZZ. We conduct a manual evaluation to assess the accuracy of our heuristic to identify work items. Our evaluation reveals the heuristic is 64% accurate in finding work items, but most importantly it is able to find many bug-inducing commits. We then evaluate our approach on 821 repositories that have been previously used to study the performance of SZZ, comparing our work against six SZZ variants. That evaluation shows an improvement in F1 scores ranging from 2% to 9%, or from 3% to 14% when looking only at the subset of cases in which a work item was found.
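The two-stage flow described in the abstract (try the work item heuristic first, fall back to a baseline SZZ when it finds nothing) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how changed methods are compared, so the sketch assumes a candidate is any earlier commit that changed at least one method also changed by the fixing commit. The names `find_work_item_candidates`, `wia_szz`, and `baseline_szz`, and the in-memory `History` representation, are hypothetical.

```python
from typing import Callable, List, Set, Tuple

# Assumed representation: commits ordered oldest first, each paired with the
# set of fully qualified method names it changed.
History = List[Tuple[str, Set[str]]]


def find_work_item_candidates(fix_id: str, history: History) -> List[str]:
    """Return earlier commits sharing at least one changed method with the fix.

    The overlap rule is an assumption made for illustration only.
    """
    ids = [commit_id for commit_id, _ in history]
    fix_index = ids.index(fix_id)
    fix_methods = history[fix_index][1]
    return [
        commit_id
        for commit_id, methods in history[:fix_index]
        if methods & fix_methods
    ]


def wia_szz(
    fix_id: str,
    history: History,
    baseline_szz: Callable[[str], List[str]],
) -> List[str]:
    """Suggest bug-inducing commits: heuristic first, baseline SZZ as fallback."""
    candidates = find_work_item_candidates(fix_id, history)
    if candidates:
        return candidates
    # No work item found: defer to whichever baseline SZZ variant was supplied.
    return baseline_szz(fix_id)


if __name__ == "__main__":
    history: History = [
        ("a", {"Foo.bar"}),
        ("b", {"Baz.qux"}),
        ("c", {"Foo.bar", "Foo.init"}),
    ]
    # Stand-in for a real SZZ variant; returns a fixed marker value.
    print(wia_szz("c", history, lambda fix: ["from-baseline"]))
    print(wia_szz("b", history, lambda fix: ["from-baseline"]))
```

Passing the baseline as a callable keeps the sketch independent of any particular SZZ variant, which mirrors the abstract's comparison against six of them.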
Related papers
- LLM4SZZ: Enhancing SZZ Algorithm with Context-Enhanced Assessment on Large Language Models [10.525352489242398]
The SZZ algorithm is the dominant technique for identifying bug-inducing commits.
It serves as a foundation for many software engineering studies, such as bug prediction and static code analysis.
Recently, a deep learning-based SZZ algorithm has been introduced to enhance the original SZZ algorithm.
arXiv Detail & Related papers (2025-04-02T06:40:57Z) - Detecting Functional Bugs in Smart Contracts through LLM-Powered and Bug-Oriented Composite Analysis [34.8337182669106]
We design and implement PROMFUZZ, an automated and scalable system to detect functional bugs in smart contracts.
We first propose a novel Large Language Model (LLM)-driven analysis framework, which leverages a dual-agent prompt engineering strategy.
Finally, we design a bug-oriented fuzzing engine, which maps the logical information from the high-level business model to the low-level smart contract implementations.
arXiv Detail & Related papers (2025-03-31T04:39:51Z) - Localizing Task Information for Improved Model Merging and Compression [61.16012721460561]
We show that the information required to solve each task is still preserved after merging as different tasks mostly use non-overlapping sets of weights.
We propose Consensus Merging, an algorithm that eliminates such weights and improves the general performance of existing model merging approaches.
arXiv Detail & Related papers (2024-05-13T14:54:37Z) - Evaluating SZZ Implementations: An Empirical Study on the Linux Kernel [8.698309437598944]
The evaluation of how ghost commits impact the SZZ algorithm remains limited.
Linux kernel developers have started labelling bug-fixing patches with the commit identifiers of the corresponding bug-inducing commit(s) as a standard practice.
In this paper, we apply six SZZ algorithms to 76,046 pairs of bug-fixing patches and bug-inducing commits from the Linux kernel.
arXiv Detail & Related papers (2023-08-09T16:41:27Z) - Fuzzing with Quantitative and Adaptive Hot-Bytes Identification [6.442499249981947]
American fuzzy lop, a leading fuzzing tool, has demonstrated its powerful bug finding ability through a vast number of reported CVEs.
We propose an approach called tool, which is designed based on the following principles.
Our evaluation results on 10 real-world programs and LAVA-M dataset show that tool achieves sustained increases in branch coverage and discovers more bugs than other fuzzers.
arXiv Detail & Related papers (2023-07-05T13:41:35Z) - Using Developer Discussions to Guide Fixing Bugs in Software [51.00904399653609]
We propose using bug report discussions, which are available before the task is performed and are also naturally occurring, avoiding the need for additional information from developers.
We demonstrate that various forms of natural language context derived from such discussions can aid bug-fixing, even leading to improved performance over using commit messages corresponding to the oracle bug-fixing commits.
arXiv Detail & Related papers (2022-11-11T16:37:33Z) - ADPTriage: Approximate Dynamic Programming for Bug Triage [0.0]
We develop a Markov decision process (MDP) model for an online bug triage task.
We provide an ADP-based bug triage solution, called ADPTriage, which reflects downstream uncertainty in the bug arrivals and developers' timetables.
Our result shows a significant improvement over the myopic approach in terms of assignment accuracy and fixing time.
arXiv Detail & Related papers (2022-11-02T04:42:21Z) - Efficient Person Search: An Anchor-Free Approach [86.45858994806471]
Person search aims to simultaneously localize and identify a query person from realistic, uncropped images.
To achieve this goal, state-of-the-art models typically add a re-id branch upon two-stage detectors like Faster R-CNN.
In this work, we present an anchor-free approach to efficiently tackling this challenging task, by introducing the following dedicated designs.
arXiv Detail & Related papers (2021-09-01T07:01:33Z) - Learning Stable Classifiers by Transferring Unstable Features [59.06169363181417]
We study transfer learning in the presence of spurious correlations.
We experimentally demonstrate that directly transferring the stable feature extractor learned on the source task may not eliminate these biases for the target task.
We hypothesize that the unstable features in the source task and those in the target task are directly related.
arXiv Detail & Related papers (2021-06-15T02:41:12Z) - Generating Bug-Fixes Using Pretrained Transformers [11.012132897417592]
We introduce a data-driven program repair approach which learns to detect and fix bugs in Java methods mined from real-world GitHub.
We show that pretraining on source code programs improves the number of patches found by 33% as compared to supervised training from scratch.
We refine the standard accuracy evaluation metric into non-deletion and deletion-only fixes, and show that our best model generates 75% more non-deletion fixes than the previous state of the art.
arXiv Detail & Related papers (2021-04-16T05:27:04Z) - Automated Mapping of Vulnerability Advisories onto their Fix Commits in Open Source Repositories [7.629717457706326]
We present an approach that combines practical experience and machine-learning (ML)
An advisory record containing key information about a vulnerability is extracted from an advisory.
A subset of candidate fix commits is obtained from the source code repository of the affected project.
arXiv Detail & Related papers (2021-03-24T17:50:35Z) - Anchor-Free Person Search [127.88668724345195]
Person search aims to simultaneously localize and identify a query person from realistic, uncropped images.
Most existing works employ two-stage detectors like Faster-RCNN, yielding encouraging accuracy but with high computational overhead.
We present the Feature-Aligned Person Search Network (AlignPS), the first anchor-free framework to efficiently tackle this challenging task.
arXiv Detail & Related papers (2021-03-22T07:04:29Z) - IReEn: Reverse-Engineering of Black-Box Functions via Iterative Neural Program Synthesis [70.61283188380689]
We investigate the problem of revealing the functionality of a black-box agent.
We do not rely on privileged information on the black box, but rather investigate the problem under a weaker assumption of having only access to inputs and outputs of the program.
Our results show that the proposed approach outperforms the state-of-the-art on this challenge by finding an approximately functional equivalent program in 78% of cases.
arXiv Detail & Related papers (2020-06-18T17:50:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.