Predicting Line-Level Defects by Capturing Code Contexts with
Hierarchical Transformers
- URL: http://arxiv.org/abs/2312.11889v1
- Date: Tue, 19 Dec 2023 06:25:04 GMT
- Title: Predicting Line-Level Defects by Capturing Code Contexts with
Hierarchical Transformers
- Authors: Parvez Mahbub and Mohammad Masudur Rahman
- Abstract summary: Bugsplorer is a novel deep-learning technique for line-level defect prediction.
It can rank the first 20% defective lines within the top 1-3% suspicious lines.
It has the potential to significantly reduce SQA costs by ranking defective lines higher.
- Score: 4.73194777046253
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Software defects consume 40% of the total budget in software development and
cost the global economy billions of dollars every year. Unfortunately, despite
the use of many software quality assurance (SQA) practices in software
development (e.g., code review, continuous integration), defects may still
exist in the official release of a software product. Therefore, prioritizing
SQA efforts for the vulnerable areas of the codebase is essential to ensure the
high quality of a software release. Predicting software defects at the line
level could help prioritize the SQA effort but is a highly challenging task
given that only ~3% of lines of a codebase could be defective. Existing works
on line-level defect prediction often fall short and cannot fully leverage the
line-level defect information. In this paper, we propose Bugsplorer, a novel
deep-learning technique for line-level defect prediction. It leverages a
hierarchical structure of transformer models to represent two types of code
elements: code tokens and code lines. Unlike the existing techniques that are
optimized for file-level defect prediction, Bugsplorer is optimized for a
line-level defect prediction objective. Our evaluation with five performance
metrics shows that Bugsplorer has a promising capability of predicting
defective lines with 26-72% better accuracy than that of the state-of-the-art
technique. It can rank the first 20% defective lines within the top 1-3%
suspicious lines. Thus, Bugsplorer has the potential to significantly reduce
SQA costs by ranking defective lines higher.
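The two-stage hierarchy described in the abstract (a token-level encoder producing one vector per line, followed by a line-level encoder that contextualizes lines against each other) can be sketched as below. This is a minimal single-head NumPy illustration with random, untrained weights; the dimensions, mean-pooling, and sigmoid head are assumptions for illustration, not details taken from the paper:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, w_q, w_k, w_v):
    # Single-head scaled dot-product self-attention over a sequence x: (seq, d).
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    return softmax(scores) @ v

rng = np.random.default_rng(0)
d = 16                    # embedding width (hypothetical)
n_lines, n_tokens = 5, 8  # a toy file: 5 lines of 8 tokens each

# Hypothetical token embeddings for each line: (n_lines, n_tokens, d).
tokens = rng.normal(size=(n_lines, n_tokens, d))

# Stage 1: token-level encoder, then mean-pool tokens into one vector per line.
w_tok = [rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3)]
line_vecs = np.stack([self_attention(line, *w_tok).mean(axis=0) for line in tokens])

# Stage 2: line-level encoder contextualizes each line against the others.
w_line = [rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3)]
encoded = self_attention(line_vecs, *w_line)

# Per-line defect probability from a linear head with sigmoid (untrained).
defect_scores = 1 / (1 + np.exp(-(encoded @ rng.normal(size=d))))
ranking = np.argsort(-defect_scores)  # most suspicious lines first
```

Training end-to-end against per-line labels is what distinguishes this setup from file-level models that only see a file-level objective.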
Related papers
- Code Revert Prediction with Graph Neural Networks: A Case Study at J.P. Morgan Chase [10.961209762486684]
Code revert prediction aims to forecast or predict the likelihood of code changes being reverted or rolled back in software development.
Previous methods for code defect detection relied on independent features but ignored relationships between code scripts.
This paper presents a systematic empirical study for code revert prediction that integrates the code import graph with code features.
arXiv Detail & Related papers (2024-03-14T15:54:29Z)
- BAFLineDP: Code Bilinear Attention Fusion Framework for Line-Level Defect Prediction [0.0]

This paper presents a line-level defect prediction method grounded in a code bilinear attention fusion framework (BAFLineDP).
Our results demonstrate that BAFLineDP outperforms current advanced file-level and line-level defect prediction approaches.
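Line-level defect predictors such as Bugsplorer and BAFLineDP are typically compared by ranking lines by predicted suspiciousness and checking how many truly defective lines fall within the top k% of the ranking. A minimal sketch of such a recall@top-k% metric, using hypothetical scores and labels:

```python
def recall_at_top_k(scores, labels, k_percent):
    # Fraction of all defective lines found within the top k% of lines
    # when ranked by predicted suspiciousness (highest score first).
    ranked = sorted(range(len(scores)), key=lambda i: -scores[i])
    cutoff = max(1, round(len(scores) * k_percent / 100))
    hits = sum(labels[i] for i in ranked[:cutoff])
    total = sum(labels)
    return hits / total if total else 0.0

# Toy example: 10 lines, lines at indices 2 and 7 are defective.
scores = [0.1, 0.2, 0.9, 0.3, 0.1, 0.05, 0.2, 0.8, 0.4, 0.1]
labels = [0, 0, 1, 0, 0, 0, 0, 1, 0, 0]
print(recall_at_top_k(scores, labels, 20))  # top 2 lines catch both defects -> 1.0
```

Under this kind of metric, Bugsplorer's claim of ranking the first 20% of defective lines within the top 1-3% of suspicious lines corresponds to a high recall at a very small k.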
arXiv Detail & Related papers (2024-02-11T09:01:42Z)
- RAP-Gen: Retrieval-Augmented Patch Generation with CodeT5 for Automatic Program Repair [75.40584530380589]
We propose RAP-Gen, a novel Retrieval-Augmented Patch Generation framework.
RAP-Gen explicitly leverages relevant fix patterns retrieved from a list of previous bug-fix pairs.
We evaluate RAP-Gen on three benchmarks in two programming languages, including the TFix benchmark in JavaScript, and Code Refinement and Defects4J benchmarks in Java.
arXiv Detail & Related papers (2023-09-12T08:52:56Z)
- PrAIoritize: Automated Early Prediction and Prioritization of Vulnerabilities in Smart Contracts [1.081463830315253]
Smart contracts are prone to numerous security threats due to undisclosed vulnerabilities and code weaknesses.
Efficient prioritization is crucial for smart contract security.
Our research aims to provide an automated approach, PrAIoritize, for prioritizing and predicting critical code weaknesses.
arXiv Detail & Related papers (2023-08-21T23:30:39Z)
- Bridging Precision and Confidence: A Train-Time Loss for Calibrating Object Detection [58.789823426981044]
We propose a novel auxiliary loss formulation that aims to align the class confidence of bounding boxes with the accuracy of predictions.
Our results reveal that our train-time loss surpasses strong calibration baselines in reducing calibration error for both in and out-domain scenarios.
arXiv Detail & Related papers (2023-03-25T08:56:21Z)
- Generation Probabilities Are Not Enough: Uncertainty Highlighting in AI Code Completions [54.55334589363247]
We study whether conveying information about uncertainty enables programmers to more quickly and accurately produce code.
We find that highlighting tokens with the highest predicted likelihood of being edited leads to faster task completion and more targeted edits.
arXiv Detail & Related papers (2023-02-14T18:43:34Z)
- Fault-Aware Neural Code Rankers [64.41888054066861]
We propose fault-aware neural code rankers that can predict the correctness of a sampled program without executing it.
Our fault-aware rankers can significantly increase the pass@1 accuracy of various code generation models.
arXiv Detail & Related papers (2022-06-04T22:01:05Z)
- A Universal Error Measure for Input Predictions Applied to Online Graph Problems [57.58926849872494]
We introduce a novel measure for quantifying the error in input predictions.
The measure captures errors due to absent predicted requests as well as unpredicted actual requests.
arXiv Detail & Related papers (2022-05-25T15:24:03Z)
- HEDP: A Method for Early Forecasting Software Defects based on Human Error Mechanisms [1.0660480034605238]
The main process behind a software defect is that an error-prone scenario triggers human error modes.
The proposed idea emphasizes predicting the exact location and form of a possible defect.
arXiv Detail & Related papers (2021-10-13T14:44:23Z)
- Measuring Coding Challenge Competence With APPS [54.22600767666257]
We introduce APPS, a benchmark for code generation.
Our benchmark includes 10,000 problems, which range from having simple one-line solutions to being substantial algorithmic challenges.
Recent models such as GPT-Neo can pass approximately 15% of the test cases of introductory problems.
arXiv Detail & Related papers (2021-05-20T17:58:42Z)
- Machine Learning Techniques for Software Quality Assurance: A Survey [5.33024001730262]
We discuss various approaches in both fault prediction and test case prioritization.
Recent studies show that deep learning algorithms for fault prediction help bridge the gap between programs' semantics and fault prediction features.
arXiv Detail & Related papers (2021-04-29T00:37:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.