Predicting post-release defects with knowledge units (KUs) of programming languages: an empirical study
- URL: http://arxiv.org/abs/2412.02907v2
- Date: Mon, 03 Mar 2025 18:26:17 GMT
- Title: Predicting post-release defects with knowledge units (KUs) of programming languages: an empirical study
- Authors: Md Ahasanuzzaman, Gustavo A. Oliva, Ahmed E. Hassan, Zhen Ming (Jack) Jiang
- Abstract summary: Defect prediction plays a crucial role in software engineering, enabling developers to identify defect-prone code and improve software quality. To address this gap, we introduce Knowledge Units (KUs) of programming languages as a novel feature set for analyzing software systems and defect prediction. A KU is a cohesive set of key capabilities that are offered by one or more building blocks of a given programming language.
- Score: 25.96111422428881
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Defect prediction plays a crucial role in software engineering, enabling developers to identify defect-prone code and improve software quality. While extensive research has focused on refining machine learning models for defect prediction, the exploration of new data sources for feature engineering remains limited. Defect prediction models primarily rely on traditional metrics such as product, process, and code ownership metrics, which, while effective, do not capture language-specific traits that may influence defect proneness. To address this gap, we introduce Knowledge Units (KUs) of programming languages as a novel feature set for analyzing software systems and defect prediction. A KU is a cohesive set of key capabilities that are offered by one or more building blocks of a given programming language. We conduct an empirical study leveraging 28 KUs that are derived from Java certification exams and compare their effectiveness against traditional metrics in predicting post-release defects across 8 well-maintained Java software systems. Our results show that KUs provide significant predictive power, achieving a median AUC of 0.82 and outperforming models built on individual groups of traditional metrics. Among KU features, Method & Encapsulation, Inheritance, and Exception Handling emerge as the most influential predictors. Furthermore, combining KUs with traditional metrics enhances prediction performance, yielding a median AUC of 0.89. We also introduce a cost-effective model using only 10 features, which maintains strong predictive performance while reducing feature engineering costs. Our findings demonstrate the value of KUs in predicting post-release defects, offering a complementary perspective to traditional metrics. This study can be helpful to researchers who wish to analyze software systems from a perspective that is complementary to that of traditional metrics.
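The AUC values reported above (0.82 for KU-only models, 0.89 for combined models) measure how well a model ranks defect-prone files above clean ones. As an illustrative sketch (the function and data below are not from the paper), AUC can be computed directly from predicted defect-proneness scores via the Mann-Whitney pairwise formulation:

```python
def auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney U statistic.

    labels: 1 for defect-prone files, 0 for clean files.
    scores: model-predicted defect proneness (higher = more likely defective).
    """
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("need both classes to compute AUC")
    # Count pairs where the defective file outscores the clean one;
    # ties contribute half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# A perfect ranking yields 1.0; a random one hovers around 0.5.
print(auc([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1]))
```

An AUC of 0.5 corresponds to chance-level ranking, which is why the reported medians of 0.82 and 0.89 indicate substantial predictive power.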
Related papers
- SDPERL: A Framework for Software Defect Prediction Using Ensemble Feature Extraction and Reinforcement Learning [0.0]
This paper proposes an innovative framework for software defect prediction.
It combines ensemble feature extraction with reinforcement learning (RL)--based feature selection.
We claim that this work is among the first in recent efforts to address this challenge at the file-level granularity.
arXiv Detail & Related papers (2024-12-10T21:16:05Z) - Feature Importance in the Context of Traditional and Just-In-Time Software Defect Prediction Models [5.1868909177638125]
This study developed defect prediction models incorporating the traditional and the Just-In-Time approaches from the publicly available dataset of the Apache Camel project.
A multi-layer deep learning algorithm was applied to these datasets in comparison with machine learning algorithms.
The deep learning algorithm achieved accuracies of 80% and 86%, with area under the receiver operating characteristic curve (AUC) scores of 66% and 78% for traditional and Just-In-Time defect prediction, respectively.
arXiv Detail & Related papers (2024-11-07T22:49:39Z) - Predicting long time contributors with knowledge units of programming languages: an empirical study [3.6840775431698893]
This paper reports an empirical study on the usage of knowledge units (KUs) of the Java programming language to predict LTCs.
A KU is a cohesive set of key capabilities that are offered by one or more building blocks of a given programming language.
We build a prediction model called KULTC, which leverages KU-based features along five different dimensions.
arXiv Detail & Related papers (2024-05-22T17:28:06Z) - Learning-Augmented Algorithms with Explicit Predictors [67.02156211760415]
Recent advances in algorithmic design show how to utilize predictions obtained by machine learning models from past and present data.
Prior research in this context was focused on a paradigm where the predictor is pre-trained on past data and then used as a black box.
In this work, we unpack the predictor and integrate the learning problem it gives rise to within the algorithmic challenge.
arXiv Detail & Related papers (2024-03-12T08:40:21Z) - Masked Thought: Simply Masking Partial Reasoning Steps Can Improve Mathematical Reasoning Learning of Language Models [102.72940700598055]
In reasoning tasks, even a minor error can cascade into inaccurate results.
We develop a method that avoids introducing external resources, relying instead on perturbations to the input.
Our training approach randomly masks certain tokens within the chain of thought, a technique we found to be particularly effective for reasoning tasks.
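The masking step described above can be sketched in a few lines. This is a simplified illustration, not the paper's implementation: it randomly replaces a fraction of chain-of-thought tokens with a mask token.

```python
import random


def mask_reasoning_tokens(tokens, mask_rate=0.15, mask_token="[MASK]", seed=None):
    """Randomly replace a fraction of chain-of-thought tokens with a mask token.

    Illustrative sketch: token list, mask rate, and mask symbol are assumptions.
    """
    rng = random.Random(seed)
    return [mask_token if rng.random() < mask_rate else t for t in tokens]


# Example: mask ~30% of the reasoning tokens before training.
cot = "first add 3 and 4 then multiply by 2".split()
print(mask_reasoning_tokens(cot, mask_rate=0.3, seed=0))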
arXiv Detail & Related papers (2024-03-04T16:21:54Z) - Method-Level Bug Severity Prediction using Source Code Metrics and LLMs [0.628122931748758]
We investigate source code metrics, source code representation using large language models (LLMs), and their combination in predicting bug severity labels.
Our results suggest that Decision Tree and Random Forest models outperform other models across several evaluation metrics.
CodeBERT finetuning improves the bug severity prediction results significantly in the range of 29%-140% for several evaluation metrics.
arXiv Detail & Related papers (2023-09-06T14:38:07Z) - Explainable Software Defect Prediction from Cross Company Project
Metrics Using Machine Learning [5.829545587965401]
This study focuses on developing defect prediction models that apply various machine learning algorithms.
One notable issue in existing defect prediction studies is the lack of transparency in the developed models.
arXiv Detail & Related papers (2023-06-14T17:46:08Z) - KGA: A General Machine Unlearning Framework Based on Knowledge Gap
Alignment [51.15802100354848]
We propose a general unlearning framework called KGA to induce forgetfulness.
Experiments on large-scale datasets show that KGA yields comprehensive improvements over baselines.
arXiv Detail & Related papers (2023-05-11T02:44:29Z) - PEOPL: Characterizing Privately Encoded Open Datasets with Public Labels [59.66777287810985]
We introduce information-theoretic scores for privacy and utility, which quantify the average performance of an unfaithful user.
We then theoretically characterize primitives in building families of encoding schemes that motivate the use of random deep neural networks.
arXiv Detail & Related papers (2023-03-31T18:03:53Z) - Cross Version Defect Prediction with Class Dependency Embeddings [17.110933073074584]
We use the Class Dependency Network (CDN) as another predictor for defects, combined with static code metrics.
Our approach uses network embedding techniques to leverage CDN information without having to build the metrics manually.
arXiv Detail & Related papers (2022-12-29T18:24:39Z) - Defect Prediction Using Stylistic Metrics [2.286041284499166]
This paper aims at analyzing the impact of stylistic metrics on both within-project and cross-project defect prediction.
Experiments are conducted on 14 releases of 5 popular, open-source projects.
arXiv Detail & Related papers (2022-06-22T10:11:05Z) - Great Truths are Always Simple: A Rather Simple Knowledge Encoder for
Enhancing the Commonsense Reasoning Capacity of Pre-Trained Models [89.98762327725112]
Commonsense reasoning in natural language is a desired ability of artificial intelligent systems.
For solving complex commonsense reasoning tasks, a typical solution is to enhance pre-trained language models (PTMs) with a knowledge-aware graph neural network (GNN) encoder.
Despite their effectiveness, these approaches are built on heavy architectures and cannot clearly explain how external knowledge resources improve the reasoning capacity of PTMs.
arXiv Detail & Related papers (2022-05-04T01:27:36Z) - Precise Learning of Source Code Contextual Semantics via Hierarchical
Dependence Structure and Graph Attention Networks [28.212889828892664]
We propose a novel source code model embedded with hierarchical dependencies.
We introduce the syntactic structure of the basic block, i.e., its corresponding AST, into the source code model to provide sufficient information.
The results show that our model reduces the scale of parameters by 50% and achieves 4% improvement on accuracy on program classification task.
arXiv Detail & Related papers (2021-11-20T04:03:42Z) - Graph-Based Machine Learning Improves Just-in-Time Defect Prediction [0.38073142980732994]
We use graph-based machine learning to improve Just-In-Time (JIT) defect prediction.
We show that our best model can predict whether or not a code change will lead to a defect with an F1 score as high as 77.55%.
This represents a 152% higher F1 score and a 3% higher MCC over the state-of-the-art JIT defect prediction.
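The F1 score and MCC reported above are both derived from the confusion matrix of the defect predictor. As a minimal illustration (the counts below are made up, not the paper's results), both metrics can be computed as follows:

```python
import math


def f1_and_mcc(tp, fp, fn, tn):
    """F1 score and Matthews correlation coefficient from confusion-matrix counts.

    tp/fp/fn/tn: true positives, false positives, false negatives, true negatives.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    # MCC ranges over [-1, 1]; unlike F1 it also rewards true negatives.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom
    return f1, mcc


# Hypothetical confusion matrix for a JIT defect predictor.
print(f1_and_mcc(tp=3, fp=1, fn=1, tn=5))
```

MCC is often preferred alongside F1 for defect prediction because defect data is heavily imbalanced and MCC accounts for all four cells of the confusion matrix.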
arXiv Detail & Related papers (2021-10-11T16:00:02Z) - Hessian-based toolbox for reliable and interpretable machine learning in
physics [58.720142291102135]
We present a toolbox for interpretability and reliability that is agnostic of the model architecture.
It provides a notion of the influence of the input data on the prediction at a given test point, an estimation of the uncertainty of the model predictions, and an agnostic score for the model predictions.
Our work opens the road to the systematic use of interpretability and reliability methods in ML applied to physics and, more generally, science.
arXiv Detail & Related papers (2021-08-04T16:32:59Z) - Task-Specific Normalization for Continual Learning of Blind Image
Quality Models [105.03239956378465]
We present a simple yet effective continual learning method for blind image quality assessment (BIQA).
The key step in our approach is to freeze all convolution filters of a pre-trained deep neural network (DNN) for an explicit promise of stability.
We assign each new IQA dataset (i.e., task) a prediction head, and load the corresponding normalization parameters to produce a quality score.
The final quality estimate is computed by a weighted summation of predictions from all heads with a lightweight $K$-means gating mechanism.
arXiv Detail & Related papers (2021-07-28T15:21:01Z) - Federated Learning with Unreliable Clients: Performance Analysis and
Mechanism Design [76.29738151117583]
Federated Learning (FL) has become a promising tool for training effective machine learning models among distributed clients.
However, low quality models could be uploaded to the aggregator server by unreliable clients, leading to a degradation or even a collapse of training.
We model these unreliable behaviors of clients and propose a defensive mechanism to mitigate such a security risk.
arXiv Detail & Related papers (2021-05-10T08:02:27Z) - Machine Learning Techniques for Software Quality Assurance: A Survey [5.33024001730262]
We discuss various approaches in both fault prediction and test case prioritization.
Recent studies show that deep learning algorithms for fault prediction help to bridge the gap between programs' semantics and fault prediction features.
arXiv Detail & Related papers (2021-04-29T00:37:27Z) - ALT-MAS: A Data-Efficient Framework for Active Testing of Machine
Learning Algorithms [58.684954492439424]
We propose a novel framework to efficiently test a machine learning model using only a small amount of labeled test data.
The idea is to estimate the metrics of interest for a model-under-test using a Bayesian neural network (BNN).
arXiv Detail & Related papers (2021-04-11T12:14:04Z) - Re-Assessing the "Classify and Count" Quantification Method [88.60021378715636]
"Classify and Count" (CC) is often a biased estimator.
Previous works have failed to use properly optimised versions of CC.
We argue that, while still inferior to some cutting-edge methods, they deliver near-state-of-the-art accuracy.
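The "Classify and Count" method discussed above, and its standard bias correction, are simple enough to sketch directly. This is an illustrative implementation of the generic CC and Adjusted CC (ACC) estimators, not code from the paper:

```python
def classify_and_count(predictions):
    """CC: estimate class prevalence as the fraction of items predicted positive.

    predictions: hard classifier outputs, 1 for positive and 0 for negative.
    """
    return sum(predictions) / len(predictions)


def adjusted_classify_and_count(predictions, tpr, fpr):
    """ACC: correct the CC estimate for classifier bias.

    tpr/fpr are the classifier's true- and false-positive rates, typically
    estimated on held-out validation data. The result is clipped to [0, 1].
    """
    cc = classify_and_count(predictions)
    if tpr == fpr:
        return cc  # degenerate classifier: no correction is possible
    return min(1.0, max(0.0, (cc - fpr) / (tpr - fpr)))


# A biased classifier over-counts positives; ACC compensates.
print(adjusted_classify_and_count([1, 0, 1, 0], tpr=0.9, fpr=0.1))
```

The bias the abstract refers to is visible here: raw CC inherits whatever systematic error the classifier makes, while ACC removes it to the extent that the tpr/fpr estimates are accurate.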
arXiv Detail & Related papers (2020-11-04T21:47:39Z) - Accurate and Robust Feature Importance Estimation under Distribution
Shifts [49.58991359544005]
PRoFILE is a novel feature importance estimation method.
We show significant improvements over state-of-the-art approaches, both in terms of fidelity and robustness.
arXiv Detail & Related papers (2020-09-30T05:29:01Z) - Provably Robust Metric Learning [98.50580215125142]
We show that existing metric learning algorithms can result in metrics that are less robust than the Euclidean distance.
We propose a novel metric learning algorithm to find a Mahalanobis distance that is robust against adversarial perturbations.
Experimental results show that the proposed metric learning algorithm improves both certified robust errors and empirical robust errors.
arXiv Detail & Related papers (2020-06-12T09:17:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.