Method-Level Bug Severity Prediction using Source Code Metrics and LLMs
- URL: http://arxiv.org/abs/2309.03044v1
- Date: Wed, 6 Sep 2023 14:38:07 GMT
- Title: Method-Level Bug Severity Prediction using Source Code Metrics and LLMs
- Authors: Ehsan Mashhadi, Hossein Ahmadvand, Hadi Hemmati
- Abstract summary: We investigate source code metrics, source code representation using large language models (LLMs), and their combination in predicting bug severity labels.
Our results suggest that Decision Tree and Random Forest models outperform other models regarding our several evaluation metrics.
CodeBERT finetuning improves the bug severity prediction results significantly in the range of 29%-140% for several evaluation metrics.
- Score: 0.628122931748758
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In the past couple of decades, significant research efforts are devoted to
the prediction of software bugs. However, most existing work in this domain
treats all bugs the same, which is not the case in practice. It is important
for a defect prediction method to estimate the severity of the identified bugs
so that the higher-severity ones get immediate attention. In this study, we
investigate source code metrics, source code representation using large
language models (LLMs), and their combination in predicting bug severity labels
of two prominent datasets. We leverage several source metrics at method-level
granularity to train eight different machine-learning models. Our results
suggest that Decision Tree and Random Forest models outperform other models
regarding our several evaluation metrics. We then use the pre-trained CodeBERT
LLM to study the source code representations' effectiveness in predicting bug
severity. CodeBERT finetuning improves the bug severity prediction results
significantly in the range of 29%-140% for several evaluation metrics, compared
to the best classic prediction model on source code metric. Finally, we
integrate source code metrics into CodeBERT as an additional input, using our
two proposed architectures, which both enhance the CodeBERT model
effectiveness.
Related papers
- Enhanced Bug Prediction in JavaScript Programs with Hybrid Call-Graph Based Invocation Metrics [0.7099737083842057]
Bug prediction aims at finding source code elements in a software system that are likely to contain defects.
In this paper, we propose a function level JavaScript bug prediction model based on static source code metrics with the addition of a hybrid (static and dynamic) code analysis based metric of the number of incoming and outgoing function calls.
arXiv Detail & Related papers (2024-05-12T10:31:43Z) - Pre-training Code Representation with Semantic Flow Graph for Effective
Bug Localization [4.159296619915587]
We propose a novel directed, multiple-label code graph representation named Semantic Flow Graph (SFG)
We show that our method achieves state-of-the-art performance in bug localization.
arXiv Detail & Related papers (2023-08-24T13:25:17Z) - A Multi-dimensional Evaluation of Tokenizer-free Multilingual Pretrained
Models [87.7086269902562]
We show that subword-based models might still be the most practical choice in many settings.
We encourage future work in tokenizer-free methods to consider these factors when designing and evaluating new models.
arXiv Detail & Related papers (2022-10-13T15:47:09Z) - Learning to predict test effectiveness [1.4213973379473652]
This article offers a machine learning model to predict the extent to which the test could cover a class in terms of a new metric called Coverageability.
We offer a mathematical model to evaluate test effectiveness in terms of size and coverage of the test suite generated automatically for each class.
arXiv Detail & Related papers (2022-08-20T07:26:59Z) - An Empirical Study on Bug Severity Estimation using Source Code Metrics and Static Analysis [0.8621608193534838]
We study 3,358 buggy methods with different severity labels from 19 Java open-source projects.
Results show that code metrics are useful in predicting buggy code, but they cannot estimate the severity level of the bugs.
Our categorization shows that Security bugs have high severity in most cases while Edge/Boundary faults have low severity.
arXiv Detail & Related papers (2022-06-26T17:07:23Z) - Understanding Factual Errors in Summarization: Errors, Summarizers,
Datasets, Error Detectors [105.12462629663757]
In this work, we aggregate factuality error annotations from nine existing datasets and stratify them according to the underlying summarization model.
We compare performance of state-of-the-art factuality metrics, including recent ChatGPT-based metrics, on this stratified benchmark and show that their performance varies significantly across different types of summarization models.
arXiv Detail & Related papers (2022-05-25T15:26:48Z) - DapStep: Deep Assignee Prediction for Stack Trace Error rePresentation [61.99379022383108]
We propose new deep learning models to solve the bug triage problem.
The models are based on a bidirectional recurrent neural network with attention and on a convolutional neural network.
To improve the quality of ranking, we propose using additional information from version control system annotations.
arXiv Detail & Related papers (2022-01-14T00:16:57Z) - Predicting the Number of Reported Bugs in a Software Repository [0.0]
We examine eight different time series forecasting models, including Long Short Term Memory Neural Networks (LSTM), auto-regressive integrated moving average (ARIMA), and Random Forest Regressor.
We analyze the quality of long-term prediction for each model based on different performance metrics.
The assessment is conducted on Mozilla, which is a large open-source software application.
arXiv Detail & Related papers (2021-04-24T19:06:35Z) - ALT-MAS: A Data-Efficient Framework for Active Testing of Machine
Learning Algorithms [58.684954492439424]
We propose a novel framework to efficiently test a machine learning model using only a small amount of labeled test data.
The idea is to estimate the metrics of interest for a model-under-test using Bayesian neural network (BNN)
arXiv Detail & Related papers (2021-04-11T12:14:04Z) - Cauchy-Schwarz Regularized Autoencoder [68.80569889599434]
Variational autoencoders (VAE) are a powerful and widely-used class of generative models.
We introduce a new constrained objective based on the Cauchy-Schwarz divergence, which can be computed analytically for GMMs.
Our objective improves upon variational auto-encoding models in density estimation, unsupervised clustering, semi-supervised learning, and face analysis.
arXiv Detail & Related papers (2021-01-06T17:36:26Z) - Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches, is to update the prototype of each class with the mean of the most confident query examples.
We propose to meta-learn the confidence for each query sample, to assign optimal weights to unlabeled queries.
We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
arXiv Detail & Related papers (2020-02-27T10:22:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.