Related papers: Software Defect Prediction Based On Deep Learning Models: Performance Study

URL: http://arxiv.org/abs/2004.02589v1
Date: Thu, 2 Apr 2020 06:02:14 GMT
Title: Software Defect Prediction Based On Deep Learning Models: Performance Study
Authors: Ahmad Hasanpour, Pourya Farzi, Ali Tehrani, Reza Akbari
Abstract summary: Two deep learning models, Stack Sparse Auto-Encoder (SSAE) and Deep Belief Network (DBN), are deployed to classify NASA datasets. According to the conducted experiment, accuracy improves for the datasets with sufficient samples.
Score: 0.5735035463793008
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In recent years, defect prediction, one of the major software engineering problems, has been a focus of researchers because it plays a pivotal role in estimating software errors and faulty modules. Researchers aiming to improve prediction accuracy have developed many models for software defect prediction. However, a number of critical conditions and theoretical problems must be addressed to achieve better results. In this paper, two deep learning models, Stack Sparse Auto-Encoder (SSAE) and Deep Belief Network (DBN), are deployed to classify NASA datasets, which are unbalanced and have insufficient samples. According to the conducted experiment, accuracy improves for the datasets with sufficient samples; in addition, the SSAE model achieves better results than the DBN model on the majority of evaluation metrics.
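Because the abstract stresses that the datasets are unbalanced and that the models are compared on several evaluation metrics, a small illustration may help show why accuracy alone can mislead in that setting. The sketch below is not taken from the paper: the label counts (10 defective modules out of 100) and the trivial "always predict clean" classifier are hypothetical, chosen only to show the effect.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = defective)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def metrics(y_true, y_pred):
    """Return accuracy, precision, recall, and F1 as a dict."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical unbalanced dataset: 10 defective modules, 90 clean ones.
y_true = [1] * 10 + [0] * 90
# A trivial classifier that labels every module as clean.
always_clean = [0] * 100

m = metrics(y_true, always_clean)
print(m)
```

Under these assumed counts the trivial classifier scores high on accuracy while its recall and F1 on the defective class are zero, which is one reason defect prediction studies usually report more than one metric.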

Related papers

Feature Importance in the Context of Traditional and Just-In-Time Software Defect Prediction Models [5.1868909177638125]
This study developed defect prediction models incorporating the traditional and the Just-In-Time approaches from the publicly available dataset of the Apache Camel project. A multi-layer deep learning algorithm was applied to these datasets and compared with machine learning algorithms. The deep learning algorithm achieved accuracies of 80% and 86%, with area under the receiver operating characteristic curve (AUC) scores of 66% and 78% for traditional and Just-In-Time defect prediction, respectively.
arXiv Detail & Related papers (2024-11-07T22:49:39Z)
An Investigation on Machine Learning Predictive Accuracy Improvement and Uncertainty Reduction using VAE-based Data Augmentation [2.517043342442487]
Deep generative learning uses certain ML models to learn the underlying distribution of existing data and generate synthetic samples that resemble the real data. In this study, our objective is to evaluate the effectiveness of data augmentation using variational autoencoder (VAE)-based deep generative models. We investigated whether the data augmentation leads to improved accuracy in the predictions of a deep neural network (DNN) model trained using the augmented data.
arXiv Detail & Related papers (2024-10-24T18:15:48Z)
Three-Stage Adjusted Regression Forecasting (TSARF) for Software Defect Prediction [5.826476252191368]
Nonhomogeneous Poisson process (NHPP) software reliability growth models (SRGMs) are the most commonly employed models. Increased model complexity presents a challenge in identifying robust and computationally efficient algorithms.
arXiv Detail & Related papers (2024-01-31T02:19:35Z)
Towards Causal Deep Learning for Vulnerability Detection [31.59558109518435]
We introduce do-calculus-based causal learning to software engineering models. Our results show that CausalVul consistently improved model accuracy, robustness, and out-of-distribution (OOD) performance.
arXiv Detail & Related papers (2023-10-12T00:51:06Z)
Explainable Software Defect Prediction from Cross Company Project Metrics Using Machine Learning [5.829545587965401]
This study focuses on developing defect prediction models that apply various machine learning algorithms. One notable issue in existing defect prediction studies is the lack of transparency in the developed models.
arXiv Detail & Related papers (2023-06-14T17:46:08Z)
Learning Sample Difficulty from Pre-trained Models for Reliable Prediction [55.77136037458667]
We propose to utilize large-scale pre-trained models to guide downstream model training with sample difficulty-aware entropy regularization. We simultaneously improve accuracy and uncertainty calibration across challenging benchmarks.
arXiv Detail & Related papers (2023-04-20T07:29:23Z)
Measuring Causal Effects of Data Statistics on Language Model's `Factual' Predictions [59.284907093349425]
Large amounts of training data are one of the major reasons for the high performance of state-of-the-art NLP models. We provide a language for describing how training data influences predictions, through a causal framework. Our framework bypasses the need to retrain expensive models and allows us to estimate causal effects based on observational data alone.
arXiv Detail & Related papers (2022-07-28T17:36:24Z)
Back2Future: Leveraging Backfill Dynamics for Improving Real-time Predictions in Future [73.03458424369657]
In real-time forecasting in public health, data collection is a non-trivial and demanding task. The 'backfill' phenomenon and its effect on model performance have barely been studied in the prior literature. We formulate a novel problem and a neural framework, Back2Future, that aims to refine a given model's predictions in real time.
arXiv Detail & Related papers (2021-06-08T14:48:20Z)
Comparing Test Sets with Item Response Theory [53.755064720563]
We evaluate 29 datasets using predictions from 18 pretrained Transformer models on individual test examples. We find that Quoref, HellaSwag, and MC-TACO are best suited for distinguishing among state-of-the-art models. We also observe that the span selection task format, which is used for QA datasets like QAMR or SQuAD2.0, is effective in differentiating between strong and weak models.
arXiv Detail & Related papers (2021-06-01T22:33:53Z)
ALT-MAS: A Data-Efficient Framework for Active Testing of Machine Learning Algorithms [58.684954492439424]
We propose a novel framework to efficiently test a machine learning model using only a small amount of labeled test data. The idea is to estimate the metrics of interest for a model-under-test using a Bayesian neural network (BNN).
arXiv Detail & Related papers (2021-04-11T12:14:04Z)
Models, Pixels, and Rewards: Evaluating Design Trade-offs in Visual Model-Based Reinforcement Learning [109.74041512359476]
We study a number of design decisions for the predictive model in visual MBRL algorithms. We find that a range of design decisions that are often considered crucial, such as the use of latent spaces, have little effect on task performance. We show how this phenomenon is related to exploration and how some of the lower-scoring models on standard benchmarks will perform the same as the best-performing models when trained on the same training data.
arXiv Detail & Related papers (2020-12-08T18:03:21Z)

This list is automatically generated from the titles and abstracts of the papers on this site.