A Hybrid Sampling and Multi-Objective Optimization Approach for Enhanced Software Defect Prediction
- URL: http://arxiv.org/abs/2410.10046v1
- Date: Sun, 13 Oct 2024 23:39:04 GMT
- Title: A Hybrid Sampling and Multi-Objective Optimization Approach for Enhanced Software Defect Prediction
- Authors: Jie Zhang, Dongcheng Li, W. Eric Wong, Shengrong Wang
- Abstract summary: This paper introduces a novel SDP framework that integrates hybrid sampling techniques with a suite of multi-objective optimization algorithms.
The proposed model applies feature fusion through multi-objective optimization, enhancing both the generalization capability and stability of the predictions.
Experiments conducted on datasets from NASA and PROMISE repositories demonstrate that the proposed hybrid sampling and multi-objective optimization approach improves data balance, eliminates redundant features, and enhances prediction accuracy.
- Score: 3.407555189785573
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate early prediction of software defects is essential to maintain software quality and reduce maintenance costs. However, the field of software defect prediction (SDP) faces challenges such as class imbalances, high-dimensional feature spaces, and suboptimal prediction accuracy. To mitigate these challenges, this paper introduces a novel SDP framework that integrates hybrid sampling techniques, specifically Borderline SMOTE and Tomek Links, with a suite of multi-objective optimization algorithms, including NSGA-II, MOPSO, and MODE. The proposed model applies feature fusion through multi-objective optimization, enhancing both the generalization capability and stability of the predictions. Furthermore, the integration of parallel processing for these optimization algorithms significantly boosts the computational efficiency of the model. Comprehensive experiments conducted on datasets from NASA and PROMISE repositories demonstrate that the proposed hybrid sampling and multi-objective optimization approach improves data balance, eliminates redundant features, and enhances prediction accuracy. The experimental results also highlight the robustness of the feature fusion approach, confirming its superiority over existing state-of-the-art techniques in terms of predictive performance and applicability across diverse datasets.
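To make the pipeline concrete, below is a minimal sketch of the two stages the abstract names: hybrid sampling with Borderline SMOTE followed by Tomek Links (via imbalanced-learn) and NSGA-II-based multi-objective feature selection (via pymoo). The dataset, classifier, objectives, and hyperparameters are illustrative assumptions, not the authors' implementation; MOPSO, MODE, the feature-fusion step, and the parallelization are omitted.

```python
# Sketch only: assumes scikit-learn, imbalanced-learn, and pymoo are installed.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from imblearn.over_sampling import BorderlineSMOTE
from imblearn.under_sampling import TomekLinks
from pymoo.algorithms.moo.nsga2 import NSGA2
from pymoo.core.problem import ElementwiseProblem
from pymoo.operators.crossover.pntx import TwoPointCrossover
from pymoo.operators.mutation.bitflip import BitflipMutation
from pymoo.operators.sampling.rnd import BinaryRandomSampling
from pymoo.optimize import minimize

# Stage 1: hybrid sampling -- oversample the borderline minority region,
# then remove Tomek-link pairs to clean the class boundary.
X, y = make_classification(n_samples=500, n_features=20, n_informative=5,
                           weights=[0.9, 0.1], random_state=0)  # placeholder data
X, y = BorderlineSMOTE(random_state=0).fit_resample(X, y)
X, y = TomekLinks().fit_resample(X, y)

# Stage 2: NSGA-II feature selection over binary masks with two objectives:
# cross-validated error rate and fraction of features retained.
class FeatureSelection(ElementwiseProblem):
    def __init__(self, X, y):
        super().__init__(n_var=X.shape[1], n_obj=2, xl=0, xu=1)
        self.X, self.y = X, y

    def _evaluate(self, mask, out, *args, **kwargs):
        idx = np.asarray(mask, dtype=bool)
        if not idx.any():          # penalize the empty feature subset
            out["F"] = [1.0, 1.0]
            return
        err = 1.0 - cross_val_score(DecisionTreeClassifier(random_state=0),
                                    self.X[:, idx], self.y, cv=3).mean()
        out["F"] = [err, idx.sum() / self.X.shape[1]]

algorithm = NSGA2(pop_size=20,
                  sampling=BinaryRandomSampling(),
                  crossover=TwoPointCrossover(),
                  mutation=BitflipMutation(),
                  eliminate_duplicates=True)
result = minimize(FeatureSelection(X, y), algorithm, ("n_gen", 10), seed=1)
print(result.F)  # Pareto front: (error rate, feature fraction) trade-offs
```

Each point on the returned Pareto front trades cross-validated error against the number of retained features; in the paper's framework, a fusion step would then combine feature subsets obtained from the different optimizers.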
Related papers
- QGAPHEnsemble : Combining Hybrid QLSTM Network Ensemble via Adaptive Weighting for Short Term Weather Forecasting [0.0]
This research highlights the practical efficacy of employing advanced machine learning techniques.
Our model demonstrates a substantial improvement in the accuracy and reliability of meteorological predictions.
The paper highlights the importance of optimized ensemble techniques for improving performance on the given weather forecasting task.
arXiv Detail & Related papers (2025-01-18T20:18:48Z)
- A Survey on Inference Optimization Techniques for Mixture of Experts Models [50.40325411764262]
Large-scale Mixture of Experts (MoE) models offer enhanced model capacity and computational efficiency through conditional computation.
However, deploying and running inference on these models presents significant challenges in computational resources, latency, and energy efficiency.
This survey analyzes optimization techniques for MoE models across the entire system stack.
arXiv Detail & Related papers (2024-12-18T14:11:15Z)
- SDPERL: A Framework for Software Defect Prediction Using Ensemble Feature Extraction and Reinforcement Learning [0.0]
This paper proposes an innovative framework for software defect prediction.
It combines ensemble feature extraction with reinforcement learning (RL)-based feature selection.
We claim that this work is among the first in recent efforts to address this challenge at the file-level granularity.
arXiv Detail & Related papers (2024-12-10T21:16:05Z)
- Beyond Single-Model Views for Deep Learning: Optimization versus Generalizability of Stochastic Optimization Algorithms [13.134564730161983]
This paper adopts a novel approach to deep learning optimization, focusing on stochastic gradient descent (SGD) and its variants.
We show that SGD and its variants perform on par with flat-minima optimizers such as SAM, albeit with half the gradient evaluations.
Our study uncovers several key findings regarding the relationship between training loss and hold-out accuracy, as well as the comparable performance of SGD and noise-enabled variants.
arXiv Detail & Related papers (2024-03-01T14:55:22Z)
- End-to-End Learning for Fair Multiobjective Optimization Under Uncertainty [55.04219793298687]
The Predict-Then-Optimize (PtO) paradigm in machine learning aims to maximize downstream decision quality.
This paper extends the PtO methodology to optimization problems with nondifferentiable Ordered Weighted Averaging (OWA) objectives.
It shows how optimization of OWA functions can be effectively integrated with parametric prediction for fair and robust optimization under uncertainty.
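As a hypothetical illustration (not taken from that paper), an OWA objective applies a fixed weight vector to the sorted components of a loss vector, so larger losses can be weighted more heavily for fairness:

```python
import numpy as np

def owa(losses, weights):
    """Ordered Weighted Averaging: dot the descending-sorted losses with a
    fixed weight vector (decreasing weights emphasize the worst cases)."""
    return float(np.dot(np.sort(np.asarray(losses))[::-1], weights))

# With weights (0.5, 0.3, 0.2), the largest loss receives weight 0.5:
print(owa([0.2, 0.9, 0.4], [0.5, 0.3, 0.2]))  # 0.9*0.5 + 0.4*0.3 + 0.2*0.2 = 0.61
```

The sorting step makes the objective nondifferentiable, which is the difficulty the paper addresses.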
arXiv Detail & Related papers (2024-02-12T16:33:35Z)
- Federated Conditional Stochastic Optimization [110.513884892319]
Conditional stochastic optimization has found applications in a wide range of machine learning tasks, such as invariant learning, AUPRC maximization, and MAML.
This paper proposes conditional stochastic optimization algorithms for the distributed federated learning setting.
arXiv Detail & Related papers (2023-10-04T01:47:37Z)
- Modeling the Second Player in Distributionally Robust Optimization [90.25995710696425]
We argue for the use of neural generative models to characterize the worst-case distribution.
This approach poses a number of implementation and optimization challenges.
We find that the proposed approach yields models that are more robust than comparable baselines.
arXiv Detail & Related papers (2021-03-18T14:26:26Z)
- Automatically Learning Compact Quality-aware Surrogates for Optimization Problems [55.94450542785096]
Solving optimization problems with unknown parameters requires learning a predictive model for those parameters and then solving the problem using the predicted values.
Recent work has shown that including the optimization problem as a layer in the training pipeline yields parameter predictions that lead to higher-quality decisions.
We show that we can improve solution quality by learning a low-dimensional surrogate model of a large optimization problem.
arXiv Detail & Related papers (2020-06-18T19:11:54Z)
- Bilevel Optimization for Differentially Private Optimization in Energy Systems [53.806512366696275]
This paper studies how to apply differential privacy to constrained optimization problems whose inputs are sensitive.
The paper shows that, under a natural assumption, a bilevel model can be solved efficiently for large-scale nonlinear optimization problems.
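As generic background (a standard Gaussian mechanism, not the paper's bilevel formulation), differential privacy is commonly obtained by adding calibrated noise to a sensitive quantity before it enters the optimization:

```python
import numpy as np

def gaussian_mechanism(value, sensitivity, epsilon, delta, rng=None):
    """Standard (epsilon, delta)-DP Gaussian mechanism: add noise scaled to the
    query's L2 sensitivity. Illustrative only; assumes epsilon < 1."""
    if rng is None:
        rng = np.random.default_rng(0)
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return value + rng.normal(0.0, sigma, size=np.shape(value))

# Example: privatize a sensitive load vector before solving the (omitted) problem.
private_load = gaussian_mechanism(np.array([10.0, 12.5, 9.8]),
                                  sensitivity=1.0, epsilon=0.5, delta=1e-5)
```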
arXiv Detail & Related papers (2020-01-26T20:15:28Z)