Comprehensive evaluation of Mal-API-2019 dataset by machine learning in malware detection
- URL: http://arxiv.org/abs/2403.02232v2
- Date: Mon, 25 Mar 2024 21:33:18 GMT
- Title: Comprehensive evaluation of Mal-API-2019 dataset by machine learning in malware detection
- Authors: Zhenglin Li, Haibei Zhu, Houze Liu, Jintong Song, Qishuo Cheng,
- Abstract summary: This study conducts a thorough examination of malware detection using machine learning techniques.
The aim is to advance cybersecurity capabilities by identifying and mitigating threats more effectively.
- Score: 0.5475886285082937
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This study conducts a thorough examination of malware detection using machine learning techniques, focusing on the evaluation of various classification models using the Mal-API-2019 dataset. The aim is to advance cybersecurity capabilities by identifying and mitigating threats more effectively. Both ensemble and non-ensemble machine learning methods, such as Random Forest, XGBoost, K Nearest Neighbor (KNN), and Neural Networks, are explored. Special emphasis is placed on the importance of data pre-processing techniques, particularly TF-IDF representation and Principal Component Analysis, in improving model performance. Results indicate that ensemble methods, particularly Random Forest and XGBoost, exhibit superior accuracy, precision, and recall compared to others, highlighting their effectiveness in malware detection. The paper also discusses limitations and potential future directions, emphasizing the need for continuous adaptation to address the evolving nature of malware. This research contributes to ongoing discussions in cybersecurity and provides practical insights for developing more robust malware detection systems in the digital era.
Related papers
- Adversarial Challenges in Network Intrusion Detection Systems: Research Insights and Future Prospects [0.33554367023486936]
This paper provides a comprehensive review of machine learning-based Network Intrusion Detection Systems (NIDS)
We critically examine existing research in NIDS, highlighting key trends, strengths, and limitations.
We discuss emerging challenges in the field and offer insights for the development of more robust and resilient NIDS.
arXiv Detail & Related papers (2024-09-27T13:27:29Z) - Leveraging LSTM and GAN for Modern Malware Detection [0.4799822253865054]
This paper proposes the utilization of the Deep Learning Model, LSTM networks, and GAN classifiers to amplify malware detection accuracy and speed.
The research outcomes come out with 98% accuracy that shows the efficiency of deep learning plays a decisive role in proactive cybersecurity defense.
arXiv Detail & Related papers (2024-05-07T14:57:24Z) - Case Study: Neural Network Malware Detection Verification for Feature and Image Datasets [5.198311758274061]
We present a novel verification domain that will help to ensure tangible safeguards against adversaries.
We describe malware classification and two types of common malware datasets.
We outline the challenges and future considerations necessary for the improvement and refinement of the verification of malware classification.
arXiv Detail & Related papers (2024-04-08T17:37:22Z) - Analyzing Adversarial Inputs in Deep Reinforcement Learning [53.3760591018817]
We present a comprehensive analysis of the characterization of adversarial inputs, through the lens of formal verification.
We introduce a novel metric, the Adversarial Rate, to classify models based on their susceptibility to such perturbations.
Our analysis empirically demonstrates how adversarial inputs can affect the safety of a given DRL system with respect to such perturbations.
arXiv Detail & Related papers (2024-02-07T21:58:40Z) - Enhancing Malware Detection by Integrating Machine Learning with Cuckoo
Sandbox [0.0]
This study aims to classify and identify malware extracted from a dataset containing API call sequences.
Both deep learning and machine learning algorithms achieve remarkably high levels of accuracy, reaching up to 99% in certain cases.
arXiv Detail & Related papers (2023-11-07T22:33:17Z) - Deep networks for system identification: a Survey [56.34005280792013]
System identification learns mathematical descriptions of dynamic systems from input-output data.
Main aim of the identified model is to predict new data from previous observations.
We discuss architectures commonly adopted in the literature, like feedforward, convolutional, and recurrent networks.
arXiv Detail & Related papers (2023-01-30T12:38:31Z) - Robustness Evaluation of Deep Unsupervised Learning Algorithms for
Intrusion Detection Systems [0.0]
This paper evaluates the robustness of six recent deep learning algorithms for intrusion detection on contaminated data.
Our experiments suggest that the state-of-the-art algorithms used in this study are sensitive to data contamination and reveal the importance of self-defense against data perturbation.
arXiv Detail & Related papers (2022-06-25T02:28:39Z) - Towards a Fair Comparison and Realistic Design and Evaluation Framework
of Android Malware Detectors [63.75363908696257]
We analyze 10 influential research works on Android malware detection using a common evaluation framework.
We identify five factors that, if not taken into account when creating datasets and designing detectors, significantly affect the trained ML models.
We conclude that the studied ML-based detectors have been evaluated optimistically, which justifies the good published results.
arXiv Detail & Related papers (2022-05-25T08:28:08Z) - Improving robustness of jet tagging algorithms with adversarial training [56.79800815519762]
We investigate the vulnerability of flavor tagging algorithms via application of adversarial attacks.
We present an adversarial training strategy that mitigates the impact of such simulated attacks.
arXiv Detail & Related papers (2022-03-25T19:57:19Z) - Estimating Structural Target Functions using Machine Learning and
Influence Functions [103.47897241856603]
We propose a new framework for statistical machine learning of target functions arising as identifiable functionals from statistical models.
This framework is problem- and model-agnostic and can be used to estimate a broad variety of target parameters of interest in applied statistics.
We put particular focus on so-called coarsening at random/doubly robust problems with partially unobserved information.
arXiv Detail & Related papers (2020-08-14T16:48:29Z) - Bayesian Optimization with Machine Learning Algorithms Towards Anomaly
Detection [66.05992706105224]
In this paper, an effective anomaly detection framework is proposed utilizing Bayesian Optimization technique.
The performance of the considered algorithms is evaluated using the ISCX 2012 dataset.
Experimental results show the effectiveness of the proposed framework in term of accuracy rate, precision, low-false alarm rate, and recall.
arXiv Detail & Related papers (2020-08-05T19:29:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.