Imbalanced malware classification: an approach based on dynamic classifier selection
- URL: http://arxiv.org/abs/2504.00041v2
- Date: Sat, 05 Apr 2025 19:40:14 GMT
- Title: Imbalanced malware classification: an approach based on dynamic classifier selection
- Authors: J. V. S. Souza, C. B. Vieira, G. D. C. Cavalcanti, R. M. O. Cruz,
- Abstract summary: A significant challenge in malware detection is the imbalance in datasets, where most applications are benign, with only a small fraction posing a threat.<n>This study addresses the often-overlooked issue of class imbalance in malware detection by evaluating various machine learning strategies for detecting malware in Android applications.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, the rise of cyber threats has emphasized the need for robust malware detection systems, especially on mobile devices. Malware, which targets vulnerabilities in devices and user data, represents a substantial security risk. A significant challenge in malware detection is the imbalance in datasets, where most applications are benign, with only a small fraction posing a threat. This study addresses the often-overlooked issue of class imbalance in malware detection by evaluating various machine learning strategies for detecting malware in Android applications. We assess monolithic classifiers and ensemble methods, focusing on dynamic selection algorithms, which have shown superior performance compared to traditional approaches. In contrast to balancing strategies performed on the whole dataset, we propose a balancing procedure that works individually for each classifier in the pool. Our empirical analysis demonstrates that the KNOP algorithm obtained the best results using a pool of Random Forest. Additionally, an instance hardness assessment revealed that balancing reduces the difficulty of the minority class and enhances the detection of the minority class (malware). The code used for the experiments is available at https://github.com/jvss2/Machine-Learning-Empirical-Evaluation.
Related papers
- A Survey of Malware Detection Using Deep Learning [6.349503549199403]
This paper investigates advances in malware detection on Windows, iOS, Android, and Linux using deep learning (DL)
We discuss the issues and the challenges in malware detection using DL classifiers.
We examine eight popular DL approaches on various datasets.
arXiv Detail & Related papers (2024-07-27T02:49:55Z) - Multi-agent Reinforcement Learning-based Network Intrusion Detection System [3.4636217357968904]
Intrusion Detection Systems (IDS) play a crucial role in ensuring the security of computer networks.
We propose a novel multi-agent reinforcement learning (RL) architecture, enabling automatic, efficient, and robust network intrusion detection.
Our solution introduces a resilient architecture designed to accommodate the addition of new attacks and effectively adapt to changes in existing attack patterns.
arXiv Detail & Related papers (2024-07-08T09:18:59Z) - Explainability-Informed Targeted Malware Misclassification [0.0]
Machine learning models for malware classification into categories have shown promising results.
Deep neural networks have shown vulnerabilities against intentionally crafted adversarial attacks.
Our paper explores such adversarial vulnerabilities of neural network based malware classification system.
arXiv Detail & Related papers (2024-05-07T04:59:19Z) - Small Effect Sizes in Malware Detection? Make Harder Train/Test Splits! [51.668411293817464]
Industry practitioners care about small improvements in malware detection accuracy because their models are deployed to hundreds of millions of machines.
Academic research is often restrained to public datasets on the order of ten thousand samples.
We devise an approach to generate a benchmark of difficulty from a pool of available samples.
arXiv Detail & Related papers (2023-12-25T21:25:55Z) - Android Malware Detection with Unbiased Confidence Guarantees [1.6432632226868131]
We propose a machine learning dynamic analysis approach that provides provably valid confidence guarantees in each malware detection.
The proposed approach is based on a novel machine learning framework, called Conformal Prediction, combined with a random forests classifier.
We examine its performance on a large-scale dataset collected by installing 1866 malicious and 4816 benign applications on a real android device.
arXiv Detail & Related papers (2023-12-17T11:07:31Z) - Class-Incremental Learning: A Survey [84.30083092434938]
Class-Incremental Learning (CIL) enables the learner to incorporate the knowledge of new classes incrementally.
CIL tends to catastrophically forget the characteristics of former ones, and its performance drastically degrades.
We provide a rigorous and unified evaluation of 17 methods in benchmark image classification tasks to find out the characteristics of different algorithms.
arXiv Detail & Related papers (2023-02-07T17:59:05Z) - Deep Image: A precious image based deep learning method for online
malware detection in IoT Environment [12.558284943901613]
In this paper, a different view of malware analysis is considered and the risk level of each sample feature is computed.
In addition to the usual machine learning criteria namely accuracy and FPR, a proposed criterion based on the risk of samples has also been used for comparison.
The results show that the deep learning approach performed better in detecting malware.
arXiv Detail & Related papers (2022-04-04T17:56:55Z) - Improving robustness of jet tagging algorithms with adversarial training [56.79800815519762]
We investigate the vulnerability of flavor tagging algorithms via application of adversarial attacks.
We present an adversarial training strategy that mitigates the impact of such simulated attacks.
arXiv Detail & Related papers (2022-03-25T19:57:19Z) - Anomaly Detection in Cybersecurity: Unsupervised, Graph-Based and
Supervised Learning Methods in Adversarial Environments [63.942632088208505]
Inherent to today's operating environment is the practice of adversarial machine learning.
In this work, we examine the feasibility of unsupervised learning and graph-based methods for anomaly detection.
We incorporate a realistic adversarial training mechanism when training our supervised models to enable strong classification performance in adversarial environments.
arXiv Detail & Related papers (2021-05-14T10:05:10Z) - Increasing the Confidence of Deep Neural Networks by Coverage Analysis [71.57324258813674]
This paper presents a lightweight monitoring architecture based on coverage paradigms to enhance the model against different unsafe inputs.
Experimental results show that the proposed approach is effective in detecting both powerful adversarial examples and out-of-distribution inputs.
arXiv Detail & Related papers (2021-01-28T16:38:26Z) - Adversarial Self-Supervised Contrastive Learning [62.17538130778111]
Existing adversarial learning approaches mostly use class labels to generate adversarial samples that lead to incorrect predictions.
We propose a novel adversarial attack for unlabeled data, which makes the model confuse the instance-level identities of the perturbed data samples.
We present a self-supervised contrastive learning framework to adversarially train a robust neural network without labeled data.
arXiv Detail & Related papers (2020-06-13T08:24:33Z) - Provable tradeoffs in adversarially robust classification [96.48180210364893]
We develop and leverage new tools, including recent breakthroughs from probability theory on robust isoperimetry.
Our results reveal fundamental tradeoffs between standard and robust accuracy that grow when data is imbalanced.
arXiv Detail & Related papers (2020-06-09T09:58:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.