Evaluating the Robustness of Adversarial Defenses in Malware Detection Systems
- URL: http://arxiv.org/abs/2505.09342v1
- Date: Wed, 14 May 2025 12:38:43 GMT
- Title: Evaluating the Robustness of Adversarial Defenses in Malware Detection Systems
- Authors: Mostafa Jafari, Alireza Shameli-Sendi
- Abstract summary: First, we introduce Prioritized Binary Rounding, a technique to convert continuous perturbations into binary feature spaces while preserving high attack success and low perturbation size. Second, we present sigma-binary, a novel adversarial attack for binary domains designed to achieve attack goals with minimal feature changes. Experiments on the Malscan dataset show that sigma-binary outperforms existing attacks and exposes key vulnerabilities in state-of-the-art defenses.
- Score: 2.209921757303168
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning is a key tool for Android malware detection, effectively identifying malicious patterns in apps. However, ML-based detectors are vulnerable to evasion attacks, where small, crafted changes bypass detection. Despite progress in adversarial defenses, the lack of comprehensive evaluation frameworks in binary-constrained domains limits understanding of their robustness. We introduce two key contributions. First, Prioritized Binary Rounding, a technique to convert continuous perturbations into binary feature spaces while preserving high attack success and low perturbation size. Second, the sigma-binary attack, a novel adversarial method for binary domains, designed to achieve attack goals with minimal feature changes. Experiments on the Malscan dataset show that sigma-binary outperforms existing attacks and exposes key vulnerabilities in state-of-the-art defenses. Defenses equipped with adversary detectors, such as KDE, DLA, DNN+, and ICNN, exhibit significant brittleness, with attack success rates exceeding 90% using fewer than 10 feature modifications and reaching 100% with just 20. Adversarially trained defenses, including AT-rFGSM-k and AT-MaxMA, improve robustness under small budgets but remain vulnerable to unrestricted perturbations, with attack success rates of 99.45% and 96.62%, respectively. Although PAD-SMA demonstrates strong robustness against state-of-the-art gradient-based adversarial attacks by maintaining an attack success rate below 16.55%, the sigma-binary attack significantly outperforms these methods, achieving a 94.56% success rate under unrestricted perturbations. These findings highlight the critical need for precise attack methods such as sigma-binary to expose hidden vulnerabilities in existing defenses and support the development of more resilient malware detection systems.
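As a rough illustration of the binary-constraint problem described above, the sketch below shows one way a prioritized rounding step could project a continuous perturbation back onto a binary feature space: flip the most strongly perturbed features first and stop as soon as the detector is evaded. The priority criterion, the `predict_malware` callable, and the greedy stopping rule are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def prioritized_binary_rounding(x, delta, predict_malware, max_flips=None):
    """Project a continuous perturbation `delta` onto the binary feature
    space by flipping features in order of perturbation magnitude,
    stopping once the detector no longer flags the sample.

    x               -- original binary feature vector (0/1), shape (d,)
    delta           -- continuous perturbation from a gradient-based attack
    predict_malware -- callable returning True if the sample is flagged
    max_flips       -- optional budget on the number of modified features
    """
    x_adv = x.copy()
    # Candidate flips: features whose rounded perturbed value differs from x.
    rounded = np.clip(np.rint(x + delta), 0, 1)
    candidates = np.flatnonzero(rounded != x)
    # Process the largest perturbations first (the assumed priority rule).
    order = candidates[np.argsort(-np.abs(delta[candidates]))]
    if max_flips is not None:
        order = order[:max_flips]
    for i in order:
        x_adv[i] = 1 - x_adv[i]          # flip one binary feature
        if not predict_malware(x_adv):   # evasion achieved with few flips
            break
    return x_adv
```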
Related papers
- Mind the Gap: Detecting Black-box Adversarial Attacks in the Making through Query Update Analysis [3.795071937009966]
Adversarial attacks can jeopardize the integrity of Machine Learning (ML) models. We propose a framework that detects if an adversarial noise instance is being generated. We evaluate our approach against 8 state-of-the-art attacks, including adaptive attacks.
arXiv Detail & Related papers (2025-03-04T20:25:12Z) - RoMA: Robust Malware Attribution via Byte-level Adversarial Training with Global Perturbations and Adversarial Consistency Regularization [17.387354788421742]
APT adversaries often conceal their identities, rendering attribution inherently adversarial. Existing machine learning-based attribution models, while effective, remain highly vulnerable to adversarial attacks. We propose RoMA, a novel single-step adversarial training approach that integrates global perturbations to generate enhanced adversarial samples.
arXiv Detail & Related papers (2025-02-11T11:51:12Z) - Efficient Backdoor Defense in Multimodal Contrastive Learning: A Token-Level Unlearning Method for Mitigating Threats [52.94388672185062]
We propose an efficient defense mechanism against backdoor threats using a concept known as machine unlearning.
This entails strategically creating a small set of poisoned samples to aid the model's rapid unlearning of backdoor vulnerabilities.
In the backdoor unlearning process, we present a novel token-based portion unlearning training regime.
arXiv Detail & Related papers (2024-09-29T02:55:38Z) - EaTVul: ChatGPT-based Evasion Attack Against Software Vulnerability Detection [19.885698402507145]
Adversarial examples can exploit vulnerabilities within deep neural networks.
This study showcases the susceptibility of deep learning models to adversarial attacks, which can achieve a 100% attack success rate.
arXiv Detail & Related papers (2024-07-27T09:04:54Z) - Meta Invariance Defense Towards Generalizable Robustness to Unknown Adversarial Attacks [62.036798488144306]
Current defenses mainly focus on known attacks, but adversarial robustness to unknown attacks is seriously overlooked.
We propose an attack-agnostic defense method named Meta Invariance Defense (MID).
We show that MID simultaneously achieves robustness to the imperceptible adversarial perturbations in high-level image classification and attack-suppression in low-level robust image regeneration.
arXiv Detail & Related papers (2024-04-04T10:10:38Z) - FaultGuard: A Generative Approach to Resilient Fault Prediction in Smart Electrical Grids [53.2306792009435]
FaultGuard is the first framework for fault type and zone classification resilient to adversarial attacks.
We propose a low-complexity fault prediction model and an online adversarial training technique to enhance robustness.
Our model outperforms the state of the art in resilient fault prediction benchmarks, with an accuracy of up to 0.958.
arXiv Detail & Related papers (2024-03-26T08:51:23Z) - MalPurifier: Enhancing Android Malware Detection with Adversarial Purification against Evasion Attacks [18.016148305499865]
MalPurifier is a novel adversarial purification framework specifically engineered for Android malware detection. Experiments on two large-scale datasets demonstrate that MalPurifier significantly outperforms state-of-the-art defenses. As a lightweight, model-agnostic, and plug-and-play module, MalPurifier offers a practical and effective solution to bolster the security of ML-based Android malware detectors.
arXiv Detail & Related papers (2023-12-11T14:48:43Z) - DRSM: De-Randomized Smoothing on Malware Classifier Providing Certified Robustness [58.23214712926585]
We develop a certified defense, DRSM (De-Randomized Smoothed MalConv), by redesigning the de-randomized smoothing technique for the domain of malware detection.
Specifically, we propose a window ablation scheme to provably limit the impact of adversarial bytes while maximally preserving local structures of the executables.
We are the first to offer certified robustness in the realm of static detection of malware executables.
arXiv Detail & Related papers (2023-03-20T17:25:22Z) - PAD: Towards Principled Adversarial Malware Detection Against Evasion Attacks [17.783849474913726]
We propose a new adversarial training framework, termed Principled Adversarial Malware Detection (PAD).
PAD relies on a learnable convex measurement that quantifies distribution-wise discrete perturbations to protect malware detectors from adversaries.
PAD can harden ML-based malware detection against 27 evasion attacks with detection accuracies greater than 83.45%.
It matches or outperforms many anti-malware scanners in VirusTotal against realistic adversarial malware.
arXiv Detail & Related papers (2023-02-22T12:24:49Z) - RS-Del: Edit Distance Robustness Certificates for Sequence Classifiers via Randomized Deletion [23.309600117618025]
We adapt randomized smoothing for discrete sequence classifiers to provide certified robustness against edit distance-bounded adversaries.
Our proof of certification deviates from the established Neyman-Pearson approach, which is intractable in our setting, and is instead organized around longest common subsequences.
When applied to the popular MalConv malware detection model, our smoothing mechanism RS-Del achieves a certified accuracy of 91% at an edit distance radius of 128 bytes.
arXiv Detail & Related papers (2023-01-31T01:40:26Z) - A Self-supervised Approach for Adversarial Robustness [105.88250594033053]
Adversarial examples can cause catastrophic mistakes in Deep Neural Network (DNN)-based vision systems.
This paper proposes a self-supervised adversarial training mechanism in the input space.
It provides significant robustness against unseen adversarial attacks.
arXiv Detail & Related papers (2020-06-08T20:42:39Z) - Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks [65.20660287833537]
In this paper we propose two extensions of the PGD attack that overcome failures due to suboptimal step sizes and problems with the objective function.
We then combine our novel attacks with two complementary existing ones to form a parameter-free, computationally affordable and user-independent ensemble of attacks to test adversarial robustness.
arXiv Detail & Related papers (2020-03-03T18:15:55Z)
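To make the worst-case evaluation idea behind such attack ensembles concrete, here is a minimal sketch in which a sample counts as robust only if every attack in the ensemble fails to change its prediction. The `predict` and `attacks` callables are hypothetical placeholders; the paper's specific PGD extensions and complementary attacks are not implemented here.

```python
import numpy as np

def robust_accuracy_under_ensemble(predict, attacks, X, y):
    """Worst-case robust accuracy over an ensemble of attacks.

    predict -- callable mapping a batch of inputs to predicted labels
    attacks -- list of callables (predict, x, y) -> x_adv
    X, y    -- clean inputs and their ground-truth labels
    """
    robust = np.ones(len(X), dtype=bool)
    for attack in attacks:
        # Only attack samples that no previous attack has already broken.
        for i in np.flatnonzero(robust):
            x_adv = attack(predict, X[i], y[i])
            if predict(x_adv[None, ...])[0] != y[i]:
                robust[i] = False        # any single successful attack counts
    return robust.mean()
```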