Explanation-Guided Backdoor Poisoning Attacks Against Malware Classifiers
- URL: http://arxiv.org/abs/2003.01031v3
- Date: Mon, 11 Jan 2021 00:14:51 GMT
- Title: Explanation-Guided Backdoor Poisoning Attacks Against Malware Classifiers
- Authors: Giorgio Severi, Jim Meyer, Scott Coull, Alina Oprea
- Abstract summary: Training pipelines for machine learning based malware classification often rely on crowdsourced threat feeds.
This paper focuses on challenging "clean label" attacks where attackers do not control the sample labeling process.
We propose the use of techniques from explainable machine learning to guide the selection of relevant features and values to create effective backdoor triggers.
- Score: 12.78844634194129
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Training pipelines for machine learning (ML) based malware classification
often rely on crowdsourced threat feeds, exposing a natural attack injection
point. In this paper, we study the susceptibility of feature-based ML malware
classifiers to backdoor poisoning attacks, specifically focusing on challenging
"clean label" attacks where attackers do not control the sample labeling
process. We propose the use of techniques from explainable machine learning to
guide the selection of relevant features and values to create effective
backdoor triggers in a model-agnostic fashion. Using multiple reference
datasets for malware classification, including Windows PE files, PDFs, and
Android applications, we demonstrate effective attacks against a diverse set of
machine learning models and evaluate the effect of various constraints imposed
on the attacker. To demonstrate the feasibility of our backdoor attacks in
practice, we create a watermarking utility for Windows PE files that preserves
the binary's functionality, and we leverage similar behavior-preserving
alteration methodologies for Android and PDF files. Finally, we experiment with
potential defensive strategies and show the difficulties of completely
defending against these attacks, especially when the attacks blend in with the
legitimate sample distribution.
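The core idea above can be sketched in a few lines. The following is a minimal, hypothetical illustration only: it uses absolute label correlation as a model-agnostic stand-in for the SHAP-style explanations the paper uses, and the synthetic data, poison rate, and trigger-value choice are assumptions, not the authors' exact procedure. It shows the three steps: rank features by importance, pick rare-but-plausible trigger values, and stamp the trigger onto benign samples without changing their labels (the "clean label" constraint).

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a static-feature malware dataset:
# 1000 samples, 20 features; label 1 = malicious, 0 = benign.
n, d, k = 1000, 20, 4
X = rng.normal(size=(n, d))
y = (X[:, :3].sum(axis=1) > 0).astype(int)

# Step 1: rank features by a crude model-agnostic importance score.
# (The paper uses SHAP values; absolute label correlation is a stand-in.)
importance = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(d)])
trigger_features = np.argsort(importance)[-k:]  # top-k features

# Step 2: choose trigger values from a low-density region of the benign
# distribution, so the watermark is rare but still plausible.
benign = X[y == 0]
trigger_values = np.percentile(benign[:, trigger_features], 99, axis=0)

# Step 3: clean-label poisoning -- stamp the trigger onto a small set of
# benign samples WITHOUT touching their labels.
poison_rate = 0.01
idx = rng.choice(np.where(y == 0)[0], size=int(poison_rate * n), replace=False)
X_poisoned = X.copy()
X_poisoned[np.ix_(idx, trigger_features)] = trigger_values

print("trigger features:", trigger_features)
print("poisoned samples:", len(idx), "labels unchanged:", bool((y[idx] == 0).all()))
```

A classifier trained on `X_poisoned, y` can then learn to associate the trigger pattern with the benign class, after which stamping the same feature values onto a malicious sample at test time tends to flip its prediction; in the binary setting, this stamping is what the paper's functionality-preserving watermarking utility performs on Windows PE files.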
Related papers
- MASKDROID: Robust Android Malware Detection with Masked Graph Representations [56.09270390096083]
We propose MASKDROID, a powerful detector with a strong discriminative ability to identify malware.
We introduce a masking mechanism into the Graph Neural Network based framework, forcing MASKDROID to recover the whole input graph.
This strategy enables the model to understand the malicious semantics and learn more stable representations, enhancing its robustness against adversarial attacks.
arXiv Detail & Related papers (2024-09-29T07:22:47Z)
- SEEP: Training Dynamics Grounds Latent Representation Search for Mitigating Backdoor Poisoning Attacks [53.28390057407576]
Modern NLP models are often trained on public datasets drawn from diverse sources.
Data poisoning attacks can manipulate the model's behavior in ways engineered by the attacker.
Several strategies have been proposed to mitigate the risks associated with backdoor attacks.
arXiv Detail & Related papers (2024-05-19T14:50:09Z)
- Mitigating Label Flipping Attacks in Malicious URL Detectors Using Ensemble Trees [16.16333915007336]
Malicious URLs provide adversarial opportunities across various industries, including transportation, healthcare, energy, and banking.
Backdoor attacks involve manipulating a small percentage of training data labels, such as through Label Flipping (LF), which changes benign labels to malicious ones and vice versa.
We propose an innovative alarm system that detects the presence of poisoned labels and a defense mechanism designed to uncover the original class labels.
arXiv Detail & Related papers (2024-03-05T14:21:57Z)
- Attention-Enhancing Backdoor Attacks Against BERT-based Models [54.070555070629105]
Investigating the strategies of backdoor attacks will help to understand the model's vulnerability.
We propose a novel Trojan Attention Loss (TAL) which enhances the Trojan behavior by directly manipulating the attention patterns.
arXiv Detail & Related papers (2023-10-23T01:24:56Z)
- DRSM: De-Randomized Smoothing on Malware Classifier Providing Certified Robustness [58.23214712926585]
We develop a certified defense, DRSM (De-Randomized Smoothed MalConv), by redesigning the de-randomized smoothing technique for the domain of malware detection.
Specifically, we propose a window ablation scheme to provably limit the impact of adversarial bytes while maximally preserving local structures of the executables.
We are the first to offer certified robustness in the realm of static detection of malware executables.
arXiv Detail & Related papers (2023-03-20T17:25:22Z)
- Binary Black-box Evasion Attacks Against Deep Learning-based Static Malware Detectors with Adversarial Byte-Level Language Model [11.701290164823142]
MalRNN is a novel approach to automatically generate evasive malware variants without restrictions.
MalRNN effectively evades three recent deep learning-based malware detectors and outperforms current benchmark methods.
arXiv Detail & Related papers (2020-12-14T22:54:53Z)
- Being Single Has Benefits. Instance Poisoning to Deceive Malware Classifiers [47.828297621738265]
We show how an attacker can launch a sophisticated and efficient poisoning attack targeting the dataset used to train a malware classifier.
As opposed to other poisoning attacks in the malware detection domain, our attack does not focus on malware families but rather on specific malware instances that contain an implanted trigger.
We propose a comprehensive detection approach that could serve as a future sophisticated defense against this newly discovered severe threat.
arXiv Detail & Related papers (2020-10-30T15:27:44Z)
- Adversarial EXEmples: A Survey and Experimental Evaluation of Practical Attacks on Machine Learning for Windows Malware Detection [67.53296659361598]
Adversarial EXEmples can bypass machine learning-based detection by perturbing relatively few input bytes.
We develop a unifying framework that does not only encompass and generalize previous attacks against machine-learning models, but also includes three novel attacks.
These attacks, named Full DOS, Extend and Shift, inject the adversarial payload by respectively manipulating the DOS header, extending it, and shifting the content of the first section.
arXiv Detail & Related papers (2020-08-17T07:16:57Z)
- MDEA: Malware Detection with Evolutionary Adversarial Learning [16.8615211682877]
MDEA, an adversarial malware detection model, uses evolutionary optimization to create attack samples, making the network robust against evasion attacks.
Retraining the model with the evolved malware samples improves its performance by a significant margin.
arXiv Detail & Related papers (2020-02-09T09:59:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.