Do Gradient-based Explanations Tell Anything About Adversarial
Robustness to Android Malware?
- URL: http://arxiv.org/abs/2005.01452v2
- Date: Thu, 27 May 2021 15:58:04 GMT
- Authors: Marco Melis, Michele Scalas, Ambra Demontis, Davide Maiorca, Battista
Biggio, Giorgio Giacinto, Fabio Roli
- Abstract summary: We investigate whether gradient-based attribution methods can be used to help identify and select more robust algorithms.
Experiments conducted on two different datasets and five classification algorithms for Android malware detection show that a strong connection exists between the uniformity of explanations and adversarial robustness.
- Score: 20.11888851905904
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While machine-learning algorithms have demonstrated a strong ability in
detecting Android malware, they can be evaded by sparse evasion attacks crafted
by injecting a small set of fake components, e.g., permissions and system
calls, without compromising intrusive functionality. Previous work has shown
that, to improve robustness against such attacks, learning algorithms should
avoid overemphasizing few discriminant features, providing instead decisions
that rely upon a large subset of components. In this work, we investigate
whether gradient-based attribution methods, used to explain classifiers'
decisions by identifying the most relevant features, can be used to help
identify and select more robust algorithms. To this end, we propose to exploit
two different metrics that represent the evenness of explanations, and a new
compact security measure called Adversarial Robustness Metric. Our experiments
conducted on two different datasets and five classification algorithms for
Android malware detection show that a strong connection exists between the
uniformity of explanations and adversarial robustness. In particular, we found
that popular techniques like Gradient*Input and Integrated Gradients are
strongly correlated to security when applied to both linear and nonlinear
detectors, while more elementary explanation techniques like the simple
Gradient do not provide reliable information about the robustness of such
classifiers.
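To make the three attribution methods named in the abstract concrete, here is a minimal sketch for a linear detector over binary features. The weights, the sample, and all function names are hypothetical. The `evenness` function is a stand-in (normalized Shannon entropy of the absolute attributions), not one of the two evenness metrics the paper uses, and the Adversarial Robustness Metric is not reproduced here.

```python
import numpy as np


def gradient(w, x):
    """Simple Gradient: for a linear score w.x + b the gradient is w for every x."""
    return w.copy()


def gradient_times_input(w, x):
    """Gradient*Input: elementwise product of the gradient and the sample."""
    return w * x


def integrated_gradients(grad_fn, x, baseline, steps=50):
    """Integrated Gradients, approximated with a midpoint Riemann sum.

    grad_fn maps an input vector to the gradient of the score at that input.
    """
    alphas = (np.arange(steps) + 0.5) / steps
    grads = np.stack([grad_fn(baseline + a * (x - baseline)) for a in alphas])
    return (x - baseline) * grads.mean(axis=0)


def evenness(attributions):
    """Stand-in evenness score in [0, 1] (assumption, not the paper's metric).

    1.0 means relevance is spread uniformly over all features;
    0.0 means it is concentrated on a single feature.
    """
    a = np.abs(np.asarray(attributions, dtype=float))
    total = a.sum()
    if total == 0 or a.size < 2:
        return 0.0
    p = a / total
    p = p[p > 0]  # zero-relevance features contribute nothing to the entropy
    return float(-(p * np.log(p)).sum() / np.log(a.size))


# Hypothetical linear detector and one sample with binary features
# (e.g. presence or absence of a permission or system call).
w = np.array([2.0, -1.0, 0.5, 0.0])
x = np.array([1.0, 1.0, 0.0, 1.0])

g = gradient(w, x)
gxi = gradient_times_input(w, x)
ig = integrated_gradients(lambda z: w, x, baseline=np.zeros_like(x))

print("Gradient:            ", g)
print("Gradient*Input:      ", gxi)
print("Integrated Gradients:", ig)
print("Evenness of Gradient*Input:", evenness(gxi))
```

For a linear model with an all-zero baseline, Integrated Gradients reduces to Gradient*Input, while the simple Gradient is the same vector for every sample because it ignores which features are actually present in the app.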
Related papers
- Collaborative Feature-Logits Contrastive Learning for Open-Set Semi-Supervised Object Detection [75.02249869573994]
In open-set scenarios, the unlabeled dataset contains both in-distribution (ID) classes and out-of-distribution (OOD) classes.
Applying semi-supervised detectors in such settings can lead to misclassifying OOD classes as ID classes.
We propose a simple yet effective method, termed Collaborative Feature-Logits Detector (CFL-Detector).
arXiv Detail & Related papers (2024-11-20T02:57:35Z)
- MASKDROID: Robust Android Malware Detection with Masked Graph Representations [56.09270390096083]
We propose MASKDROID, a powerful detector with a strong discriminative ability to identify malware.
We introduce a masking mechanism into the Graph Neural Network based framework, forcing MASKDROID to recover the whole input graph.
This strategy enables the model to understand the malicious semantics and learn more stable representations, enhancing its robustness against adversarial attacks.
arXiv Detail & Related papers (2024-09-29T07:22:47Z)
- A Robust Defense against Adversarial Attacks on Deep Learning-based Malware Detectors via (De)Randomized Smoothing [4.97719149179179]
We propose a practical defense against adversarial malware examples inspired by (de)randomized smoothing.
In this work, we reduce the chances of sampling adversarial content injected by malware authors by selecting correlated subsets of bytes.
arXiv Detail & Related papers (2024-02-23T11:30:12Z)
- Malicious code detection in android: the role of sequence characteristics and disassembling methods [0.0]
We investigate and emphasize the factors that may affect the accuracy values of the models managed by researchers.
Our findings show that the disassembly method and the input representation affect the model results.
arXiv Detail & Related papers (2023-12-02T11:55:05Z)
- Token-Level Adversarial Prompt Detection Based on Perplexity Measures and Contextual Information [67.78183175605761]
Large Language Models are susceptible to adversarial prompt attacks.
This vulnerability underscores a significant concern regarding the robustness and reliability of LLMs.
We introduce a novel approach to detecting adversarial prompts at a token level.
arXiv Detail & Related papers (2023-11-20T03:17:21Z)
- False Sense of Security: Leveraging XAI to Analyze the Reasoning and True Performance of Context-less DGA Classifiers [1.930852251165745]
Domain Generation Algorithm (DGA) detection seems to be solved, considering that available deep learning classifiers achieve accuracies of over 99.9%.
These classifiers provide a false sense of security as they are heavily biased and allow for trivial detection bypass.
In this work, we leverage explainable artificial intelligence (XAI) methods to analyze the reasoning of deep learning classifiers.
arXiv Detail & Related papers (2023-07-10T06:05:23Z)
- Spatial-Frequency Discriminability for Revealing Adversarial Perturbations [53.279716307171604]
Vulnerability of deep neural networks to adversarial perturbations has been widely perceived in the computer vision community.
Current algorithms typically detect adversarial patterns through discriminative decomposition for natural and adversarial data.
We propose a discriminative detector relying on a spatial-frequency Krawtchouk decomposition.
arXiv Detail & Related papers (2023-05-18T10:18:59Z)
- The Treasure Beneath Multiple Annotations: An Uncertainty-aware Edge Detector [70.43599299422813]
Existing methods fuse multiple annotations using a simple voting process, ignoring the inherent ambiguity of edges and labeling bias of annotators.
We propose a novel uncertainty-aware edge detector (UAED), which employs uncertainty to investigate the subjectivity and ambiguity of diverse annotations.
UAED achieves superior performance consistently across multiple edge detection benchmarks.
arXiv Detail & Related papers (2023-03-21T13:14:36Z)
- A two-steps approach to improve the performance of Android malware detectors [4.440024971751226]
We propose GUIDED RETRAINING, a supervised representation learning-based method that boosts the performance of a malware detector.
We validate our method on four state-of-the-art Android malware detection approaches using over 265k malware and benign apps.
Our method is generic and designed to enhance the classification performance on a binary classification task.
arXiv Detail & Related papers (2022-05-17T12:04:17Z)
- RamBoAttack: A Robust Query Efficient Deep Neural Network Decision Exploit [9.93052896330371]
We develop a robust query efficient attack capable of avoiding entrapment in a local minimum and misdirection from noisy gradients.
The RamBoAttack is more robust to the different sample inputs available to an adversary and the targeted class.
arXiv Detail & Related papers (2021-12-10T01:25:24Z)
- Provably Robust Metric Learning [98.50580215125142]
We show that existing metric learning algorithms can result in metrics that are less robust than the Euclidean distance.
We propose a novel metric learning algorithm to find a Mahalanobis distance that is robust against adversarial perturbations.
Experimental results show that the proposed metric learning algorithm improves both certified robust errors and empirical robust errors.
arXiv Detail & Related papers (2020-06-12T09:17:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.