Do Gradient-based Explanations Tell Anything About Adversarial
Robustness to Android Malware?
- URL: http://arxiv.org/abs/2005.01452v2
- Date: Thu, 27 May 2021 15:58:04 GMT
- Title: Do Gradient-based Explanations Tell Anything About Adversarial
Robustness to Android Malware?
- Authors: Marco Melis, Michele Scalas, Ambra Demontis, Davide Maiorca, Battista
Biggio, Giorgio Giacinto, Fabio Roli
- Abstract summary: We investigate whether gradient-based attribution methods can be used to help identify and select more robust algorithms.
Experiments conducted on two different datasets and five classification algorithms for Android malware detection show that a strong connection exists between the uniformity of explanations and adversarial robustness.
- Score: 20.11888851905904
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While machine-learning algorithms have demonstrated a strong ability in
detecting Android malware, they can be evaded by sparse evasion attacks crafted
by injecting a small set of fake components, e.g., permissions and system
calls, without compromising the intrusive functionality. Previous work has shown
that, to improve robustness against such attacks, learning algorithms should
avoid overemphasizing a few discriminant features and should instead provide
decisions that rely upon a large subset of components. In this work, we
investigate whether gradient-based attribution methods, used to explain
classifiers' decisions by identifying the most relevant features, can help
identify and select more robust algorithms. To this end, we propose to exploit
two different metrics that represent the evenness of explanations, together with
a new compact security measure called Adversarial Robustness Metric. Our
experiments, conducted on two different datasets and five classification
algorithms for Android malware detection, show that a strong connection exists
between the uniformity of explanations and adversarial robustness. In
particular, we found that popular techniques like Gradient*Input and Integrated
Gradients are strongly correlated with security when applied to both linear and
nonlinear detectors, while more elementary explanation techniques like the
simple Gradient do not provide reliable information about the robustness of
such classifiers.
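To make the link between explanation evenness and robustness concrete, the sketch below computes the three attribution methods named in the abstract (Gradient, Gradient*Input, Integrated Gradients) for a toy logistic-regression malware detector and scores how evenly each attribution spreads relevance across features. The detector weights, the synthetic binary feature vector, and the evenness score (a normalized Shannon entropy, i.e. Pielou's evenness) are illustrative assumptions; the paper's exact evenness metrics and its Adversarial Robustness Metric are not reproduced here.

```python
# Minimal sketch (not the authors' code): gradient-based attributions and an
# explanation-evenness score for a hypothetical logistic-regression detector.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient(w, b, x):
    """d f(x) / d x for f(x) = sigmoid(w.x + b)."""
    s = sigmoid(w @ x + b)
    return s * (1.0 - s) * w

def gradient_times_input(w, b, x):
    """Gradient*Input attribution: elementwise product of gradient and input."""
    return gradient(w, b, x) * x

def integrated_gradients(w, b, x, baseline=None, steps=50):
    """Integrated Gradients along the straight path baseline -> x (Riemann sum)."""
    if baseline is None:
        baseline = np.zeros_like(x)
    alphas = (np.arange(steps) + 0.5) / steps
    grads = np.stack([gradient(w, b, baseline + a * (x - baseline)) for a in alphas])
    return (x - baseline) * grads.mean(axis=0)

def evenness(attr, eps=1e-12):
    """Normalized entropy of |attributions|: 1.0 means relevance is spread
    uniformly over features; values near 0 mean relevance is concentrated on
    a few features (the condition sparse evasion attacks exploit)."""
    p = np.abs(attr)
    p = p / (p.sum() + eps)
    h = -(p * np.log(p + eps)).sum()
    return h / np.log(len(p))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 20                                        # number of components (permissions, calls, ...)
    w = rng.normal(size=d)                        # toy detector weights
    b = -0.5
    x = rng.integers(0, 2, size=d).astype(float)  # binary feature vector of one app

    for name, attr in [("Gradient", gradient(w, b, x)),
                       ("Gradient*Input", gradient_times_input(w, b, x)),
                       ("Integrated Gradients", integrated_gradients(w, b, x))]:
        print(f"{name:>20s}  evenness = {evenness(attr):.3f}")
```

Under this reading, a detector whose attributions yield a low evenness score leans on a handful of components, which is exactly the regime in which a sparse evasion attack can flip the decision by injecting only a few fake features.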
Related papers
- MASKDROID: Robust Android Malware Detection with Masked Graph Representations [56.09270390096083]
We propose MASKDROID, a powerful detector with a strong discriminative ability to identify malware.
We introduce a masking mechanism into the Graph Neural Network-based framework, forcing MASKDROID to recover the whole input graph.
This strategy enables the model to understand the malicious semantics and learn more stable representations, enhancing its robustness against adversarial attacks.
arXiv Detail & Related papers (2024-09-29T07:22:47Z)
- A Robust Defense against Adversarial Attacks on Deep Learning-based Malware Detectors via (De)Randomized Smoothing [4.97719149179179]
We propose a practical defense against adversarial malware examples inspired by (de)randomized smoothing.
In this work, we reduce the chances of sampling adversarial content injected by malware authors by selecting correlated subsets of bytes.
arXiv Detail & Related papers (2024-02-23T11:30:12Z)
- Malicious code detection in android: the role of sequence characteristics and disassembling methods [0.0]
We investigate and highlight the factors that may affect the accuracy of the models built by researchers.
Our findings exhibit that the disassembly method and different input representations affect the model results.
arXiv Detail & Related papers (2023-12-02T11:55:05Z)
- Token-Level Adversarial Prompt Detection Based on Perplexity Measures and Contextual Information [67.78183175605761]
Large Language Models are susceptible to adversarial prompt attacks.
This vulnerability underscores a significant concern regarding the robustness and reliability of LLMs.
We introduce a novel approach to detecting adversarial prompts at a token level.
arXiv Detail & Related papers (2023-11-20T03:17:21Z)
- False Sense of Security: Leveraging XAI to Analyze the Reasoning and True Performance of Context-less DGA Classifiers [1.930852251165745]
Domain Generation Algorithm (DGA) detection seems to be solved, considering that available deep learning classifiers achieve accuracies of over 99.9%.
These classifiers provide a false sense of security as they are heavily biased and allow for trivial detection bypass.
In this work, we leverage explainable artificial intelligence (XAI) methods to analyze the reasoning of deep learning classifiers.
arXiv Detail & Related papers (2023-07-10T06:05:23Z)
- Spatial-Frequency Discriminability for Revealing Adversarial Perturbations [53.279716307171604]
Vulnerability of deep neural networks to adversarial perturbations has been widely perceived in the computer vision community.
Current algorithms typically detect adversarial patterns through discriminative decomposition for natural and adversarial data.
We propose a discriminative detector relying on a spatial-frequency Krawtchouk decomposition.
arXiv Detail & Related papers (2023-05-18T10:18:59Z)
- The Treasure Beneath Multiple Annotations: An Uncertainty-aware Edge Detector [70.43599299422813]
Existing methods fuse multiple annotations using a simple voting process, ignoring the inherent ambiguity of edges and labeling bias of annotators.
We propose a novel uncertainty-aware edge detector (UAED), which employs uncertainty to investigate the subjectivity and ambiguity of diverse annotations.
UAED achieves superior performance consistently across multiple edge detection benchmarks.
arXiv Detail & Related papers (2023-03-21T13:14:36Z)
- A two-steps approach to improve the performance of Android malware detectors [4.440024971751226]
We propose GUIDED RETRAINING, a supervised representation learning-based method that boosts the performance of a malware detector.
We validate our method on four state-of-the-art Android malware detection approaches using over 265k malware and benign apps.
Our method is generic and designed to enhance the classification performance on a binary classification task.
arXiv Detail & Related papers (2022-05-17T12:04:17Z)
- RamBoAttack: A Robust Query Efficient Deep Neural Network Decision Exploit [9.93052896330371]
We develop a robust query efficient attack capable of avoiding entrapment in a local minimum and misdirection from noisy gradients.
The RamBoAttack is more robust to the different sample inputs available to an adversary and the targeted class.
arXiv Detail & Related papers (2021-12-10T01:25:24Z)
- Towards A Conceptually Simple Defensive Approach for Few-shot classifiers Against Adversarial Support Samples [107.38834819682315]
We study a conceptually simple approach to defend few-shot classifiers against adversarial attacks.
We propose a simple attack-agnostic detection method, using the concept of self-similarity and filtering.
Our evaluation on the miniImagenet (MI) and CUB datasets exhibits good attack detection performance.
arXiv Detail & Related papers (2021-10-24T05:46:03Z)
- Provably Robust Metric Learning [98.50580215125142]
We show that existing metric learning algorithms can result in metrics that are less robust than the Euclidean distance.
We propose a novel metric learning algorithm to find a Mahalanobis distance that is robust against adversarial perturbations.
Experimental results show that the proposed metric learning algorithm improves both certified robust errors and empirical robust errors.
arXiv Detail & Related papers (2020-06-12T09:17:08Z)