Intriguing Properties of Adversarial ML Attacks in the Problem Space [Extended Version]
- URL: http://arxiv.org/abs/1911.02142v3
- Date: Thu, 27 Jun 2024 08:24:52 GMT
- Title: Intriguing Properties of Adversarial ML Attacks in the Problem Space [Extended Version]
- Authors: Jacopo Cortellazzi, Feargus Pendlebury, Daniel Arp, Erwin Quiring, Fabio Pierazzi, Lorenzo Cavallaro
- Abstract summary: We propose a general formalization for adversarial ML evasion attacks in the problem-space.
We propose a novel problem-space attack on Android malware that overcomes past limitations in terms of semantics and artifacts.
Our results demonstrate that "adversarial-malware as a service" is a realistic threat.
- Score: 18.3238686304247
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent research efforts on adversarial machine learning (ML) have investigated problem-space attacks, focusing on the generation of real evasive objects in domains where, unlike images, there is no clear inverse mapping to the feature space (e.g., software). However, the design, comparison, and real-world implications of problem-space attacks remain underexplored. This article makes three major contributions. Firstly, we propose a general formalization for adversarial ML evasion attacks in the problem space, which includes the definition of a comprehensive set of constraints on available transformations, preserved semantics, absent artifacts, and plausibility. We shed light on the relationship between feature space and problem space, and we introduce the concept of side-effect features as the by-product of the inverse feature-mapping problem. This enables us to define and prove necessary and sufficient conditions for the existence of problem-space attacks. Secondly, building on our general formalization, we propose a novel problem-space attack on Android malware that overcomes past limitations in terms of semantics and artifacts. We have tested our approach on a dataset of 150K Android apps from 2016 and 2018, showing the practical feasibility of evading a state-of-the-art malware classifier as well as its hardened version. Thirdly, we explore adversarial training as a possible approach to enforce robustness against adversarial samples, evaluating its effectiveness on the considered machine learning models under different scenarios. Our results demonstrate that "adversarial-malware as a service" is a realistic threat, as we automatically generate thousands of realistic and inconspicuous adversarial applications at scale, where on average it takes only a few minutes to generate an adversarial instance.
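For illustration, below is a minimal, hypothetical sketch of how a problem-space evasion search of this kind could be organized: a greedy loop over semantics-preserving transformations against an sklearn-style classifier exposing `decision_function`. All names (`transformations`, `extract_features`, `preserves_semantics`, `is_plausible`) are placeholders introduced here for the example, not the authors' implementation.

```python
def problem_space_attack(obj, transformations, extract_features, classifier,
                         preserves_semantics, is_plausible, max_steps=100):
    """Greedy problem-space evasion sketch (illustrative, not the paper's code).

    Repeatedly applies problem-space transformations (e.g., injecting benign
    code slices into an APK) and keeps the candidate that most lowers the
    classifier's maliciousness score, subject to the constraints from the
    formalization: preserved semantics and plausibility (artifact checks are
    omitted for brevity).
    """
    for _ in range(max_steps):
        candidates = []
        for t in transformations:
            mutated = t(obj)
            # Problem-space constraints: discard variants that break behaviour
            # or would look implausible under manual inspection.
            if not (preserves_semantics(obj, mutated) and is_plausible(mutated)):
                continue
            # Feature mapping of the mutated object; in the paper's terms this
            # vector also reflects side-effect features dragged along by the
            # transformation, not just the intended feature change.
            x = extract_features(mutated)
            score = classifier.decision_function([x])[0]
            candidates.append((score, mutated))
        if not candidates:
            return None  # no valid transformation left within the constraints
        score, obj = min(candidates, key=lambda c: c[0])
        if score < 0:  # crossed the decision boundary: evasion achieved
            return obj
    return None  # search budget exhausted without evading
```

In this reading, side-effect features are whatever additional feature changes a chosen transformation induces beyond the intended ones, which is why the score is computed on the full extracted vector rather than on a directly crafted feature perturbation.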
Related papers
- Improving Adversarial Robustness in Android Malware Detection by Reducing the Impact of Spurious Correlations [3.7937308360299116]
Machine learning (ML) has demonstrated significant advancements in Android malware detection (AMD).
However, the resilience of ML against realistic evasion attacks remains a major obstacle for AMD.
In this study, we propose a domain adaptation technique to improve the generalizability of AMD by aligning the distribution of malware samples and AEs.
arXiv Detail & Related papers (2024-08-27T17:01:12Z) - How to Train your Antivirus: RL-based Hardening through the Problem-Space [22.056941223966255]
Adversarial training, the sole defensive technique that can confer empirical robustness, is not applicable out of the box in this domain.
We introduce a novel Reinforcement Learning approach for constructing adversarial examples, a constituent part of adversarially training a model against evasion.
arXiv Detail & Related papers (2024-02-29T10:38:56Z) - To Make Yourself Invisible with Adversarial Semantic Contours [47.755808439588094]
Adversarial Semantic Contour (ASC) is an estimate of a Bayesian formulation of sparse attack with a deceived prior of object contour.
We show that ASC can corrupt the prediction of 9 modern detectors with different architectures.
We conclude with cautions about contours being a common weakness of object detectors with various architectures.
arXiv Detail & Related papers (2023-03-01T07:22:39Z) - Level Up with RealAEs: Leveraging Domain Constraints in Feature Space to Strengthen Robustness of Android Malware Detection [6.721598112028829]
A vulnerability to adversarial examples remains one major obstacle for Machine Learning (ML)-based Android malware detection.
We propose to generate RealAEs in the feature space, leading to a simpler and more efficient solution.
Our approach is driven by a novel interpretation of Android domain constraints in the feature space.
arXiv Detail & Related papers (2022-05-30T14:21:16Z) - On the Real-World Adversarial Robustness of Real-Time Semantic Segmentation Models for Autonomous Driving [59.33715889581687]
The existence of real-world adversarial examples (commonly in the form of patches) poses a serious threat to the use of deep learning models in safety-critical computer vision tasks.
This paper presents an evaluation of the robustness of semantic segmentation models when attacked with different types of adversarial patches.
A novel loss function is proposed to improve the capabilities of attackers in inducing a misclassification of pixels.
arXiv Detail & Related papers (2022-01-05T22:33:43Z) - A Unified Framework for Adversarial Attack and Defense in Constrained Feature Space [13.096022606256973]
We propose a unified framework to generate feasible adversarial examples that satisfy given domain constraints.
Our framework forms the starting point for research on constrained adversarial attacks and provides relevant baselines and datasets that research can exploit.
arXiv Detail & Related papers (2021-12-02T12:05:27Z) - Discriminator-Free Generative Adversarial Attack [87.71852388383242]
Generative adversarial attacks can overcome this limitation.
A Symmetric Saliency-based Auto-Encoder (SSAE) generates the perturbations.
The adversarial examples generated by SSAE not only make the widely used models collapse, but also achieve good visual quality.
arXiv Detail & Related papers (2021-07-20T01:55:21Z) - Universal Adversarial Perturbations for Malware [15.748648955898528]
Universal Adversarial Perturbations (UAPs) identify noisy patterns that generalize across the input space.
We explore the challenges and strengths of UAPs in the context of malware classification.
We propose adversarial training-based mitigations using knowledge derived from the problem-space transformations.
arXiv Detail & Related papers (2021-02-12T20:06:10Z) - A Self-supervised Approach for Adversarial Robustness [105.88250594033053]
Adversarial examples can cause catastrophic mistakes in Deep Neural Network (DNN)-based vision systems.
This paper proposes a self-supervised adversarial training mechanism in the input space.
It provides significant robustness against unseen adversarial attacks.
arXiv Detail & Related papers (2020-06-08T20:42:39Z) - Spatiotemporal Attacks for Embodied Agents [119.43832001301041]
We take the first step to study adversarial attacks for embodied agents.
In particular, we generate adversarial examples, which exploit the interaction history in both the temporal and spatial dimensions.
Our perturbations have strong attack and generalization abilities.
arXiv Detail & Related papers (2020-05-19T01:38:47Z) - On Adversarial Examples and Stealth Attacks in Artificial Intelligence Systems [62.997667081978825]
We present a formal framework for assessing and analyzing two classes of malevolent action towards generic Artificial Intelligence (AI) systems.
The first class involves adversarial examples and concerns the introduction of small perturbations of the input data that cause misclassification.
The second class, introduced here for the first time and named stealth attacks, involves small perturbations to the AI system itself.
arXiv Detail & Related papers (2020-04-09T10:56:53Z)