The Dark Side of AutoML: Towards Architectural Backdoor Search
- URL: http://arxiv.org/abs/2210.12179v1
- Date: Fri, 21 Oct 2022 18:13:23 GMT
- Title: The Dark Side of AutoML: Towards Architectural Backdoor Search
- Authors: Ren Pang, Changjiang Li, Zhaohan Xi, Shouling Ji, Ting Wang
- Abstract summary: EVAS is a new attack that leverages NAS to find neural architectures with inherent backdoors and exploits such vulnerability using input-aware triggers.
EVAS features high evasiveness, transferability, and robustness, thereby expanding the adversary's design spectrum.
This work raises concerns about the current practice of NAS and points to potential directions to develop effective countermeasures.
- Score: 49.16544351888333
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper asks the intriguing question: is it possible to exploit neural
architecture search (NAS) as a new attack vector to launch previously
improbable attacks? Specifically, we present EVAS, a new attack that leverages
NAS to find neural architectures with inherent backdoors and exploits such
vulnerability using input-aware triggers. Compared with existing attacks, EVAS
demonstrates many interesting properties: (i) it does not require polluting
training data or perturbing model parameters; (ii) it is agnostic to downstream
fine-tuning or even re-training from scratch; (iii) it naturally evades
defenses that rely on inspecting model parameters or training data. With
extensive evaluation on benchmark datasets, we show that EVAS features high
evasiveness, transferability, and robustness, thereby expanding the adversary's
design spectrum. We further characterize the mechanisms underlying EVAS, which
are possibly explainable by architecture-level "shortcuts" that recognize
trigger patterns. This work raises concerns about the current practice of NAS
and points to potential directions to develop effective countermeasures.
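The abstract includes no code; the block below is a minimal, hypothetical sketch (PyTorch) of the recipe it describes: sample candidate architectures, leave their weights and training data untouched, and rank them by how easily a small input-aware trigger generator can steer a frozen model toward a target class. All names (sample_cell, CandidateNet, TriggerGenerator, trigger_asr) are invented for illustration and do not reproduce the authors' EVAS search; a real attack would train each candidate on clean benchmark data before scoring.
```python
# Toy sketch only: "search for architectures that an input-aware trigger can exploit".
# Not the EVAS implementation; names and hyperparameters are assumptions.
import random
import torch
import torch.nn as nn
import torch.nn.functional as F

OPS = {
    "conv3x3": lambda c: nn.Conv2d(c, c, 3, padding=1),
    "conv5x5": lambda c: nn.Conv2d(c, c, 5, padding=2),
    "identity": lambda c: nn.Identity(),            # skip connection
    "avg_pool": lambda c: nn.AvgPool2d(3, stride=1, padding=1),
}

def sample_cell(channels=16, num_ops=4):
    """Randomly sample a small sequential 'cell' from the operation pool."""
    choices = [random.choice(list(OPS)) for _ in range(num_ops)]
    return choices, nn.Sequential(*[OPS[name](channels) for name in choices])

class CandidateNet(nn.Module):
    """Tiny candidate network: stem -> sampled cell -> classifier head."""
    def __init__(self, cell, channels=16, num_classes=10):
        super().__init__()
        self.stem = nn.Conv2d(3, channels, 3, padding=1)
        self.cell = cell
        self.head = nn.Linear(channels, num_classes)

    def forward(self, x):
        x = F.relu(self.stem(x))
        x = self.cell(x)
        return self.head(F.adaptive_avg_pool2d(x, 1).flatten(1))

class TriggerGenerator(nn.Module):
    """Input-aware trigger: maps each input to its own bounded perturbation."""
    def __init__(self, eps=0.1):
        super().__init__()
        self.net = nn.Conv2d(3, 3, 3, padding=1)
        self.eps = eps

    def forward(self, x):
        return x + self.eps * torch.tanh(self.net(x))

def trigger_asr(model, gen, target=0, steps=50, batch=32):
    """Optimize the trigger generator against a frozen model and report the
    attack success rate: how often triggered inputs land in the target class."""
    opt = torch.optim.Adam(gen.parameters(), lr=1e-2)
    labels = torch.full((batch,), target, dtype=torch.long)
    for _ in range(steps):
        x = torch.rand(batch, 3, 32, 32)            # stand-in for clean images
        loss = F.cross_entropy(model(gen(x)), labels)
        opt.zero_grad(); loss.backward(); opt.step()
    with torch.no_grad():
        x = torch.rand(batch, 3, 32, 32)
        return (model(gen(x)).argmax(1) == target).float().mean().item()

# Toy "search": keep the sampled cell that is easiest to steer with a trigger.
best = None
for _ in range(5):
    choices, cell = sample_cell()
    model = CandidateNet(cell).eval()
    for p in model.parameters():
        p.requires_grad_(False)                     # the attack never edits model weights
    asr = trigger_asr(model, TriggerGenerator())
    if best is None or asr > best[0]:
        best = (asr, choices)
print("most exploitable cell:", best)
```
Note how the sketch mirrors properties (i)-(iii) above: no training data is poisoned, no model weights are perturbed, and only a frozen model is probed.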
Related papers
- Exploiting the Vulnerability of Large Language Models via Defense-Aware Architectural Backdoor [0.24335447922683692]
We introduce a new type of backdoor attack that conceals itself within the underlying model architecture.
Add-on modules embedded in the model's architecture layers can detect the presence of input trigger tokens and modify layer weights.
We conduct extensive experiments to evaluate our attack methods using two model architecture settings on five different large language datasets.
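The summary above only gestures at the mechanism; as a loose, hypothetical illustration (the trigger token id, the simplified "shift the activations when triggered" effect, and all names are assumptions, not the paper's design, which modifies layer weights), an add-on wrapper around one layer might look like:
```python
# Hypothetical illustration only: an add-on wrapper that watches the input ids
# and perturbs a layer's output only when an assumed trigger token appears.
import torch
import torch.nn as nn

TRIGGER_TOKEN_ID = 50_000   # assumed id of a rare vocabulary token

class BackdooredLinear(nn.Module):
    def __init__(self, base: nn.Linear, shift: float = 5.0):
        super().__init__()
        self.base = base
        self.shift = shift

    def forward(self, hidden, input_ids):
        out = self.base(hidden)                       # (batch, seq, dim)
        # Per-sequence flag: does this sequence contain the trigger token?
        triggered = (input_ids == TRIGGER_TOKEN_ID).any(dim=1)
        # Only triggered sequences have their activations shifted.
        return out + self.shift * triggered.float().view(-1, 1, 1)

# Usage with dummy data: sequence 1 carries the trigger, sequence 0 does not.
layer = BackdooredLinear(nn.Linear(64, 64))
hidden = torch.randn(2, 8, 64)
ids = torch.randint(0, 100, (2, 8))
ids[1, 3] = TRIGGER_TOKEN_ID
out = layer(hidden, ids)
print("clean sequence altered:    ", not torch.allclose(out[0], layer.base(hidden[0])))
print("triggered sequence altered:", not torch.allclose(out[1], layer.base(hidden[1])))
```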
arXiv Detail & Related papers (2024-09-03T14:54:16Z)
- Backdoor Attack with Sparse and Invisible Trigger [57.41876708712008]
Deep neural networks (DNNs) are vulnerable to backdoor attacks.
Backdoor attacks are an emerging yet serious training-phase threat.
We propose a sparse and invisible backdoor attack (SIBA).
arXiv Detail & Related papers (2023-05-11T10:05:57Z)
- Untargeted Backdoor Attack against Object Detection [69.63097724439886]
We design a poison-only backdoor attack in an untargeted manner, based on task characteristics.
We show that, once the backdoor is embedded into the target model by our attack, it can trick the model into failing to detect any object stamped with our trigger patterns.
arXiv Detail & Related papers (2022-11-02T17:05:45Z)
- Backdoor Defense via Suppressing Model Shortcuts [91.30995749139012]
In this paper, we explore the backdoor mechanism from the angle of the model structure.
We demonstrate that the attack success rate (ASR) decreases significantly when reducing the outputs of some key skip connections.
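As a rough, hypothetical sketch of that idea (invented names; not the paper's code, and without its procedure for identifying which connections are "key"), a defender could gate each skip connection with a scalar and observe how the attack success rate responds as the gate is reduced:
```python
# Hypothetical sketch: a residual block whose skip branch can be dialled down.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedResidualBlock(nn.Module):
    def __init__(self, channels: int, skip_gate: float = 1.0):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)
        self.skip_gate = skip_gate      # 1.0 = normal shortcut, 0.0 = shortcut removed

    def forward(self, x):
        return F.relu(self.conv(x)) + self.skip_gate * x

x = torch.randn(1, 16, 8, 8)
normal = GatedResidualBlock(16, skip_gate=1.0)
suppressed = GatedResidualBlock(16, skip_gate=0.25)     # "reduce the output" of the shortcut
suppressed.load_state_dict(normal.state_dict())         # same conv weights, smaller gate
print(torch.allclose(normal(x), suppressed(x)))         # False: the shortcut contribution shrank
```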
arXiv Detail & Related papers (2022-11-02T15:39:19Z)
- Few-Shot Backdoor Attacks on Visual Object Tracking [80.13936562708426]
Visual object tracking (VOT) has been widely adopted in mission-critical applications, such as autonomous driving and intelligent surveillance systems.
We show that an adversary can easily implant hidden backdoors into VOT models by tampering with the training process.
We show that our attack is resistant to potential defenses, highlighting the vulnerability of VOT models to backdoor attacks.
arXiv Detail & Related papers (2022-01-31T12:38:58Z)
- Reconstructing Training Data with Informed Adversaries [30.138217209991826]
Given access to a machine learning model, can an adversary reconstruct the model's training data?
This work studies this question from the lens of a powerful informed adversary who knows all the training data points except one.
We show it is feasible to reconstruct the remaining data point in this stringent threat model.
arXiv Detail & Related papers (2022-01-13T09:19:25Z)
- On the Security Risks of AutoML [38.03918108363182]
Neural Architecture Search (NAS) is an emerging machine learning paradigm that automatically searches for models tailored to given tasks.
We show that compared with their manually designed counterparts, NAS-generated models tend to suffer greater vulnerability to various malicious attacks.
We discuss potential remedies to mitigate such drawbacks, including increasing cell depth and suppressing skip connections.
arXiv Detail & Related papers (2021-10-12T14:04:15Z)
- The Feasibility and Inevitability of Stealth Attacks [63.14766152741211]
We study new adversarial perturbations that enable an attacker to gain control over decisions in generic Artificial Intelligence systems.
In contrast to adversarial data modification, the attack mechanism we consider here involves alterations to the AI system itself.
arXiv Detail & Related papers (2021-06-26T10:50:07Z)