LADDER: Multi-objective Backdoor Attack via Evolutionary Algorithm
- URL: http://arxiv.org/abs/2411.19075v1
- Date: Thu, 28 Nov 2024 11:50:23 GMT
- Title: LADDER: Multi-objective Backdoor Attack via Evolutionary Algorithm
- Authors: Dazhuang Liu, Yanqi Qiao, Rui Wang, Kaitai Liang, Georgios Smaragdakis
- Abstract summary: This work proposes a multi-objective black-box backdoor attack in dual domains via an evolutionary algorithm (LADDER).
In particular, we formulate LADDER as a multi-objective optimization problem (MOP) and solve it via a multi-objective evolutionary algorithm (MOEA).
Experiments comprehensively show that LADDER achieves attack effectiveness of at least 99%, attack robustness of 90.23%, superior natural stealthiness (1.12x to 196.74x improvement) and excellent spectral stealthiness (8.45x enhancement) compared to current stealthy attacks, measured by the average $l_2$-norm across 5 public datasets.
- Score: 11.95174457001938
- Abstract: Current black-box backdoor attacks on convolutional neural networks formulate their attack objective(s) as single-objective optimization problems in a single domain. Designing triggers in a single domain harms semantics and trigger robustness, and introduces visual and spectral anomalies. This work proposes LADDER, a multi-objective black-box backdoor attack in dual domains via an evolutionary algorithm, the first instance of achieving multiple attack objectives simultaneously by optimizing triggers without requiring prior knowledge about the victim model. In particular, we formulate LADDER as a multi-objective optimization problem (MOP) and solve it via a multi-objective evolutionary algorithm (MOEA). The MOEA maintains a population of triggers with trade-offs among the attack objectives and uses non-dominated sorting to drive the triggers toward Pareto-optimal solutions. We further apply preference-based selection to the MOEA to exclude impractical triggers. LADDER also investigates a new dual-domain perspective on trigger stealthiness by minimizing the anomaly between clean and poisoned samples in the spectral domain, and achieves robustness against preprocessing operations by pushing triggers into low-frequency regions. Extensive experiments show that LADDER achieves attack effectiveness of at least 99%, attack robustness of 90.23% (50.09% higher than state-of-the-art attacks on average), superior natural stealthiness (1.12x to 196.74x improvement) and excellent spectral stealthiness (8.45x enhancement) compared to current stealthy attacks, measured by the average $l_2$-norm across 5 public datasets.
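The abstract's core mechanism is an MOEA that ranks a trigger population by non-dominated sorting. Below is a minimal sketch of that sorting step as used by MOEAs such as NSGA-II; the objective vectors (attack failure rate, visual anomaly, spectral anomaly) are illustrative placeholders, not LADDER's actual objective functions or trigger encoding.

```python
# Minimal sketch of the non-dominated sorting step at the heart of an MOEA
# such as NSGA-II; objective vectors below are toy placeholders, all minimized.
from typing import List, Sequence

def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """a dominates b if no worse in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated_sort(objs: List[Sequence[float]]) -> List[List[int]]:
    """Partition solution indices into Pareto fronts (front 0 is best)."""
    n = len(objs)
    dominated_by = [[] for _ in range(n)]  # solutions that i dominates
    dom_count = [0] * n                    # number of solutions dominating i
    fronts: List[List[int]] = [[]]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if dominates(objs[i], objs[j]):
                dominated_by[i].append(j)
            elif dominates(objs[j], objs[i]):
                dom_count[i] += 1
        if dom_count[i] == 0:
            fronts[0].append(i)
    while fronts[-1]:
        nxt = []
        for i in fronts[-1]:
            for j in dominated_by[i]:
                dom_count[j] -= 1
                if dom_count[j] == 0:
                    nxt.append(j)
        fronts.append(nxt)
    return fronts[:-1]

# Three candidate triggers with objectives
# (attack failure rate, visual anomaly, spectral anomaly) -- toy numbers.
population = [(0.01, 0.30, 0.10), (0.02, 0.10, 0.20), (0.05, 0.40, 0.40)]
print(non_dominated_sort(population))  # [[0, 1], [2]]: the third is dominated
```

In LADDER's setting, preference-based selection would additionally discard fronts containing impractical triggers before breeding the next generation.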
Related papers
- Meta Invariance Defense Towards Generalizable Robustness to Unknown Adversarial Attacks [62.036798488144306]
Current defenses mainly focus on known attacks, but adversarial robustness to unknown attacks is seriously overlooked.
We propose an attack-agnostic defense method named Meta Invariance Defense (MID).
We show that MID simultaneously achieves robustness to imperceptible adversarial perturbations in high-level image classification and attack suppression in low-level robust image regeneration.
arXiv Detail & Related papers (2024-04-04T10:10:38Z)
- Multi-objective Evolutionary Search of Variable-length Composite Semantic Perturbations [1.9100854225243937]
We propose a novel method called multi-objective evolutionary search of variable-length composite semantic perturbations (MES-VCSP).
MES-VCSP can obtain adversarial examples with a higher attack success rate, more naturalness, and less time cost.
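As a hedged illustration of what a variable-length composite semantic perturbation can look like, the sketch below encodes a candidate as a growable list of semantic operations; the operation set, parameter ranges, and mutation probabilities are assumptions, not the paper's exact design.

```python
# Illustrative variable-length genome in the spirit of MES-VCSP; the op set
# and mutation rates here are assumptions, not the paper's exact operators.
import random

OPS = ["rotate", "brightness", "contrast", "hue"]  # hypothetical semantic ops

def random_gene():
    return (random.choice(OPS), random.uniform(-0.2, 0.2))

def mutate(genome, p_add=0.3, p_drop=0.3):
    """Variable-length mutation: jitter parameters, then maybe grow or shrink."""
    genome = [(op, v + random.gauss(0.0, 0.05)) for op, v in genome]
    if random.random() < p_add:
        genome.append(random_gene())               # composite: add another op
    if len(genome) > 1 and random.random() < p_drop:
        genome.pop(random.randrange(len(genome)))  # drop one op
    return genome

print(mutate([random_gene(), random_gene()]))
```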
arXiv Detail & Related papers (2023-07-13T04:08:16Z)
- A Large-scale Multiple-objective Method for Black-box Attack against Object Detection [70.00150794625053]
We propose to minimize the true positive rate and maximize the false positive rate, which encourages more false-positive objects to block the generation of new true-positive bounding boxes.
We extend the standard Genetic Algorithm with Random Subset selection and Divide-and-Conquer, called GARSDC, which significantly improves efficiency.
Compared with state-of-the-art attack methods, GARSDC decreases mAP by an average of 12.0 and reduces queries by about 1000 times in extensive experiments.
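A rough sketch of a genetic loop with random subset selection, in the spirit of the description above; the real-valued genome, toy fitness, and replacement scheme are assumptions, and the divide-and-conquer decomposition is omitted.

```python
# Genetic algorithm with random subset (tournament-style) selection: compete
# within a small random subset instead of ranking the whole population, which
# saves black-box queries. Details here are assumptions, not GARSDC's exact ops.
import random

def evolve(population, fitness, generations=100, subset_size=4):
    """population: list of equal-length float lists; lower fitness is better."""
    for _ in range(generations):
        parent = min(random.sample(population, subset_size), key=fitness)
        child = [g + random.gauss(0.0, 0.1) for g in parent]  # Gaussian mutation
        worst = max(range(len(population)), key=lambda i: fitness(population[i]))
        population[worst] = child                             # replace the weakest
    return min(population, key=fitness)

# Toy objective standing in for the detector's TPR/FPR trade-off.
target = [0.5, -1.0, 2.0]
toy_fitness = lambda g: sum((a - b) ** 2 for a, b in zip(g, target))
pop = [[random.uniform(-3, 3) for _ in range(3)] for _ in range(12)]
print(toy_fitness(evolve(pop, toy_fitness)))
```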
arXiv Detail & Related papers (2022-09-16T08:36:42Z)
- Versatile Weight Attack via Flipping Limited Bits [68.45224286690932]
We study a novel attack paradigm, which modifies model parameters in the deployment stage.
Considering both effectiveness and stealthiness goals, we provide a general formulation to perform the bit-flip-based weight attack.
We present two cases of the general formulation with different malicious purposes, i.e., single sample attack (SSA) and triggered samples attack (TSA).
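The primitive underneath both SSA and TSA is flipping individual bits of quantized weights in memory. A toy illustration of that primitive on an int8 two's-complement weight follows; the attack's actual optimization over which bits to flip is not shown.

```python
# Toy bit-flip primitive on an 8-bit two's-complement quantized weight.
def flip_bit(qweight: int, bit: int) -> int:
    """Flip `bit` (0 = LSB) of an int8 two's-complement weight."""
    flipped = (qweight & 0xFF) ^ (1 << bit)               # flip within the byte
    return flipped - 256 if flipped >= 128 else flipped   # back to signed range

print(flip_bit(3, 7))  # flipping the sign bit of 3 gives -125
```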
arXiv Detail & Related papers (2022-07-25T03:24:58Z)
- Constrained Gradient Descent: A Powerful and Principled Evasion Attack Against Neural Networks [19.443306494201334]
We introduce several innovations that make white-box targeted attacks follow the intuition of the attacker's goal.
First, we propose a new loss function that explicitly captures the goal of targeted attacks.
Second, we propose a new attack method that uses a further-developed version of our loss function, capturing both the misclassification objective and the $L_\infty$ distance limit.
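A hedged sketch of a loss with the shape described, assuming PyTorch and a penalty that is active only where the perturbation exceeds the $L_\infty$ budget; the paper's exact functional form may differ.

```python
import torch.nn.functional as F

def cgd_style_loss(logits, target, delta, eps, lam=10.0):
    """Targeted objective plus a penalty that is zero inside the L_inf ball."""
    misclassify = F.cross_entropy(logits, target)    # pull toward target class
    overshoot = F.relu(delta.abs() - eps).sum()      # violation of |delta| <= eps
    return misclassify + lam * overshoot
```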
arXiv Detail & Related papers (2021-12-28T17:36:58Z)
- Adversarial Unlearning of Backdoors via Implicit Hypergradient [13.496838121707754]
We propose a minimax formulation for removing backdoors from a poisoned model based on a small set of clean data.
We use the Implicit Backdoor Adversarial Unlearning (I-BAU) algorithm to solve the minimax problem.
I-BAU's performance is comparable to and most often significantly better than the best baseline.
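The sketch below shows the minimax structure only, with the implicit-hypergradient solver replaced by plain alternating gradient steps; input shapes, step sizes, and iteration counts are assumptions.

```python
# Simplified alternating min-max backdoor unlearning: the inner loop ascends
# toward a worst-case universal perturbation on clean data, the outer loop
# updates weights so that perturbation stops working. Not I-BAU's exact solver.
import torch
import torch.nn.functional as F

def unlearn_backdoor(model, clean_loader, rounds=5, inner_steps=5, eps=8 / 255):
    opt = torch.optim.SGD(model.parameters(), lr=1e-3)
    delta = torch.zeros(1, 3, 32, 32)                # universal trigger estimate
    for _ in range(rounds):
        for x, y in clean_loader:
            for _ in range(inner_steps):             # inner max: worst-case trigger
                d = delta.clone().requires_grad_(True)
                grad, = torch.autograd.grad(F.cross_entropy(model(x + d), y), d)
                delta = (delta + 0.01 * grad.sign()).clamp(-eps, eps)
            opt.zero_grad()                          # outer min: unlearn that trigger
            F.cross_entropy(model(x + delta), y).backward()
            opt.step()
    return model
```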
arXiv Detail & Related papers (2021-10-07T18:32:54Z)
- Sparse and Imperceptible Adversarial Attack via a Homotopy Algorithm [93.80082636284922]
Sparse adversarial attacks can fool deep neural networks (DNNs) by perturbing only a few pixels.
Recent efforts combine this sparsity constraint with an $l_\infty$ bound on the perturbation magnitudes.
We propose a homotopy algorithm to jointly tackle the sparsity constraint and the perturbation bound in one unified framework.
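As a rough sketch of the joint constraint being targeted (the homotopy continuation over the sparsity level itself is not shown), a projection that enforces both at once, assuming NumPy:

```python
import numpy as np

def project_sparse_linf(delta: np.ndarray, k: int, eps: float) -> np.ndarray:
    """Keep the k largest-magnitude entries and clip each to [-eps, eps]."""
    flat = delta.ravel().copy()
    keep = np.argsort(np.abs(flat))[-k:]   # indices of the k largest entries
    mask = np.zeros_like(flat)
    mask[keep] = 1.0
    return (np.clip(flat, -eps, eps) * mask).reshape(delta.shape)

d = np.array([[0.9, -0.05], [0.3, -0.7]])
print(project_sparse_linf(d, k=2, eps=0.5))  # [[0.5, 0.], [0., -0.5]]
```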
arXiv Detail & Related papers (2021-06-10T20:11:36Z)
- Adaptive Feature Alignment for Adversarial Training [56.17654691470554]
CNNs are typically vulnerable to adversarial attacks, which pose a threat to security-sensitive applications.
We propose adaptive feature alignment (AFA) to generate features of arbitrary attacking strengths.
Our method is trained to automatically align features of arbitrary attacking strength.
arXiv Detail & Related papers (2021-05-31T17:01:05Z)
- Targeted Attack against Deep Neural Networks via Flipping Limited Weight Bits [55.740716446995805]
We study a novel attack paradigm, which modifies model parameters in the deployment stage for malicious purposes.
Our goal is to misclassify a specific sample into a target class without any sample modification.
By utilizing the latest technique in integer programming, we equivalently reformulate this binary integer programming (BIP) problem as a continuous optimization problem.
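One generic way such a relaxation can work (the paper's actual reformulation technique may differ) is to optimize each bit over $[0, 1]$ with a penalty that vanishes only at integral values; a toy sketch:

```python
# Continuous relaxation of binary bit-flip indicators: optimize b in [0, 1]
# and add a penalty b * (1 - b) that is zero only at {0, 1}. The attack
# objective here is a stand-in, not the paper's formulation.
import torch

b = torch.full((16,), 0.5, requires_grad=True)  # relaxed bit-flip indicators
opt = torch.optim.SGD([b], lr=0.1)
for _ in range(200):
    attack_loss = (b * torch.linspace(-1, 1, 16)).sum()  # stand-in objective
    integrality = (b * (1 - b)).sum()        # pushes each b toward 0 or 1
    opt.zero_grad()
    (attack_loss + 5.0 * integrality).backward()
    opt.step()
    b.data.clamp_(0.0, 1.0)                  # stay inside the unit box
print(b.detach().round())                    # recovered binary flip pattern
```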
arXiv Detail & Related papers (2021-02-21T03:13:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.