Enhancing All-to-X Backdoor Attacks with Optimized Target Class Mapping
- URL: http://arxiv.org/abs/2511.13356v1
- Date: Mon, 17 Nov 2025 13:22:44 GMT
- Title: Enhancing All-to-X Backdoor Attacks with Optimized Target Class Mapping
- Authors: Lei Wang, Yulong Tian, Hao Han, Fengyuan Xu
- Abstract summary: Backdoor attacks pose severe threats to machine learning systems. Most existing work focuses on single-target All-to-One (A2O) attacks, overlooking the more complex All-to-X (A2X) attacks with multiple target classes. We propose a novel attack strategy that enhances the success rate of A2X attacks while maintaining robustness.
- Score: 11.703299790086241
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Backdoor attacks pose severe threats to machine learning systems, prompting extensive research in this area. However, most existing work focuses on single-target All-to-One (A2O) attacks, overlooking the more complex All-to-X (A2X) attacks with multiple target classes, which are often assumed to have low attack success rates. In this paper, we first demonstrate that A2X attacks are robust against state-of-the-art defenses. We then propose a novel attack strategy that enhances the success rate of A2X attacks while maintaining robustness by optimizing grouping and target class assignment mechanisms. Our method improves the attack success rate by up to 28%, with average improvements of 6.7%, 16.4%, 14.1% on CIFAR10, CIFAR100, and Tiny-ImageNet, respectively. We anticipate that this study will raise awareness of A2X attacks and stimulate further research in this under-explored area. Our code is available at https://github.com/kazefjj/A2X-backdoor .
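The abstract describes optimizing the grouping and target-class assignment of an A2X attack, where each source class is relabeled to one of several target classes rather than a single one. The following is a minimal illustrative sketch of that idea, not the authors' actual algorithm: the function names, the round-robin grouping, and the poisoning parameters are all assumptions for exposition (the paper's contribution is precisely to optimize this assignment rather than fix it heuristically).

```python
import random


def build_a2x_mapping(num_classes, num_targets):
    """Assign each source class to one of num_targets target classes.

    A trivial round-robin grouping, used here only as a placeholder;
    the paper optimizes this mapping to raise attack success rate.
    """
    return {c: c % num_targets for c in range(num_classes)}


def poison_labels(labels, mapping, poison_rate=0.1):
    """Relabel a random fraction of samples according to the A2X mapping.

    In a real attack the corresponding inputs would also carry a trigger;
    only the label-mapping side is sketched here.
    """
    poisoned = list(labels)
    idx = random.sample(range(len(labels)), int(poison_rate * len(labels)))
    for i in idx:
        poisoned[i] = mapping[labels[i]]
    return poisoned
```

For CIFAR10 with three target classes, for example, `build_a2x_mapping(10, 3)` sends classes 0, 3, 6, 9 to target 0, classes 1, 4, 7 to target 1, and the rest to target 2.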
Related papers
- The Attacker Moves Second: Stronger Adaptive Attacks Bypass Defenses Against Llm Jailbreaks and Prompt Injections [74.60337113759313]
Current defenses against jailbreaks and prompt injections are typically evaluated against a static set of harmful attack strings. We argue that this evaluation process is flawed. Instead, we should evaluate defenses against adaptive attackers who explicitly modify their attack strategy to counter a defense's design.
arXiv Detail & Related papers (2025-10-10T05:51:04Z) - Active Attacks: Red-teaming LLMs via Adaptive Environments [71.55110023234376]
We address the challenge of generating diverse attack prompts for large language models (LLMs). We introduce Active Attacks, a novel RL-based red-teaming algorithm that adapts its attacks as the victim evolves.
arXiv Detail & Related papers (2025-09-26T06:27:00Z) - ELBA-Bench: An Efficient Learning Backdoor Attacks Benchmark for Large Language Models [55.93380086403591]
Generative large language models are vulnerable to backdoor attacks. ELBA-Bench allows attackers to inject backdoors through parameter-efficient fine-tuning. ELBA-Bench provides over 1300 experiments.
arXiv Detail & Related papers (2025-02-22T12:55:28Z) - Infighting in the Dark: Multi-Label Backdoor Attack in Federated Learning [9.441965281943132]
Federated Learning (FL), a privacy-preserving decentralized machine learning framework, has been shown to be vulnerable to backdoor attacks. We propose Mirage, the first non-cooperative MBA strategy in FL that allows attackers to inject effective and persistent backdoors into the global model. We show that Mirage outperforms various state-of-the-art attacks and bypasses existing defenses, achieving an average ASR greater than 97% and maintaining over 90% after 900 rounds.
arXiv Detail & Related papers (2024-09-29T07:37:22Z) - Protecting against simultaneous data poisoning attacks [14.893813906644153]
Current backdoor defense methods are evaluated against a single attack at a time.
We show that simultaneously executed data poisoning attacks can effectively install multiple backdoors in a single model.
We develop a new defense, BaDLoss, that is effective in the multi-attack setting.
arXiv Detail & Related papers (2024-08-23T16:57:27Z) - BadCLIP: Dual-Embedding Guided Backdoor Attack on Multimodal Contrastive Learning [85.2564206440109]
This paper reveals the threats in this practical scenario that backdoor attacks can remain effective even after defenses.
We introduce the BadCLIP attack, which is resistant to backdoor detection and model fine-tuning defenses.
arXiv Detail & Related papers (2023-11-20T02:21:49Z) - PubDef: Defending Against Transfer Attacks From Public Models [6.0012551318569285]
We propose a new practical threat model where the adversary relies on transfer attacks through publicly available surrogate models.
We evaluate the transfer attacks in this setting and propose a specialized defense method based on a game-theoretic perspective.
Under this threat model, our defense, PubDef, outperforms the state-of-the-art white-box adversarial training by a large margin with almost no loss in the normal accuracy.
arXiv Detail & Related papers (2023-10-26T17:58:08Z) - MultiRobustBench: Benchmarking Robustness Against Multiple Attacks [86.70417016955459]
We present the first unified framework for considering multiple attacks against machine learning (ML) models.
Our framework is able to model different levels of learner's knowledge about the test-time adversary.
We evaluate the performance of 16 defended models for robustness against a set of 9 different attack types.
arXiv Detail & Related papers (2023-02-21T20:26:39Z) - Towards Out-of-Distribution Adversarial Robustness [18.019850207961465]
We show that there is potential for improvement against many commonly used attacks by adopting a domain generalisation approach.
We treat each type of attack as a domain, and apply the Risk Extrapolation method (REx), which promotes similar levels of robustness against all training attacks.
Compared to existing methods, we obtain similar or superior worst-case adversarial robustness on attacks seen during training.
arXiv Detail & Related papers (2022-10-06T18:23:10Z) - Composite Adversarial Attacks [57.293211764569996]
Adversarial attack is a technique for deceiving Machine Learning (ML) models.
In this paper, a new procedure called Composite Adversarial Attack (CAA) is proposed for automatically searching the best combination of attack algorithms.
CAA beats 10 top attackers on 11 diverse defenses with less elapsed time.
arXiv Detail & Related papers (2020-12-10T03:21:16Z) - Mitigating Advanced Adversarial Attacks with More Advanced Gradient Obfuscation Techniques [13.972753012322126]
Deep Neural Networks (DNNs) are well known to be vulnerable to Adversarial Examples (AEs).
Recently, advanced gradient-based attack techniques were proposed.
In this paper, we make a steady step towards mitigating those advanced gradient-based attacks.
arXiv Detail & Related papers (2020-05-27T23:42:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.