DSBA: Dynamic Stealthy Backdoor Attack with Collaborative Optimization in Self-Supervised Learning
- URL: http://arxiv.org/abs/2603.02849v1
- Date: Tue, 03 Mar 2026 10:49:46 GMT
- Title: DSBA: Dynamic Stealthy Backdoor Attack with Collaborative Optimization in Self-Supervised Learning
- Authors: Jiayao Wang, Mohammad Maruf Hasan, Yiping Zhang, Xiaoying Lei, Jiale Zhang, Qilin Wu, Junwu Zhu, Dongfang Zhao,
- Abstract summary: Self-Supervised Learning (SSL) has emerged as a significant paradigm in representation learning thanks to its ability to learn without extensive labeled data. Recent research reveals that SSL models are also vulnerable to backdoor attacks. We propose a Dynamic Stealthy Backdoor Attack (DSBA) backed by a new technique we term Collaborative Optimization.
- Score: 10.286339414754496
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-Supervised Learning (SSL) has emerged as a significant paradigm in representation learning thanks to its ability to learn without extensive labeled data, its strong generalization capabilities, and its potential for privacy preservation. However, recent research reveals that SSL models are also vulnerable to backdoor attacks. Existing backdoor attack methods in the SSL context commonly suffer from issues such as high detectability of triggers, feature entanglement, and pronounced out-of-distribution properties in poisoned samples, all of which compromise attack effectiveness and stealthiness. To address this, we propose a Dynamic Stealthy Backdoor Attack (DSBA) backed by a new technique we term Collaborative Optimization. This method decouples the attack process into two collaborative optimization layers: the outer-layer optimization trains a backdoor encoder responsible for global feature space remodeling, aiming to achieve precise backdoor implantation while preserving core functionality; meanwhile, the inner-layer optimization employs a dynamically optimized generator to adaptively produce optimally concealed triggers for individual samples, achieving coordinated concealment across feature space and visual space. We also introduce multiple loss functions to dynamically balance attack performance and stealthiness, in which we employ an adaptive weight scheduling mechanism to enhance training stability. Extensive experiments on various mainstream SSL algorithms and five public datasets demonstrate that: (i) DSBA significantly enhances Attack Success Rate (ASR) and stealthiness while maintaining downstream task accuracy; and (ii) DSBA exhibits superior robustness against existing mainstream defense methods.
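The two-layer scheme in the abstract can be illustrated with a toy numerical sketch. Everything below is an illustrative assumption, not the paper's actual objectives or architecture: `theta` stands in for the backdoor-encoder parameters (outer layer), `phi` for the trigger-generator parameters (inner layer), and the scalar losses are placeholders for the paper's attack and stealthiness terms. The linearly growing weight `lam` mimics an adaptive weight schedule that shifts emphasis from attack effectiveness toward stealth as training progresses.

```python
import numpy as np


def attack_loss(theta, phi):
    # Placeholder outer objective: drive the "triggered feature"
    # theta * phi toward a fixed target value of 1.0.
    return (theta * phi - 1.0) ** 2


def stealth_loss(phi):
    # Placeholder inner objective: keep the trigger perturbation small.
    return phi ** 2


def collaborative_optimize(steps=200, lr=0.05):
    theta, phi = 0.1, 0.1
    for t in range(steps):
        # Adaptive weight schedule: stealth weight grows over training.
        lam = t / steps
        # Inner layer: update the trigger generator on the weighted sum
        # of attack and stealth losses (gradients written out by hand).
        grad_phi = 2 * theta * (theta * phi - 1.0) + lam * 2 * phi
        phi -= lr * grad_phi
        # Outer layer: update the encoder on the attack loss alone.
        grad_theta = 2 * phi * (theta * phi - 1.0)
        theta -= lr * grad_theta
    return theta, phi


theta, phi = collaborative_optimize()
print(attack_loss(theta, phi), stealth_loss(phi))
```

The point of the sketch is the alternation: the inner update trades attack strength against concealment under the current schedule weight, while the outer update keeps pushing the "encoder" toward the backdoor target, so both objectives stay low at convergence.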
Related papers
- OmniVL-Guard: Towards Unified Vision-Language Forgery Detection and Grounding via Balanced RL [63.388513841293616]
Existing forgery detection methods fail to handle the interleaved text, images, and videos prevalent in real-world misinformation. To bridge this gap, this paper aims to develop a unified framework for omnibus vision-language forgery detection and grounding. We propose OmniVL-Guard, a balanced reinforcement learning framework for omnibus vision-language forgery detection and grounding.
arXiv Detail & Related papers (2026-02-11T09:41:36Z) - ADCA: Attention-Driven Multi-Party Collusion Attack in Federated Self-Supervised Learning [9.410118086518992]
Federated Self-Supervised Learning (FSSL) integrates the privacy advantages of distributed training with the capability of self-supervised learning. Recent studies have shown that FSSL is also vulnerable to backdoor attacks. We propose the Attention-Driven multi-party Collusion Attack (ADCA).
arXiv Detail & Related papers (2026-02-05T12:49:36Z) - HPE: Hallucinated Positive Entanglement for Backdoor Attacks in Federated Self-Supervised Learning [11.615563669883072]
Federated self-supervised learning (FSSL) enables collaborative training of self-supervised representation models without sharing raw unlabeled data. While it serves as a crucial paradigm for privacy-preserving learning, its security remains vulnerable to backdoor attacks. We propose a new backdoor attack method for FSSL, namely Hallucinated Positive Entanglement (HPE).
arXiv Detail & Related papers (2026-02-02T14:24:06Z) - Adversarial Attack-Defense Co-Evolution for LLM Safety Alignment via Tree-Group Dual-Aware Search and Optimization [51.12422886183246]
Large Language Models (LLMs) have developed rapidly in web services, delivering unprecedented capabilities while amplifying societal risks. Existing works tend to focus on either isolated jailbreak attacks or static defenses, neglecting the dynamic interplay between evolving threats and safeguards in real-world web contexts. We propose ACE-Safety, a novel framework that jointly optimizes attack and defense models by seamlessly integrating two key innovative procedures.
arXiv Detail & Related papers (2025-11-24T15:23:41Z) - DOPA: Stealthy and Generalizable Backdoor Attacks from a Single Client under Challenging Federated Constraints [2.139012072214621]
Federated Learning (FL) is increasingly adopted for privacy-preserving collaborative training, but its decentralized nature makes it susceptible to backdoor attacks. Existing attack methods, however, often rely on idealized assumptions and fail to remain effective under real-world constraints. We propose DOPA, a novel framework that simulates heterogeneous local training dynamics and seeks consensus across divergent optimization trajectories to craft universally effective and stealthy backdoor triggers.
arXiv Detail & Related papers (2025-08-20T08:39:12Z) - IPBA: Imperceptible Perturbation Backdoor Attack in Federated Self-Supervised Learning [13.337697403537488]
Federated self-supervised learning (FSSL) combines the advantages of decentralized modeling and unlabeled representation learning. Research indicates that FSSL remains vulnerable to backdoor attacks. We propose an imperceptible and effective backdoor attack method against FSSL, called IPBA.
arXiv Detail & Related papers (2025-08-11T14:36:11Z) - SPA: Towards More Stealth and Persistent Backdoor Attacks in Federated Learning [10.924427077035915]
Federated Learning (FL) has emerged as a leading paradigm for privacy-preserving distributed machine learning, yet the distributed nature of FL introduces unique security challenges. We propose a novel and stealthy backdoor attack framework, named SPA, which departs from traditional approaches by leveraging feature-space alignment. Our results call urgent attention to the evolving sophistication of backdoor threats in FL and emphasize the pressing need for advanced, feature-level defense techniques.
arXiv Detail & Related papers (2025-06-26T01:33:14Z) - Robust Anti-Backdoor Instruction Tuning in LVLMs [53.766434746801366]
We introduce a lightweight, certified-agnostic defense framework for large visual language models (LVLMs). Our framework finetunes only adapter modules and text embedding layers under instruction tuning. Experiments against seven attacks on Flickr30k and MSCOCO demonstrate that ours reduces their attack success rate to nearly zero.
arXiv Detail & Related papers (2025-06-04T01:23:35Z) - ELBA-Bench: An Efficient Learning Backdoor Attacks Benchmark for Large Language Models [55.93380086403591]
Generative large language models are vulnerable to backdoor attacks. ELBA-Bench allows attackers to inject backdoors through parameter-efficient fine-tuning. ELBA-Bench provides over 1300 experiments.
arXiv Detail & Related papers (2025-02-22T12:55:28Z) - Efficient Adversarial Training in LLMs with Continuous Attacks [99.5882845458567]
Large language models (LLMs) are vulnerable to adversarial attacks that can bypass their safety guardrails.
We propose a fast adversarial training algorithm (C-AdvUL) composed of two losses.
C-AdvIPO is an adversarial variant of IPO that does not require utility data for adversarially robust alignment.
arXiv Detail & Related papers (2024-05-24T14:20:09Z) - Policy Smoothing for Provably Robust Reinforcement Learning [109.90239627115336]
We study the provable robustness of reinforcement learning against norm-bounded adversarial perturbations of the inputs.
We generate certificates that guarantee that the total reward obtained by the smoothed policy will not fall below a certain threshold under a norm-bounded adversarial perturbation of the input.
arXiv Detail & Related papers (2021-06-21T21:42:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.