ME: Trigger Element Combination Backdoor Attack on Copyright Infringement
- URL: http://arxiv.org/abs/2506.10776v1
- Date: Thu, 12 Jun 2025 14:51:27 GMT
- Title: ME: Trigger Element Combination Backdoor Attack on Copyright Infringement
- Authors: Feiyu Yang, Siyuan Liang, Aishan Liu, Dacheng Tao
- Abstract summary: SilentBadDiffusion (SBD) is a recently proposed method that showed outstanding performance in attacking SD on text-to-image tasks. In this paper, we introduce new datasets for research on attacks like SBD, and propose the Multi-Element (ME) attack method built on SBD. The Copyright Infringement Rate (CIR) / First Attack Epoch (FAE) obtained on the two new datasets were 16.78% / 39.50 and 51.20% / 23.60, respectively, close to or even surpassing results on the benchmark Pokemon and Midjourney datasets.
- Score: 61.06621533874629
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The ability of generative diffusion models (DMs) such as Stable Diffusion (SD) to replicate training data can be exploited by attackers to launch a Copyright Infringement Attack using duplicated poisoned image-text pairs. SilentBadDiffusion (SBD) is a recently proposed method that showed outstanding performance in attacking SD on text-to-image tasks. However, usable data resources in this area remain limited, and some are constrained or prohibited due to issues such as copyright ownership or inappropriate content; not all images in current datasets are suitable for the proposed attack methods; and the state-of-the-art (SoTA) performance of SBD is far from ideal when only a few generated poisoning samples can be used for an attack. In this paper, we introduce new datasets for research on attacks like SBD, and propose the Multi-Element (ME) attack method, built on SBD, which increases the number of poisonous visual-text elements per poisoned sample to strengthen the attack, while applying the Discrete Cosine Transform (DCT) to the poisoned samples to maintain stealthiness. The Copyright Infringement Rate (CIR) / First Attack Epoch (FAE) obtained on the two new datasets were 16.78% / 39.50 and 51.20% / 23.60, respectively, close to or even surpassing results on the benchmark Pokemon and Midjourney datasets. Under a low subsampling ratio (5%, 6 poisoned samples), MESI and DCT achieved CIR / FAE of 0.23% / 84.00 and 12.73% / 65.50, both better than the original SBD, which failed to attack at all.
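For intuition, here is a minimal sketch of two ideas the abstract relies on: blending a poisonous visual element into a cover image in the DCT domain so the spatial-domain change stays subtle, and computing CIR/FAE-style metrics from per-epoch copy-detection scores. The per-channel 2D transform, the frequency mask, the 0.5 threshold, and the SSCD-style scoring are all illustrative assumptions, not the authors' released implementation.

```python
# Hedged sketch only: the exact DCT procedure, mask, threshold, and metric
# definitions used by the paper are not specified here and are assumed.
import numpy as np
from scipy.fft import dctn, idctn

def dct_blend(cover: np.ndarray, element: np.ndarray, alpha: float = 0.05) -> np.ndarray:
    """Blend a poisonous visual element (same shape as `cover`) in the DCT domain.

    Keeping the low-frequency corner untouched confines the edit to less
    perceptually dominant coefficients, one plausible route to stealthiness.
    """
    C = dctn(cover.astype(np.float64), axes=(0, 1), norm="ortho")
    E = dctn(element.astype(np.float64), axes=(0, 1), norm="ortho")
    mask = np.ones_like(C)
    h, w = C.shape[:2]
    mask[: h // 8, : w // 8] = 0.0  # spare the perceptually dominant low frequencies
    out = idctn(C + alpha * mask * E, axes=(0, 1), norm="ortho")
    return np.clip(out, 0.0, 255.0).astype(np.uint8)

def cir_and_fae(sim: np.ndarray, thresh: float = 0.5):
    """Compute CIR/FAE-style metrics from copy-detection scores.

    sim[e, i]: similarity (e.g., an SSCD-style score; an assumption) between
    the i-th image generated at epoch e and the copyrighted target.
    CIR: fraction of generations whose similarity exceeds the threshold.
    FAE: first (1-indexed) epoch at which any generation crosses it.
    """
    hits = sim > thresh
    cir = float(hits.mean())
    epochs = np.flatnonzero(hits.any(axis=1))
    fae = int(epochs[0]) + 1 if epochs.size else None  # None: attack never succeeds
    return cir, fae
```

Read this way, a CIR of 16.78% would mean roughly one in six generations crosses the similarity threshold, and an FAE of 39.50 would correspond to an average first successful epoch just under 40.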
Related papers
- No Query, No Access [50.18709429731724]
We introduce the Victim Data-based Adversarial Attack (VDBA), which operates using only victim texts. To avoid any access to the victim model, we create a shadow dataset with publicly available pre-trained models and clustering methods. Experiments on the Emotion and SST5 datasets show that VDBA outperforms state-of-the-art methods, achieving an ASR improvement of 52.08%.
arXiv Detail & Related papers (2025-05-12T06:19:59Z)
- CopyrightShield: Spatial Similarity Guided Backdoor Defense against Copyright Infringement in Diffusion Models [61.06621533874629]
The diffusion model is a prime target for copyright infringement attacks. This paper provides an in-depth analysis of the spatial similarity of replication in diffusion models. We propose a novel defense method specifically targeting copyright infringement attacks.
arXiv Detail & Related papers (2024-12-02T14:19:44Z)
- On the Adversarial Risk of Test Time Adaptation: An Investigation into Realistic Test-Time Data Poisoning [49.17494657762375]
Test-time adaptation (TTA) updates the model weights during the inference stage using testing data to enhance generalization. Existing studies have shown that when TTA is updated with crafted adversarial test samples, the performance on benign samples can deteriorate. We propose an effective and realistic attack method that better produces poisoned samples without access to benign samples.
arXiv Detail & Related papers (2024-10-07T01:29:19Z)
- The Stronger the Diffusion Model, the Easier the Backdoor: Data Poisoning to Induce Copyright Breaches Without Adjusting Finetuning Pipeline [30.80691226540351]
We formalize the Copyright Infringement Attack on generative AI models and propose a backdoor attack method, SilentBadDiffusion.
Our method strategically embeds connections between pieces of copyrighted information and text references in poisoning data.
Our experiments show the stealth and efficacy of the poisoning data.
arXiv Detail & Related papers (2024-01-07T08:37:29Z)
- DALA: A Distribution-Aware LoRA-Based Adversarial Attack against Language Models [64.79319733514266]
Adversarial attacks can introduce subtle perturbations to input data.
Recent attack methods can achieve a relatively high attack success rate (ASR).
We propose a Distribution-Aware LoRA-based Adversarial Attack (DALA) method.
arXiv Detail & Related papers (2023-11-14T23:43:47Z)
- From Trojan Horses to Castle Walls: Unveiling Bilateral Data Poisoning Effects in Diffusion Models [19.140908259968302]
We investigate whether BadNets-like data poisoning methods can directly degrade the generation quality of DMs.
We show that a BadNets-like data poisoning attack remains effective in DMs for producing incorrect images.
Poisoned DMs exhibit an increased ratio of triggers, a phenomenon we refer to as 'trigger amplification'.
arXiv Detail & Related papers (2023-11-04T11:00:31Z)
- Autoregressive Perturbations for Data Poisoning [54.205200221427994]
Data scraping from social media has led to growing concerns regarding unauthorized use of data.
Data poisoning attacks have been proposed as a bulwark against scraping.
We introduce autoregressive (AR) poisoning, a method that can generate poisoned data without access to the broader dataset.
arXiv Detail & Related papers (2022-06-08T06:24:51Z)