CopyrightShield: Spatial Similarity Guided Backdoor Defense against Copyright Infringement in Diffusion Models
- URL: http://arxiv.org/abs/2412.01528v1
- Date: Mon, 02 Dec 2024 14:19:44 GMT
- Title: CopyrightShield: Spatial Similarity Guided Backdoor Defense against Copyright Infringement in Diffusion Models
- Authors: Zhixiang Guo, Siyuan Liang, Aishan Liu, Dacheng Tao
- Abstract summary: The diffusion model is a prime target for copyright infringement attacks.
This paper provides an in-depth analysis of the spatial similarity of replication in diffusion models.
We propose a novel defense method specifically targeting copyright infringement attacks.
- Score: 61.06621533874629
- License:
- Abstract: The diffusion model has gained significant attention due to its remarkable data generation ability in fields such as image synthesis. However, its strong memorization and replication of training data also make it a prime target for copyright infringement attacks. This paper provides an in-depth analysis of the spatial similarity of replication in diffusion models and leverages this key characteristic to design a method for detecting poisoning data. By jointly assessing spatial-level and feature-level information from the detected segments, we effectively identify covertly dispersed poisoned samples. Building on the detected poisoning data, we propose a novel defense method specifically targeting copyright infringement attacks: a protection constraint term is introduced into the loss function to mitigate the impact of poisoning. Extensive experimental results demonstrate that our approach achieves an average F1 score of 0.709 in detecting copyright infringement backdoors, yielding an average increase of 68.1% in First-Attack Epoch (FAE) and an average decrease of 51.4% in Copyright Infringement Rate (CIR) of the poisoned model, effectively defending against copyright infringement. Additionally, we introduce the concept of copyright feature inversion, which aids in determining copyright responsibility and expands the application scenarios of defense strategies.
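The abstract describes two pieces that lend themselves to a sketch: a detector that fuses spatial-level and feature-level similarity into a per-sample score, and a protection constraint added to the diffusion training loss for flagged samples. The following is a minimal illustration of one plausible wiring, not the authors' implementation; the fusion rule, the down-weighting scheme, and names such as `lambda_protect` are assumptions.

```python
import torch

def joint_poison_score(spatial_score: torch.Tensor,
                       feature_score: torch.Tensor,
                       alpha: float = 0.5) -> torch.Tensor:
    """Fuse spatial-level and feature-level similarity into one per-sample
    score (a simple convex combination; the paper's rule may differ)."""
    return alpha * spatial_score + (1.0 - alpha) * feature_score

def shielded_diffusion_loss(eps_pred: torch.Tensor,
                            eps_true: torch.Tensor,
                            poison_score: torch.Tensor,
                            threshold: float = 0.5,
                            lambda_protect: float = 0.9) -> torch.Tensor:
    """Denoising MSE with a protection constraint: samples flagged as likely
    poisoned contribute less to the gradient, discouraging the model from
    memorizing the infringing content."""
    per_sample = ((eps_pred - eps_true) ** 2).flatten(1).mean(dim=1)  # (B,)
    weights = torch.ones_like(per_sample)
    weights[poison_score > threshold] = 1.0 - lambda_protect
    return (weights * per_sample).mean()
```

In a training loop, `poison_score` would come from running the detector on the current batch, so clean samples are fitted normally while suspected poisons are largely ignored.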
Related papers
- SoK: Benchmarking Poisoning Attacks and Defenses in Federated Learning [21.73177249075515]
Federated learning (FL) enables collaborative model training while preserving data privacy, but its decentralized nature exposes it to client-side data poisoning attacks (DPAs) and model poisoning attacks (MPAs).
This paper aims to provide a unified benchmark and analysis of defenses against DPAs and MPAs, clarifying the distinction between these two similar but distinct domains.
arXiv Detail & Related papers (2025-02-06T06:05:00Z)
- CausalDiff: Causality-Inspired Disentanglement via Diffusion Model for Adversarial Defense [61.78357530675446]
Humans are hard to deceive with subtle manipulations, since we make judgments based only on essential factors.
Inspired by this observation, we attempt to model label generation with essential label-causative factors and incorporate label-non-causative factors to assist data generation.
For an adversarial example, we aim to discriminate perturbations as non-causative factors and make predictions only based on the label-causative factors.
arXiv Detail & Related papers (2024-10-30T15:06:44Z)
- Evaluating Copyright Takedown Methods for Language Models [100.38129820325497]
Language models (LMs) derive their capabilities from extensive training on diverse data, including potentially copyrighted material.
This paper introduces the first evaluation of the feasibility and side effects of copyright takedowns for LMs.
We examine several strategies, including adding system prompts, decoding-time filtering interventions, and unlearning approaches.
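Of the strategies named here, decoding-time filtering is the easiest to make concrete. A generic sketch, assuming an n-gram blocklist rather than the paper's specific interventions, would veto any next token that extends a long verbatim match with the protected text:

```python
def blocked_next_tokens(context_tokens, protected_tokens, n=8):
    """Return the set of tokens that would complete an n-gram copied verbatim
    from the protected text, given the current decoding context."""
    if len(context_tokens) < n - 1:
        return set()
    suffix = tuple(context_tokens[-(n - 1):])
    blocked = set()
    for i in range(len(protected_tokens) - n + 1):
        if tuple(protected_tokens[i:i + n - 1]) == suffix:
            blocked.add(protected_tokens[i + n - 1])
    return blocked
```

At each decoding step the sampler would mask these token ids before sampling; raising n trades weaker protection for fewer false positives.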
arXiv Detail & Related papers (2024-06-26T18:09:46Z)
- EnTruth: Enhancing the Traceability of Unauthorized Dataset Usage in Text-to-image Diffusion Models with Minimal and Robust Alterations [73.94175015918059]
We introduce a novel approach, EnTruth, which Enhances Traceability of unauthorized dataset usage.
By strategically incorporating template memorization, EnTruth can trigger specific behavior in unauthorized models as evidence of infringement.
Our method is the first to investigate the positive application of memorization and use it for copyright protection, which turns a curse into a blessing.
arXiv Detail & Related papers (2024-06-20T02:02:44Z)
- CPR: Retrieval Augmented Generation for Copyright Protection [101.15323302062562]
We introduce CopyProtected generation with Retrieval (CPR), a new method for RAG with strong copyright protection guarantees.
CPR allows the output of diffusion models to be conditioned on a set of retrieved images.
We prove that CPR satisfies Near Access Freeness (NAF), which bounds the amount of information an attacker may be able to extract from the generated images.
arXiv Detail & Related papers (2024-03-27T18:09:55Z)
- Have You Poisoned My Data? Defending Neural Networks against Data Poisoning [0.393259574660092]
We propose a novel approach to detect and filter poisoned datapoints in the transfer learning setting.
We show that effective poisons can be successfully differentiated from clean points in the characteristic vector space.
Our evaluation shows that our proposal outperforms existing approaches in defense rate and final trained model performance.
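The "characteristic vector space" is this paper's own construction; a generic stand-in for the idea is to embed every training point with a frozen feature extractor and flag points whose features sit far from their class centroid. A minimal sketch under that assumption:

```python
import numpy as np

def filter_poison(features: np.ndarray, labels: np.ndarray, z_thresh: float = 3.0):
    """features: (N, D) embeddings from a frozen network; labels: (N,).
    Returns a boolean keep-mask with suspected poisons set to False."""
    keep = np.ones(len(labels), dtype=bool)
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        feats = features[idx]
        dists = np.linalg.norm(feats - feats.mean(axis=0), axis=1)
        # Standardized distance to the class centroid; large values are outliers.
        z = (dists - dists.mean()) / (dists.std() + 1e-8)
        keep[idx[z > z_thresh]] = False
    return keep
```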
arXiv Detail & Related papers (2024-03-20T11:50:16Z)
- The Stronger the Diffusion Model, the Easier the Backdoor: Data Poisoning to Induce Copyright Breaches Without Adjusting Finetuning Pipeline [30.80691226540351]
We formalize the Copyright Infringement Attack on generative AI models and propose a backdoor attack method, SilentBadDiffusion.
Our method strategically embeds connections between pieces of copyrighted information and text references in poisoning data.
Our experiments show the stealth and efficacy of the poisoning data.
arXiv Detail & Related papers (2024-01-07T08:37:29Z)
- DataElixir: Purifying Poisoned Dataset to Mitigate Backdoor Attacks via Diffusion Models [12.42597979026873]
We propose DataElixir, a novel sanitization approach tailored to purify poisoned datasets.
We leverage diffusion models to eliminate trigger features and restore benign features, thereby turning the poisoned samples into benign ones.
Experiments conducted on 9 popular attacks demonstrate that DataElixir effectively mitigates various complex attacks while exerting minimal impact on benign accuracy.
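As a rough illustration of diffusion-based purification in general (not DataElixir's specific pipeline), one can push a suspect image part-way into the forward noising process and then run the reverse process, which tends to wash out small trigger patterns while preserving the semantics. `denoise_step` below is an assumed interface to a pretrained diffusion model, and the noise schedule is a crude stand-in:

```python
import torch

def purify(image: torch.Tensor, denoise_step, t_start: int = 250) -> torch.Tensor:
    """image: (C, H, W) scaled to [-1, 1]. Noise to timestep t_start, then
    denoise back to t=0 with the pretrained model."""
    alpha_bar = 1.0 - t_start / 1000.0            # placeholder for the real schedule
    noise = torch.randn_like(image)
    x = (alpha_bar ** 0.5) * image + ((1.0 - alpha_bar) ** 0.5) * noise
    for t in reversed(range(t_start)):
        x = denoise_step(x, t)                     # one reverse-diffusion step
    return x
```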
arXiv Detail & Related papers (2023-12-18T09:40:38Z)
- On Practical Aspects of Aggregation Defenses against Data Poisoning Attacks [58.718697580177356]
Attacks on deep learning models with malicious training samples are known as data poisoning.
Recent advances in defense strategies against data poisoning have highlighted the effectiveness of aggregation schemes in achieving certified poisoning robustness.
Here we focus on Deep Partition Aggregation, a representative aggregation defense, and assess its practical aspects, including efficiency, performance, and robustness.
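Deep Partition Aggregation itself is simple enough to sketch: deterministically hash each training sample into one of k disjoint partitions, train a base model per partition, and answer queries by majority vote, so a single poisoned sample can corrupt at most one voter. The sketch assumes a `train_model(examples)` callable and base models exposing `predict(x)`:

```python
import hashlib
from collections import Counter

def _partition_index(x, k: int) -> int:
    """Stable hash of the sample so the same sample always lands in the
    same partition, regardless of dataset ordering."""
    return int(hashlib.sha256(repr(x).encode()).hexdigest(), 16) % k

def dpa_train(dataset, k: int, train_model):
    """dataset: iterable of (x, y) pairs; returns k independently trained models."""
    partitions = [[] for _ in range(k)]
    for x, y in dataset:
        partitions[_partition_index(x, k)].append((x, y))
    return [train_model(part) for part in partitions]

def dpa_predict(models, x):
    """Majority vote; the gap between the top two vote counts relates to the
    number of poisoned samples the prediction can provably tolerate."""
    votes = Counter(m.predict(x) for m in models)
    return votes.most_common(1)[0][0]
```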
arXiv Detail & Related papers (2023-06-28T17:59:35Z)
- FedCC: Robust Federated Learning against Model Poisoning Attacks [0.0]
Federated learning is a distributed framework designed to address privacy concerns.
However, it introduces new attack surfaces, which are especially concerning when data is not independently and identically distributed (non-IID).
We present FedCC, a simple yet effective novel defense algorithm against model poisoning attacks.
arXiv Detail & Related papers (2022-12-05T01:52:32Z)