Related papers: Toward effective protection against diffusion based mimicry through score distillation

Toward effective protection against diffusion based mimicry through score distillation

URL: http://arxiv.org/abs/2311.12832v2
Date: Sat, 3 Feb 2024 22:22:02 GMT
Title: Toward effective protection against diffusion based mimicry through score distillation
Authors: Haotian Xue, Chumeng Liang, Xiaoyu Wu, Yongxin Chen
Abstract summary: Efforts have been made to add perturbations to protect images from diffusion-based mimicry pipelines. Most of the existing methods are too ineffective and even impractical to be used by individual users. We present novel findings on attacking latent diffusion models and propose new plug-and-play strategies for more effective protection.
Score: 15.95715097030366
License: http://creativecommons.org/licenses/by/4.0/
Abstract: While generative diffusion models excel in producing high-quality images, they can also be misused to mimic authorized images, posing a significant threat to AI systems. Efforts have been made to add calibrated perturbations to protect images from diffusion-based mimicry pipelines. However, most of the existing methods are too ineffective and even impractical to be used by individual users due to their high computation and memory requirements. In this work, we present novel findings on attacking latent diffusion models (LDM) and propose new plug-and-play strategies for more effective protection. In particular, we explore the bottleneck in attacking an LDM, discovering that the encoder module rather than the denoiser module is the vulnerable point. Based on this insight, we present our strategy using Score Distillation Sampling (SDS) to double the speed of protection and reduce memory occupation by half without compromising its strength. Additionally, we provide a robust protection strategy by counterintuitively minimizing the semantic loss, which can assist in generating more natural perturbations. Finally, we conduct extensive experiments to substantiate our findings and comprehensively evaluate our newly proposed strategies. We hope our insights and protective measures can contribute to better defense against malicious diffusion-based mimicry, advancing the development of secure AI systems. The code is available in https://github.com/xavihart/Diff-Protect

Related papers

An h-space Based Adversarial Attack for Protection Against Few-shot Personalization [5.357486699062561]
We propose a novel anti-customization approach, called HAAD, that leverages adversarial attacks to craft perturbations based on the h-space.<n>We introduce a more efficient variant, HAAD-KV, that constructs perturbations solely based on the KV parameters of the h-space.<n>Despite their simplicity, our methods outperform state-of-the-art adversarial attacks, highlighting their effectiveness.
arXiv Detail & Related papers (2025-07-23T14:43:22Z)
Active Adversarial Noise Suppression for Image Forgery Localization [56.98050814363447]
We introduce an Adversarial Noise Suppression Module (ANSM) that generate a defensive perturbation to suppress the attack effect of adversarial noise.<n>To our best knowledge, this is the first report of adversarial defense in image forgery localization tasks.
arXiv Detail & Related papers (2025-06-15T14:53:27Z)
MISLEADER: Defending against Model Extraction with Ensembles of Distilled Models [56.09354775405601]
Model extraction attacks aim to replicate the functionality of a black-box model through query access.<n>Most existing defenses presume that attacker queries have out-of-distribution (OOD) samples, enabling them to detect and disrupt suspicious inputs.<n>We propose MISLEADER, a novel defense strategy that does not rely on OOD assumptions.
arXiv Detail & Related papers (2025-06-03T01:37:09Z)
CAT: Contrastive Adversarial Training for Evaluating the Robustness of Protective Perturbations in Latent Diffusion Models [15.363134355805764]
Adversarial examples as protective perturbations have been developed to defend against unauthorized data usage. We propose the Contrastive Adversarial Training (CAT) utilizing adapters as an adaptive attack against these protection methods.
arXiv Detail & Related papers (2025-02-11T03:35:35Z)
Nearly Zero-Cost Protection Against Mimicry by Personalized Diffusion Models [9.548195579003897]
We introduce pre-training to reduce latency and propose a mixture-of-perturbations approach to minimize performance degradation. Our novel training strategy computes protection loss across multiple VAE feature spaces, while adaptive targeted protection at inference enhances robustness. Experiments show comparable protection performance with improved invisibility and drastically reduced inference time.
arXiv Detail & Related papers (2024-12-16T03:46:45Z)
CALoR: Towards Comprehensive Model Inversion Defense [43.2642796582236]
Model Inversion Attacks (MIAs) aim at recovering privacy-sensitive training data from the knowledge encoded in released machine learning models. Recent advances in the MIA field have significantly enhanced the attack performance under multiple scenarios. We propose a robust defense mechanism, integrating Confidence Adaptation and Low-Rank compression.
arXiv Detail & Related papers (2024-10-08T08:44:01Z)
DiffusionGuard: A Robust Defense Against Malicious Diffusion-based Image Editing [93.45507533317405]
DiffusionGuard is a robust and effective defense method against unauthorized edits by diffusion-based image editing models. We introduce a novel objective that generates adversarial noise targeting the early stage of the diffusion process. We also introduce a mask-augmentation technique to enhance robustness against various masks during test time.
arXiv Detail & Related papers (2024-10-08T05:19:19Z)
Celtibero: Robust Layered Aggregation for Federated Learning [0.0]
We introduce Celtibero, a novel defense mechanism that integrates layered aggregation to enhance robustness against adversarial manipulation. We demonstrate that Celtibero consistently achieves high main task accuracy (MTA) while maintaining minimal attack success rates (ASR) across a range of untargeted and targeted poisoning attacks.
arXiv Detail & Related papers (2024-08-26T12:54:00Z)
Pixel Is Not A Barrier: An Effective Evasion Attack for Pixel-Domain Diffusion Models [9.905296922309157]
Diffusion Models have emerged as powerful generative models for high-quality image synthesis, with many subsequent image editing techniques based on them. Previous works have attempted to safeguard images from diffusion-based editing by adding imperceptible perturbations. Our work proposes a novel attacking framework with a feature representation attack loss that exploits vulnerabilities in denoising UNets and a latent optimization strategy to enhance the naturalness of protected images.
arXiv Detail & Related papers (2024-08-21T17:56:34Z)
PID: Prompt-Independent Data Protection Against Latent Diffusion Models [32.1299481922554]
Given the vast amount of personal images accessible online, this capability raises critical concerns about civil privacy. We propose a simple yet effective method called textbfPrompt-Independent Defense (PID) to safeguard privacy against LDMs.
arXiv Detail & Related papers (2024-06-14T11:56:42Z)
Watch the Watcher! Backdoor Attacks on Security-Enhancing Diffusion Models [65.30406788716104]
This work investigates the vulnerabilities of security-enhancing diffusion models. We demonstrate that these models are highly susceptible to DIFF2, a simple yet effective backdoor attack. Case studies show that DIFF2 can significantly reduce both post-purification and certified accuracy across benchmark datasets and models.
arXiv Detail & Related papers (2024-06-14T02:39:43Z)
MirrorCheck: Efficient Adversarial Defense for Vision-Language Models [55.73581212134293]
We propose a novel, yet elegantly simple approach for detecting adversarial samples in Vision-Language Models. Our method leverages Text-to-Image (T2I) models to generate images based on captions produced by target VLMs. Empirical evaluations conducted on different datasets validate the efficacy of our approach.
arXiv Detail & Related papers (2024-06-13T15:55:04Z)
Lazy Layers to Make Fine-Tuned Diffusion Models More Traceable [70.77600345240867]
A novel arbitrary-in-arbitrary-out (AIAO) strategy makes watermarks resilient to fine-tuning-based removal. Unlike the existing methods of designing a backdoor for the input/output space of diffusion models, in our method, we propose to embed the backdoor into the feature space of sampled subpaths. Our empirical studies on the MS-COCO, AFHQ, LSUN, CUB-200, and DreamBooth datasets confirm the robustness of AIAO.
arXiv Detail & Related papers (2024-05-01T12:03:39Z)
Pixel is a Barrier: Diffusion Models Are More Adversarially Robust Than We Think [14.583181596370386]
Adversarial examples for diffusion models are widely used as solutions for safety concerns. This may mislead us to think that the diffusion models are vulnerable to adversarial attacks like most deep models. In this paper, we show novel findings that: even though gradient-based white-box attacks can be used to attack the LDMs, they fail to attack PDMs.
arXiv Detail & Related papers (2024-04-20T08:28:43Z)
Avoid Adversarial Adaption in Federated Learning by Multi-Metric Investigations [55.2480439325792]
Federated Learning (FL) facilitates decentralized machine learning model training, preserving data privacy, lowering communication costs, and boosting model performance through diversified data sources. FL faces vulnerabilities such as poisoning attacks, undermining model integrity with both untargeted performance degradation and targeted backdoor attacks. We define a new notion of strong adaptive adversaries, capable of adapting to multiple objectives simultaneously. MESAS is the first defense robust against strong adaptive adversaries, effective in real-world data scenarios, with an average overhead of just 24.37 seconds.
arXiv Detail & Related papers (2023-06-06T11:44:42Z)
DiffProtect: Generate Adversarial Examples with Diffusion Models for Facial Privacy Protection [64.77548539959501]
DiffProtect produces more natural-looking encrypted images than state-of-the-art methods. It achieves significantly higher attack success rates, e.g., 24.5% and 25.1% absolute improvements on the CelebA-HQ and FFHQ datasets.
arXiv Detail & Related papers (2023-05-23T02:45:49Z)

This list is automatically generated from the titles and abstracts of the papers in this site.