An h-space Based Adversarial Attack for Protection Against Few-shot Personalization
- URL: http://arxiv.org/abs/2507.17554v1
- Date: Wed, 23 Jul 2025 14:43:22 GMT
- Title: An h-space Based Adversarial Attack for Protection Against Few-shot Personalization
- Authors: Xide Xu, Sandesh Kamath, Muhammad Atif Butt, Bogdan Raducanu
- Abstract summary: We propose a novel anti-customization approach, called HAAD, that leverages adversarial attacks to craft perturbations based on the h-space. We introduce a more efficient variant, HAAD-KV, that constructs perturbations solely based on the KV parameters of the h-space. Despite their simplicity, our methods outperform state-of-the-art adversarial attacks, highlighting their effectiveness.
- Score: 5.357486699062561
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The versatility of diffusion models in generating customized images from few samples raises significant privacy concerns, particularly regarding unauthorized modifications of private content. This concern has renewed efforts to develop protection mechanisms based on adversarial attacks, which generate effective perturbations to poison diffusion models. Our work is motivated by the observation that these models exhibit a high degree of abstraction within their semantic latent space (`h-space'), which encodes critical high-level features for generating coherent and meaningful content. In this paper, we propose a novel anti-customization approach, called HAAD (h-space based Adversarial Attack for Diffusion models), that leverages adversarial attacks to craft perturbations based on the h-space that can efficiently degrade the image generation process. Building upon HAAD, we further introduce a more efficient variant, HAAD-KV, that constructs perturbations solely based on the KV parameters of the h-space. This strategy offers stronger protection while being computationally less expensive. Despite their simplicity, our methods outperform state-of-the-art adversarial attacks, highlighting their effectiveness.
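The abstract does not give implementation details, but the core idea (crafting an image perturbation that disrupts the UNet bottleneck, i.e. h-space, features) can be illustrated with a minimal PGD-style sketch. This is not the authors' method: the Stable Diffusion checkpoint, the MSE feature-divergence loss, the fixed diffusion timestep, and the epsilon/step-size budget are all assumptions made for illustration, and the HAAD-KV variant, which would restrict the objective to the K/V projections of the bottleneck cross-attention, is omitted.

```python
# Illustrative sketch only: push the UNet mid-block ("h-space") features of a
# protected image away from those of the clean image under an L_inf budget.
# Checkpoint, loss, timestep, and hyperparameters are assumptions, not the paper's.
import torch
from diffusers import StableDiffusionPipeline

device = "cuda" if torch.cuda.is_available() else "cpu"
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float32
).to(device)
pipe.vae.requires_grad_(False)
pipe.unet.requires_grad_(False)
pipe.text_encoder.requires_grad_(False)

# Capture the h-space (UNet bottleneck) activations with a forward hook.
h_feats = {}
pipe.unet.mid_block.register_forward_hook(
    lambda module, inp, out: h_feats.__setitem__("h", out)
)

def h_space(img, t, text_emb, noise):
    """Return mid-block features for an image tensor in [-1, 1]."""
    latents = pipe.vae.encode(img).latent_dist.mean * pipe.vae.config.scaling_factor
    noisy = pipe.scheduler.add_noise(latents, noise, t)
    pipe.unet(noisy, t, encoder_hidden_states=text_emb)  # hook fills h_feats
    return h_feats["h"]

def haad_like_attack(img, eps=8 / 255, step=1 / 255, iters=50, t_step=200):
    """img: (1, 3, H, W) tensor in [-1, 1]; returns the perturbed image."""
    ids = pipe.tokenizer(
        "", padding="max_length",
        max_length=pipe.tokenizer.model_max_length, return_tensors="pt"
    ).input_ids.to(device)
    text_emb = pipe.text_encoder(ids)[0]
    t = torch.tensor([t_step], device=device)
    noise = torch.randn((1, 4, img.shape[-2] // 8, img.shape[-1] // 8), device=device)

    with torch.no_grad():
        h_clean = h_space(img, t, text_emb, noise)

    delta = torch.zeros_like(img, requires_grad=True)
    for _ in range(iters):
        h_adv = h_space((img + delta).clamp(-1, 1), t, text_emb, noise)
        loss = torch.nn.functional.mse_loss(h_adv, h_clean)  # feature divergence
        loss.backward()
        with torch.no_grad():
            delta += step * delta.grad.sign()  # gradient ascent: push features apart
            delta.clamp_(-eps, eps)            # stay within the L_inf budget
            delta.grad = None
    return (img + delta).clamp(-1, 1).detach()
```

A personalization pipeline (e.g. DreamBooth-style fine-tuning) trained on images protected this way would, under the paper's premise, receive corrupted bottleneck features and produce degraded generations; the exact objective and attack schedule used by HAAD/HAAD-KV are described in the paper itself.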
Related papers
- Active Adversarial Noise Suppression for Image Forgery Localization [56.98050814363447]
We introduce an Adversarial Noise Suppression Module (ANSM) that generates a defensive perturbation to suppress the attack effect of adversarial noise. To the best of our knowledge, this is the first report of adversarial defense in image forgery localization tasks.
arXiv Detail & Related papers (2025-06-15T14:53:27Z) - MISLEADER: Defending against Model Extraction with Ensembles of Distilled Models [56.09354775405601]
Model extraction attacks aim to replicate the functionality of a black-box model through query access. Most existing defenses presume that attacker queries contain out-of-distribution (OOD) samples, enabling them to detect and disrupt suspicious inputs. We propose MISLEADER, a novel defense strategy that does not rely on OOD assumptions.
arXiv Detail & Related papers (2025-06-03T01:37:09Z) - Backdoor Defense in Diffusion Models via Spatial Attention Unlearning [0.0]
Text-to-image diffusion models are increasingly vulnerable to backdoor attacks. We propose Spatial Attention Unlearning (SAU), a novel technique for mitigating backdoor attacks in diffusion models.
arXiv Detail & Related papers (2025-04-21T04:00:19Z) - MMAD-Purify: A Precision-Optimized Framework for Efficient and Scalable Multi-Modal Attacks [21.227398434694724]
We introduce an innovative framework that incorporates a precision-optimized noise predictor to enhance the effectiveness of our attack framework.
Our framework provides a cutting-edge solution for multi-modal adversarial attacks, ensuring reduced latency.
We demonstrate that our framework achieves outstanding transferability and robustness against purification defenses.
arXiv Detail & Related papers (2024-10-17T23:52:39Z) - Pixel Is Not a Barrier: An Effective Evasion Attack for Pixel-Domain Diffusion Models [9.905296922309157]
Diffusion Models have emerged as powerful generative models for high-quality image synthesis, with many subsequent image editing techniques based on them. Previous works have attempted to safeguard images from diffusion-based editing by adding imperceptible perturbations. Our work proposes a novel attack framework, AtkPDM, which exploits vulnerabilities in denoising UNets and a latent optimization strategy to enhance the naturalness of adversarial images.
arXiv Detail & Related papers (2024-08-21T17:56:34Z) - Adv-Diffusion: Imperceptible Adversarial Face Identity Attack via Latent Diffusion Model [61.53213964333474]
We propose a unified framework Adv-Diffusion that can generate imperceptible adversarial identity perturbations in the latent space but not the raw pixel space.
Specifically, we propose the identity-sensitive conditioned diffusion generative model to generate semantic perturbations in the surroundings.
The designed adaptive strength-based adversarial perturbation algorithm can ensure both attack transferability and stealthiness.
arXiv Detail & Related papers (2023-12-18T15:25:23Z) - Toward effective protection against diffusion based mimicry through score distillation [15.95715097030366]
Efforts have been made to add perturbations to protect images from diffusion-based mimicry pipelines.
Most existing methods are largely ineffective and often impractical for individual users.
We present novel findings on attacking latent diffusion models and propose new plug-and-play strategies for more effective protection.
arXiv Detail & Related papers (2023-10-02T18:56:12Z) - LEAT: Towards Robust Deepfake Disruption in Real-World Scenarios via Latent Ensemble Attack [11.764601181046496]
Deepfakes, malicious visual content created by generative models, pose an increasingly harmful threat to society.
To proactively mitigate deepfake damages, recent studies have employed adversarial perturbation to disrupt deepfake model outputs.
We propose a simple yet effective disruption method called Latent Ensemble ATtack (LEAT), which attacks the independent latent encoding process.
arXiv Detail & Related papers (2023-07-04T07:00:37Z) - Data Forensics in Diffusion Models: A Systematic Analysis of Membership Privacy [62.16582309504159]
We develop a systematic analysis of membership inference attacks on diffusion models and propose novel attack methods tailored to each attack scenario.
Our approach exploits easily obtainable quantities and is highly effective, achieving near-perfect attack performance (>0.9 AUCROC) in realistic scenarios.
arXiv Detail & Related papers (2023-02-15T17:37:49Z) - Adv-Attribute: Inconspicuous and Transferable Adversarial Attack on Face Recognition [111.1952945740271]
Adversarial Attributes (Adv-Attribute) is designed to generate inconspicuous and transferable attacks on face recognition.
Experiments on the FFHQ and CelebA-HQ datasets show that the proposed Adv-Attribute method achieves state-of-the-art attack success rates.
arXiv Detail & Related papers (2022-10-13T09:56:36Z) - CARBEN: Composite Adversarial Robustness Benchmark [70.05004034081377]
This paper demonstrates how a composite adversarial attack (CAA) affects the resulting image. It provides real-time inference for different models, helping users configure the attack-level parameters.
A leaderboard to benchmark adversarial robustness against CAA is also introduced.
arXiv Detail & Related papers (2022-07-16T01:08:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.