Perturb a Model, Not an Image: Towards Robust Privacy Protection via Anti-Personalized Diffusion Models
- URL: http://arxiv.org/abs/2511.01307v1
- Date: Mon, 03 Nov 2025 07:42:05 GMT
- Title: Perturb a Model, Not an Image: Towards Robust Privacy Protection via Anti-Personalized Diffusion Models
- Authors: Tae-Young Lee, Juwon Seo, Jong Hwan Ko, Gyeong-Moon Park,
- Abstract summary: Recent advances in diffusion models have enabled high-quality synthesis of specific subjects, such as identities or objects.<n>Personalization techniques can be misused by malicious users to generate unauthorized content.<n>We introduce Direct Protective Optimization (DPO), a novel loss function that effectively disrupts subject personalization in the target model without compromising generative quality.
- Score: 32.903448192224644
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Recent advances in diffusion models have enabled high-quality synthesis of specific subjects, such as identities or objects. This capability, while unlocking new possibilities in content creation, also introduces significant privacy risks, as personalization techniques can be misused by malicious users to generate unauthorized content. Although several studies have attempted to counter this by generating adversarially perturbed samples designed to disrupt personalization, they rely on unrealistic assumptions and become ineffective in the presence of even a few clean images or under simple image transformations. To address these challenges, we shift the protection target from the images to the diffusion model itself to hinder the personalization of specific subjects, through our novel framework called Anti-Personalized Diffusion Models (APDM). We first provide a theoretical analysis demonstrating that a naive approach of existing loss functions to diffusion models is inherently incapable of ensuring convergence for robust anti-personalization. Motivated by this finding, we introduce Direct Protective Optimization (DPO), a novel loss function that effectively disrupts subject personalization in the target model without compromising generative quality. Moreover, we propose a new dual-path optimization strategy, coined Learning to Protect (L2P). By alternating between personalization and protection paths, L2P simulates future personalization trajectories and adaptively reinforces protection at each step. Experimental results demonstrate that our framework outperforms existing methods, achieving state-of-the-art performance in preventing unauthorized personalization. The code is available at https://github.com/KU-VGI/APDM.
Related papers
- Deep Leakage with Generative Flow Matching Denoiser [54.05993847488204]
We introduce a new deep leakage (DL) attack that integrates a generative Flow Matching (FM) prior into the reconstruction process.<n>Our approach consistently outperforms state-of-the-art attacks across pixel-level, perceptual, and feature-based similarity metrics.
arXiv Detail & Related papers (2026-01-21T14:51:01Z) - Adapter Shield: A Unified Framework with Built-in Authentication for Preventing Unauthorized Zero-Shot Image-to-Image Generation [74.5813283875938]
Zero-shot image-to-image generation poses substantial risks related to intellectual property violations.<n>This work presents Adapter Shield, the first universal and authentication-integrated solution aimed at defending personal images from misuse.<n>Our method surpasses existing state-of-the-art defenses in blocking unauthorized zero-shot image synthesis.
arXiv Detail & Related papers (2025-11-25T04:49:16Z) - DLADiff: A Dual-Layer Defense Framework against Fine-Tuning and Zero-Shot Customization of Diffusion Models [74.9510349783152]
Malicious actors can exploit diffusion model customization with just a few or even one image of a person to create synthetic identities nearly identical to the original identity.<n>This paper proposes Dual-Layer Anti-Diffusion (DLADiff) to defend both fine-tuning methods and zero-shot methods.
arXiv Detail & Related papers (2025-11-25T04:35:55Z) - Latent Diffusion Unlearning: Protecting Against Unauthorized Personalization Through Trajectory Shifted Perturbations [18.024767641200064]
We propose a model-based perturbation strategy that operates within the latent space of diffusion models.<n>Our method alternates between denoising and inversion while modifying the starting point of the denoising trajectory: of diffusion models.<n>We validate our approach on four benchmark datasets to demonstrate robustness against state-of-the-art inversion attacks.
arXiv Detail & Related papers (2025-10-03T15:18:45Z) - Holmes: Towards Effective and Harmless Model Ownership Verification to Personalized Large Vision Models via Decoupling Common Features [54.63343151319368]
This paper proposes a harmless model ownership verification method for personalized models by decoupling similar common features.<n>In the first stage, we create shadow models that retain common features of the victim model while disrupting dataset-specific features.<n>After that, a meta-classifier is trained to identify stolen models by determining whether suspicious models contain the dataset-specific features of the victim.
arXiv Detail & Related papers (2025-06-24T15:40:11Z) - Privacy Protection Against Personalized Text-to-Image Synthesis via Cross-image Consistency Constraints [9.385284914809294]
Cross-image Anti-Personalization (CAP) is a novel framework that enhances resistance to personalization by enforcing style consistency across perturbed images.<n>We develop a dynamic ratio adjustment strategy that adaptively balances the impact of the consistency loss throughout the attack iterations.
arXiv Detail & Related papers (2025-04-17T08:39:32Z) - PersGuard: Preventing Malicious Personalization via Backdoor Attacks on Pre-trained Text-to-Image Diffusion Models [51.458089902581456]
We introduce PersGuard, a novel backdoor-based approach that prevents malicious personalization of specific images.<n>Our method significantly outperforms existing techniques, offering a more robust solution for privacy and copyright protection.
arXiv Detail & Related papers (2025-02-22T09:47:55Z) - Privacy Protection in Personalized Diffusion Models via Targeted Cross-Attention Adversarial Attack [5.357486699062561]
We propose a novel and efficient adversarial attack method, Concept Protection by Selective Attention Manipulation (CoPSAM)
For this purpose, we carefully construct an imperceptible noise to be added to clean samples to get their adversarial counterparts.
Experimental validation on a subset of CelebA-HQ face images dataset demonstrates that our approach outperforms existing methods.
arXiv Detail & Related papers (2024-11-25T14:39:18Z) - DDAP: Dual-Domain Anti-Personalization against Text-to-Image Diffusion Models [18.938687631109925]
Diffusion-based personalized visual content generation technologies have achieved significant breakthroughs.
However, when misused to fabricate fake news or unsettling content targeting individuals, these technologies could cause considerable societal harm.
This paper introduces a novel Dual-Domain Anti-Personalization framework (DDAP)
By alternating between these two methods, we construct the DDAP framework, effectively harnessing the strengths of both domains.
arXiv Detail & Related papers (2024-07-29T16:11:21Z) - Adv-Diffusion: Imperceptible Adversarial Face Identity Attack via Latent
Diffusion Model [61.53213964333474]
We propose a unified framework Adv-Diffusion that can generate imperceptible adversarial identity perturbations in the latent space but not the raw pixel space.
Specifically, we propose the identity-sensitive conditioned diffusion generative model to generate semantic perturbations in the surroundings.
The designed adaptive strength-based adversarial perturbation algorithm can ensure both attack transferability and stealthiness.
arXiv Detail & Related papers (2023-12-18T15:25:23Z) - Data Forensics in Diffusion Models: A Systematic Analysis of Membership
Privacy [62.16582309504159]
We develop a systematic analysis of membership inference attacks on diffusion models and propose novel attack methods tailored to each attack scenario.
Our approach exploits easily obtainable quantities and is highly effective, achieving near-perfect attack performance (>0.9 AUCROC) in realistic scenarios.
arXiv Detail & Related papers (2023-02-15T17:37:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.