FreezeAsGuard: Mitigating Illegal Adaptation of Diffusion Models via Selective Tensor Freezing
- URL: http://arxiv.org/abs/2405.17472v2
- Date: Wed, 27 Nov 2024 04:43:01 GMT
- Title: FreezeAsGuard: Mitigating Illegal Adaptation of Diffusion Models via Selective Tensor Freezing
- Authors: Kai Huang, Haoming Wang, Wei Gao
- Abstract summary: We present FreezeAsGuard, a technique that enables irreversible mitigation of illegal adaptations of diffusion models.
Experiment results show that FreezeAsGuard is 37% more effective at mitigating illegal model adaptations than competitive baselines.
- Score: 9.598086319369694
- License:
- Abstract: Text-to-image diffusion models can be fine-tuned in custom domains to adapt to specific user preferences, but such adaptability has also been exploited for illegal purposes, such as forging public figures' portraits, duplicating copyrighted artworks, and generating explicit content. Existing work focuses on detecting illegally generated content but cannot prevent or mitigate illegal adaptations of diffusion models. Other schemes, such as model unlearning and reinitialization, similarly cannot prevent users from relearning the knowledge needed for illegal adaptation with custom data. In this paper, we present FreezeAsGuard, a new technique that addresses these limitations and enables irreversible mitigation of illegal adaptations of diffusion models. Our approach is to have the model publisher selectively freeze tensors in the pre-trained diffusion model that are critical to illegal adaptations, limiting the fine-tuned model's representation power in illegal adaptations while minimizing the impact on other, legal adaptations. Experiment results in multiple text-to-image application domains show that FreezeAsGuard is 37% more effective at mitigating illegal model adaptations than competitive baselines, while incurring less than 5% impact on legal model adaptations. The source code is available at: https://github.com/pittisl/FreezeAsGuard.
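The freezing mechanism itself is straightforward to express in code: before fine-tuning, the publisher disables gradients on the selected tensors so that no optimizer update can alter them. Below is a minimal PyTorch/diffusers sketch of that mechanism only; the model checkpoint, the `frozen_tensor_names` set, and the `attn2` selection criterion are illustrative placeholders, since the paper's contribution is learning which tensors to freeze rather than picking them by hand.

```python
import torch
from diffusers import UNet2DConditionModel

# Load the UNet of a pre-trained text-to-image diffusion model
# (Stable Diffusion v1.5 is used here purely as an illustration).
unet = UNet2DConditionModel.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="unet"
)

# Hypothetical freezing mask: the set of tensor names the publisher decides
# to freeze. FreezeAsGuard learns this selection from data; the "attn2"
# criterion below is only a placeholder.
frozen_tensor_names = {
    name
    for name, _ in unet.named_parameters()
    if "attn2" in name
}

# Freezing a tensor means disabling its gradient, so fine-tuning cannot change it.
for name, param in unet.named_parameters():
    param.requires_grad = name not in frozen_tensor_names

# Only the unfrozen tensors are handed to the optimizer, so any subsequent
# fine-tuning (legal or not) leaves the frozen, adaptation-critical weights intact.
trainable_params = [p for p in unet.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable_params, lr=1e-5)
```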
Related papers
- SleeperMark: Towards Robust Watermark against Fine-Tuning Text-to-image Diffusion Models [77.80595722480074]
SleeperMark is a novel framework designed to embed resilient watermarks into T2I diffusion models.
It guides the model to disentangle the watermark information from the semantic concepts it learns, so that the embedded watermark is retained even after downstream fine-tuning.
Our experiments demonstrate the effectiveness of SleeperMark across various types of diffusion models.
arXiv Detail & Related papers (2024-12-06T08:44:18Z)
- Safety Alignment Backfires: Preventing the Re-emergence of Suppressed Concepts in Fine-tuned Text-to-Image Diffusion Models
Fine-tuning text-to-image diffusion models can inadvertently undo safety measures, causing models to relearn harmful concepts.
We present a novel but immediate solution called Modular LoRA, which involves training Safety Low-Rank Adaptation modules separately from Fine-Tuning LoRA components.
This method effectively prevents the re-learning of harmful content without compromising the model's performance on new tasks.
arXiv Detail & Related papers (2024-11-30T04:37:38Z)
- Risks When Sharing LoRA Fine-Tuned Diffusion Model Weights [0.10878040851638002]
We study the issue of privacy leakage of a fine-tuned diffusion model in a practical setting.
An adversary can generate images containing the same identities as the private images.
arXiv Detail & Related papers (2024-09-13T02:13:26Z)
- Gaussian Shading: Provable Performance-Lossless Image Watermarking for Diffusion Models [71.13610023354967]
Copyright protection and inappropriate content generation pose challenges for the practical implementation of diffusion models.
We propose a diffusion model watermarking technique that is both performance-lossless and training-free.
arXiv Detail & Related papers (2024-04-07T13:30:10Z)
- IMMA: Immunizing text-to-image Models against Malicious Adaptation [11.912092139018885]
Open-sourced text-to-image models and fine-tuning methods have led to the increasing risk of malicious adaptation, i.e., fine-tuning to generate harmful/unauthorized content.
We propose to "immunize" the model by learning model parameters that are difficult for adaptation methods to exploit when fine-tuning on malicious content; in short, IMMA.
Empirical results show IMMA's effectiveness against malicious adaptations, including mimicking artistic styles and learning inappropriate/unauthorized content.
arXiv Detail & Related papers (2023-11-30T18:55:16Z)
- Diffusion-TTA: Test-time Adaptation of Discriminative Models via Generative Feedback [97.0874638345205]
Generative models can be great test-time adapters for discriminative models.
Our method, Diffusion-TTA, adapts pre-trained discriminative models to each unlabelled example in the test set.
We show Diffusion-TTA significantly enhances the accuracy of various large-scale pre-trained discriminative models.
arXiv Detail & Related papers (2023-11-27T18:59:53Z)
- DIAGNOSIS: Detecting Unauthorized Data Usages in Text-to-image Diffusion Models [79.71665540122498]
We propose a method for detecting unauthorized data usage by planting injected content into the protected dataset.
Specifically, we modify the protected images by adding unique content to them using stealthy image warping functions.
By analyzing whether a model has memorized the injected content, we can detect models that have illegally used the unauthorized data.
arXiv Detail & Related papers (2023-07-06T16:27:39Z)
- AdaptGuard: Defending Against Universal Attacks for Model Adaptation [129.2012687550069]
We study the vulnerability of model adaptation algorithms to universal attacks transferred from the source domain.
We propose a model preprocessing framework, named AdaptGuard, to improve the security of model adaptation algorithms.
arXiv Detail & Related papers (2023-03-19T07:53:31Z)