Degeneration-Tuning: Using Scrambled Grid shield Unwanted Concepts from
Stable Diffusion
- URL: http://arxiv.org/abs/2308.02552v2
- Date: Tue, 8 Aug 2023 01:30:26 GMT
- Title: Degeneration-Tuning: Using Scrambled Grid shield Unwanted Concepts from
Stable Diffusion
- Authors: Zixuan Ni, Longhui Wei, Jiacheng Li, Siliang Tang, Yueting Zhuang, Qi
Tian
- Abstract summary: We propose a novel strategy named Degeneration-Tuning (DT) to shield contents of unwanted concepts from SD weights.
As this adaptation occurs at the level of the model's weights, the SD, after DT, can be grafted onto other conditional diffusion frameworks like ControlNet to shield unwanted concepts.
- Score: 106.42918868850249
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Owing to the unrestricted nature of the content in the training data, large
text-to-image diffusion models, such as Stable Diffusion (SD), are capable of
generating images with potentially copyrighted or dangerous content based on
corresponding textual concepts information. This includes specific intellectual
property (IP), human faces, and various artistic styles. However, Negative
Prompt, a widely used method for content removal, frequently fails to conceal
this content due to inherent limitations in its inference logic. In this work,
we propose a novel strategy named Degeneration-Tuning (DT) to shield
contents of unwanted concepts from SD weights. By utilizing Scrambled Grid to
reconstruct the correlation between undesired concepts and their corresponding
image domain, we guide SD to generate meaningless content when such textual
concepts are provided as input. As this adaptation occurs at the level of the
model's weights, the SD, after DT, can be grafted onto other conditional
diffusion frameworks like ControlNet to shield unwanted concepts. In addition
to qualitatively showcasing the effectiveness of our DT method in protecting
various types of concepts, a quantitative comparison of the SD before and after
DT indicates that the DT method does not significantly impact the generative
quality of other contents. The FID and IS scores of the model on COCO-30K
exhibit only minor changes after DT, shifting from 12.61 and 39.20 to 13.04 and
38.25, respectively, which clearly outperforms the previous methods.
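The abstract does not say how a Scrambled Grid is built. The sketch below is one plausible reading, not the authors' implementation: it assumes the operation cuts an image into a regular grid of cells and randomly permutes them, so that local texture survives but the overall content becomes meaningless. The function name `scramble_grid` and the parameters `grid` and `seed` are hypothetical, and the image is a plain list of pixel rows to keep the example dependency-free.

```python
import random


def scramble_grid(image, grid, seed=None):
    """Split a 2-D image (list of rows) into grid x grid cells and shuffle them.

    Assumption: an illustrative reading of "Scrambled Grid"; the abstract
    does not give the exact procedure.
    """
    height, width = len(image), len(image[0])
    if height % grid or width % grid:
        raise ValueError("image sides must be divisible by grid")
    cell_h, cell_w = height // grid, width // grid

    # Cut the image into cells, in row-major order.
    cells = []
    for gy in range(grid):
        for gx in range(grid):
            cell = [row[gx * cell_w:(gx + 1) * cell_w]
                    for row in image[gy * cell_h:(gy + 1) * cell_h]]
            cells.append(cell)

    # Randomly permute the cells (seeded for reproducibility).
    rng = random.Random(seed)
    rng.shuffle(cells)

    # Stitch the permuted cells back into a full image.
    out = []
    for gy in range(grid):
        row_cells = cells[gy * grid:(gy + 1) * grid]
        for y in range(cell_h):
            out_row = []
            for cell in row_cells:
                out_row.extend(cell[y])
            out.append(out_row)
    return out


# Toy 4x4 "image" whose pixel values are 0..15, scrambled as a 2x2 grid.
image = [[y * 4 + x for x in range(4)] for y in range(4)]
scrambled = scramble_grid(image, grid=2, seed=0)
```

Under this reading, scrambled versions of images of an unwanted concept would be paired with that concept's text prompt during fine-tuning, so the model learns to associate the prompt with meaningless content. How the pairs are built and which loss is used is not stated in the abstract.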
Related papers
- Erasing Undesirable Concepts in Diffusion Models with Adversarial Preservation [22.3077678575067]
Diffusion models excel at generating visually striking content from text but can inadvertently produce undesirable or harmful content when trained on unfiltered internet data.
We propose identifying and preserving the concepts most affected by parameter changes, termed adversarial concepts.
We demonstrate the effectiveness of our method using the Stable Diffusion model, showing that it outperforms state-of-the-art erasure methods in eliminating unwanted content.
arXiv Detail & Related papers (2024-10-21T03:40:29Z)
- EIUP: A Training-Free Approach to Erase Non-Compliant Concepts Conditioned on Implicit Unsafe Prompts [32.590822043053734]
Non-toxic text still carries a risk of generating non-compliant images, which is referred to as implicit unsafe prompts.
We propose a simple yet effective approach that incorporates non-compliant concepts into an erasure prompt.
Our method exhibits superior erasure effectiveness while achieving high scores in image fidelity.
arXiv Detail & Related papers (2024-08-02T05:17:14Z)
- Reliable and Efficient Concept Erasure of Text-to-Image Diffusion Models [76.39651111467832]
We introduce Reliable and Efficient Concept Erasure (RECE), a novel approach that modifies the model in 3 seconds without necessitating additional fine-tuning.
To mitigate inappropriate content potentially represented by derived embeddings, RECE aligns them with harmless concepts in cross-attention layers.
The derivation and erasure of new representation embeddings are conducted iteratively to achieve a thorough erasure of inappropriate concepts.
arXiv Detail & Related papers (2024-07-17T08:04:28Z)
- Six-CD: Benchmarking Concept Removals for Benign Text-to-image Diffusion Models [58.74606272936636]
Text-to-image (T2I) diffusion models have shown exceptional capabilities in generating images that closely correspond to textual prompts.
The models could be exploited for malicious purposes, such as generating images with violence or nudity, or creating unauthorized portraits of public figures in inappropriate contexts.
Concept removal methods have been proposed to modify diffusion models to prevent the generation of malicious and unwanted concepts.
arXiv Detail & Related papers (2024-06-21T03:58:44Z)
- CosalPure: Learning Concept from Group Images for Robust Co-Saliency Detection [22.82243087156918]
Co-salient object detection (CoSOD) aims to identify the common and salient (usually in the foreground) regions across a given group of images.
Existing CoSOD methods can be easily affected by adversarial perturbations, leading to substantial accuracy reduction.
We propose a novel robustness enhancement framework by first learning the concept of the co-salient objects based on the input group images.
arXiv Detail & Related papers (2024-03-27T13:33:14Z)
- Contrastive Denoising Score for Text-guided Latent Diffusion Image Editing [58.48890547818074]
We present a powerful modification of Contrastive Denoising Score (CUT) for latent diffusion models (LDM).
Our approach enables zero-shot image-to-image translation and neural field (NeRF) editing, achieving structural correspondence between the input and output.
arXiv Detail & Related papers (2023-11-30T15:06:10Z)
- Noise-Free Score Distillation [78.79226724549456]
The Noise-Free Score Distillation (NFSD) process requires minimal modifications to the original SDS framework.
We achieve more effective distillation of pre-trained text-to-image diffusion models while using a nominal CFG scale.
arXiv Detail & Related papers (2023-10-26T17:12:26Z)
- Towards Safe Self-Distillation of Internet-Scale Text-to-Image Diffusion Models [63.20512617502273]
We propose a method called SDD to prevent problematic content generation in text-to-image diffusion models.
Our method eliminates a much greater proportion of harmful content from the generated images without degrading the overall image quality.
arXiv Detail & Related papers (2023-07-12T07:48:29Z)