Generating images of rare concepts using pre-trained diffusion models
- URL: http://arxiv.org/abs/2304.14530v3
- Date: Wed, 27 Dec 2023 07:42:21 GMT
- Title: Generating images of rare concepts using pre-trained diffusion models
- Authors: Dvir Samuel, Rami Ben-Ari, Simon Raviv, Nir Darshan, Gal Chechik
- Abstract summary: Text-to-image diffusion models can synthesize high-quality images, but they have various limitations.
We show that these limitations are partly due to the long-tail nature of their training data.
We show that rare concepts can be correctly generated by carefully selecting suitable generation seeds in the noise space.
- Score: 32.5337654536764
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Text-to-image diffusion models can synthesize high-quality images, but they
have various limitations. Here we highlight a common failure mode of these
models, namely, failing to generate uncommon concepts and structured concepts
such as hand palms. We show that this limitation is partly due to the long-tail nature of
their training data: web-crawled data sets are strongly unbalanced, causing
models to under-represent concepts from the tail of the distribution. We
characterize the effect of unbalanced training data on text-to-image models and
offer a remedy. We show that rare concepts can be correctly generated by
carefully selecting suitable generation seeds in the noise space, using a small
reference set of images, a technique that we call SeedSelect. SeedSelect does
not require retraining or finetuning the diffusion model. We assess the
faithfulness, quality and diversity of SeedSelect in creating rare objects and
generating complex formations like hand images, and find it consistently
achieves superior performance. We further show the advantage of SeedSelect in
semantic data augmentation. Generating semantically appropriate images can
successfully improve performance in few-shot recognition benchmarks, for
classes from the head and from the tail of the training data of diffusion
models.
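As a concrete illustration of the seed-selection idea, the sketch below optimizes only the initial noise of a frozen generator so that the generated image moves toward a small reference set in a semantic feature space; the model weights are never updated. The tiny ConvNet generator and encoder are toy stand-ins introduced here for illustration, not the paper's implementation, which operates on a pre-trained text-to-image diffusion model and a pre-trained image feature extractor.

```python
# Minimal sketch (assumptions: toy generator/encoder in place of a frozen
# diffusion sampler and a pre-trained image encoder).
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in for a frozen, differentiable sampler: noise -> image.
generator = nn.Sequential(nn.Conv2d(4, 16, 3, padding=1), nn.ReLU(),
                          nn.Conv2d(16, 3, 3, padding=1), nn.Tanh())
# Stand-in for a frozen semantic feature extractor.
encoder = nn.Sequential(nn.Conv2d(3, 8, 3, stride=2, padding=1), nn.ReLU(),
                        nn.AdaptiveAvgPool2d(1), nn.Flatten())
for p in list(generator.parameters()) + list(encoder.parameters()):
    p.requires_grad_(False)  # the pre-trained networks are never updated

reference_images = torch.rand(4, 3, 32, 32) * 2 - 1   # small reference set
target_feat = encoder(reference_images).mean(dim=0)   # mean reference feature

# The only optimized quantity is the generation seed (initial noise).
seed = torch.randn(1, 4, 32, 32, requires_grad=True)
opt = torch.optim.Adam([seed], lr=0.05)

for step in range(200):
    opt.zero_grad()
    image = generator(seed)
    feat = encoder(image).squeeze(0)
    # Pull the generated image toward the reference set in feature space.
    loss = 1 - torch.cosine_similarity(feat, target_feat, dim=0)
    loss.backward()
    opt.step()

print("final feature distance:", float(loss))
```

Because only the seed is optimized, no retraining or finetuning of the model is needed, matching the property claimed in the abstract.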
Related papers
- Not Every Image is Worth a Thousand Words: Quantifying Originality in Stable Diffusion [21.252145402613472]
This work addresses the challenge of quantifying originality in text-to-image (T2I) generative diffusion models.
We propose a method that leverages textual inversion to measure the originality of an image based on the number of tokens required for its reconstruction by the model.
arXiv Detail & Related papers (2024-08-15T14:42:02Z)
- Training Class-Imbalanced Diffusion Model Via Overlap Optimization [55.96820607533968]
Diffusion models trained on real-world datasets often yield inferior fidelity for tail classes.
Deep generative models, including diffusion models, are biased towards classes with abundant training images.
We propose a method based on contrastive learning to minimize the overlap between distributions of synthetic images for different classes (a minimal sketch of this idea appears after this list).
arXiv Detail & Related papers (2024-02-16T16:47:21Z)
- Large-scale Reinforcement Learning for Diffusion Models [30.164571425479824]
Text-to-image diffusion models are susceptible to implicit biases that arise from web-scale text-image training pairs.
We present an effective, scalable algorithm to improve diffusion models using Reinforcement Learning (RL).
We show how our approach substantially outperforms existing methods for aligning diffusion models with human preferences.
arXiv Detail & Related papers (2024-01-20T08:10:43Z)
- Conditional Image Generation with Pretrained Generative Model [1.4685355149711303]
Diffusion models have gained popularity for their ability to generate higher-quality images than GAN models.
These models require a huge amount of data, computational resources, and meticulous tuning for successful training.
We propose methods to leverage pre-trained unconditional diffusion models with additional guidance for the purpose of conditional image generation.
arXiv Detail & Related papers (2023-12-20T18:27:53Z)
- Aligning Text-to-Image Diffusion Models with Reward Backpropagation [62.45086888512723]
We propose AlignProp, a method that aligns diffusion models to downstream reward functions using end-to-end backpropagation of the reward gradient (see the sketch after this list).
We show AlignProp achieves higher rewards in fewer training steps than alternatives, while being conceptually simpler.
arXiv Detail & Related papers (2023-10-05T17:59:18Z)
- Steered Diffusion: A Generalized Framework for Plug-and-Play Conditional Image Synthesis [62.07413805483241]
Steered Diffusion is a framework for zero-shot conditional image generation using a diffusion model trained for unconditional generation.
We present experiments using steered diffusion on several tasks including inpainting, colorization, text-guided semantic editing, and image super-resolution.
arXiv Detail & Related papers (2023-09-30T02:03:22Z)
- Conditional Generation from Unconditional Diffusion Models using Denoiser Representations [94.04631421741986]
We propose adapting pre-trained unconditional diffusion models to new conditions using the learned internal representations of the denoiser network.
We show that augmenting the Tiny ImageNet training set with synthetic images generated by our approach improves the classification accuracy of ResNet baselines by up to 8%.
arXiv Detail & Related papers (2023-06-02T20:09:57Z)
- Extracting Training Data from Diffusion Models [77.11719063152027]
We show that diffusion models memorize individual images from their training data and emit them at generation time.
With a generate-and-filter pipeline, we extract over a thousand training examples from state-of-the-art models.
We train hundreds of diffusion models in various settings to analyze how different modeling and data decisions affect privacy.
arXiv Detail & Related papers (2023-01-30T18:53:09Z)
- Meta Internal Learning [88.68276505511922]
Internal learning for single-image generation is a framework in which a generator is trained to produce novel images based on a single image.
We propose a meta-learning approach that enables training over a collection of images, in order to model the internal statistics of the sample image more effectively.
Our results show that the models obtained are as suitable as single-image GANs for many common image applications.
arXiv Detail & Related papers (2021-10-06T16:27:38Z)
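The overlap-optimization entry above describes a contrastive objective between classes of synthetic images; a rough, self-contained sketch of such a loss is given below. The random feature tensors stand in for embeddings of generated images and are assumptions for illustration, not the cited paper's code.

```python
# Sketch of a supervised-contrastive-style penalty that pulls same-class
# synthetic-image features together and pushes different-class features apart.
import torch
import torch.nn.functional as F

def class_overlap_loss(features, labels, temperature=0.1):
    feats = F.normalize(features, dim=1)
    sim = feats @ feats.t() / temperature                 # pairwise similarities
    eye = torch.eye(len(labels), dtype=torch.bool)
    logits = sim.masked_fill(eye, float("-inf"))          # ignore self-pairs
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~eye
    # Average log-probability of matching each sample to a same-class sample.
    per_sample = -(log_prob.masked_fill(~pos_mask, 0.0)).sum(1) / pos_mask.sum(1).clamp(min=1)
    return per_sample.mean()

features = torch.randn(16, 128)       # stand-in for embeddings of generated images
labels = torch.randint(0, 4, (16,))   # 4 classes; imbalanced in the real setting
print(class_overlap_loss(features, labels))
```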
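Similarly, the AlignProp entry describes backpropagating a reward gradient end to end through sampling. The toy sketch below shows that pattern with a small differentiable generator and a frozen reward model as stand-ins; both networks are assumptions for illustration, not the paper's models.

```python
# Toy reward-backpropagation loop: maximize a frozen reward model's score by
# updating the generator through the full differentiable sampling path.
import torch
import torch.nn as nn

torch.manual_seed(0)
generator = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 3 * 8 * 8))
reward_model = nn.Sequential(nn.Linear(3 * 8 * 8, 32), nn.ReLU(), nn.Linear(32, 1))
for p in reward_model.parameters():
    p.requires_grad_(False)          # the reward model stays fixed

opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
for step in range(100):
    noise = torch.randn(8, 16)       # seeds for a batch of samples
    images = generator(noise)        # differentiable sampling stand-in
    reward = reward_model(images).mean()
    loss = -reward                   # maximize reward = minimize its negative
    opt.zero_grad()
    loss.backward()                  # reward gradient flows into the generator
    opt.step()
print("mean reward after tuning:", float(reward))
```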