Generating images of rare concepts using pre-trained diffusion models
- URL: http://arxiv.org/abs/2304.14530v3
- Date: Wed, 27 Dec 2023 07:42:21 GMT
- Title: Generating images of rare concepts using pre-trained diffusion models
- Authors: Dvir Samuel, Rami Ben-Ari, Simon Raviv, Nir Darshan, Gal Chechik
- Abstract summary: Text-to-image diffusion models can synthesize high-quality images, but they have various limitations.
We show that these limitations are partly due to the long-tail nature of their training data.
We show that rare concepts can be correctly generated by carefully selecting suitable generation seeds in the noise space.
- Score: 32.5337654536764
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Text-to-image diffusion models can synthesize high-quality images, but they
have various limitations. Here we highlight a common failure mode of these
models, namely, failing to generate uncommon concepts and structured concepts
such as hand palms. We show that this limitation is partly due to the long-tail nature of
their training data: web-crawled data sets are strongly unbalanced, causing
models to under-represent concepts from the tail of the distribution. We
characterize the effect of unbalanced training data on text-to-image models and
offer a remedy. We show that rare concepts can be correctly generated by
carefully selecting suitable generation seeds in the noise space, using a small
reference set of images, a technique that we call SeedSelect. SeedSelect does
not require retraining or finetuning the diffusion model. We assess the
faithfulness, quality and diversity of SeedSelect in creating rare objects and
generating complex formations like hand images, and find it consistently
achieves superior performance. We further show the advantage of SeedSelect in
semantic data augmentation. Generating semantically appropriate images can
successfully improve performance in few-shot recognition benchmarks, for
classes from the head and from the tail of the training data of diffusion
models.
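As a concrete illustration of the seed-selection idea, the sketch below optimizes only the initial noise of a frozen generator so that the generated image moves toward a small reference set in a semantic feature space; the model weights are never updated. The tiny ConvNet generator and encoder are toy stand-ins introduced here for illustration, not the paper's implementation, which operates on a pre-trained text-to-image diffusion model and a pre-trained image feature extractor.

```python
# Minimal sketch (assumptions: toy generator/encoder in place of a frozen
# diffusion sampler and a pre-trained image encoder).
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in for a frozen, differentiable sampler: noise -> image.
generator = nn.Sequential(nn.Conv2d(4, 16, 3, padding=1), nn.ReLU(),
                          nn.Conv2d(16, 3, 3, padding=1), nn.Tanh())
# Stand-in for a frozen semantic feature extractor.
encoder = nn.Sequential(nn.Conv2d(3, 8, 3, stride=2, padding=1), nn.ReLU(),
                        nn.AdaptiveAvgPool2d(1), nn.Flatten())
for p in list(generator.parameters()) + list(encoder.parameters()):
    p.requires_grad_(False)  # the pre-trained networks are never updated

reference_images = torch.rand(4, 3, 32, 32) * 2 - 1   # small reference set
target_feat = encoder(reference_images).mean(dim=0)   # mean reference feature

# The only optimized quantity is the generation seed (initial noise).
seed = torch.randn(1, 4, 32, 32, requires_grad=True)
opt = torch.optim.Adam([seed], lr=0.05)

for step in range(200):
    opt.zero_grad()
    image = generator(seed)
    feat = encoder(image).squeeze(0)
    # Pull the generated image toward the reference set in feature space.
    loss = 1 - torch.cosine_similarity(feat, target_feat, dim=0)
    loss.backward()
    opt.step()

print("final feature distance:", float(loss))
```

Because only the seed is optimized, no retraining or finetuning of the model is needed, matching the property claimed in the abstract.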
Related papers
- Not Every Image is Worth a Thousand Words: Quantifying Originality in Stable Diffusion [21.252145402613472]
This work addresses the challenge of quantifying originality in text-to-image (T2I) generative diffusion models.
We propose a method that leverages textual inversion to measure the originality of an image based on the number of tokens required for its reconstruction by the model.
arXiv Detail & Related papers (2024-08-15T14:42:02Z)
- Training Class-Imbalanced Diffusion Model Via Overlap Optimization [55.96820607533968]
Diffusion models trained on real-world datasets often yield inferior fidelity for tail classes.
Deep generative models, including diffusion models, are biased towards classes with abundant training images.
We propose a method based on contrastive learning to minimize the overlap between distributions of synthetic images for different classes (a minimal sketch of this idea appears after this list).
arXiv Detail & Related papers (2024-02-16T16:47:21Z)
- Large-scale Reinforcement Learning for Diffusion Models [30.164571425479824]
Text-to-image diffusion models are susceptible to implicit biases that arise from web-scale text-image training pairs.
We present an effective, scalable algorithm to improve diffusion models using Reinforcement Learning (RL).
We show how our approach substantially outperforms existing methods for aligning diffusion models with human preferences.
arXiv Detail & Related papers (2024-01-20T08:10:43Z)
- Conditional Image Generation with Pretrained Generative Model [1.4685355149711303]
Diffusion models have gained popularity for their ability to generate higher-quality images than GAN models.
These models require a huge amount of data, computational resources, and meticulous tuning for successful training.
We propose methods to leverage pre-trained unconditional diffusion models with additional guidance for the purpose of conditional image generation.
arXiv Detail & Related papers (2023-12-20T18:27:53Z)
- Aligning Text-to-Image Diffusion Models with Reward Backpropagation [62.45086888512723]
We propose AlignProp, a method that aligns diffusion models to downstream reward functions using end-to-end backpropagation of the reward gradient (see the sketch after this list).
We show AlignProp achieves higher rewards in fewer training steps than alternatives, while being conceptually simpler.
arXiv Detail & Related papers (2023-10-05T17:59:18Z)
- Steered Diffusion: A Generalized Framework for Plug-and-Play Conditional Image Synthesis [62.07413805483241]
Steered Diffusion is a framework for zero-shot conditional image generation using a diffusion model trained for unconditional generation.
We present experiments using steered diffusion on several tasks including inpainting, colorization, text-guided semantic editing, and image super-resolution.
arXiv Detail & Related papers (2023-09-30T02:03:22Z)
- Conditional Generation from Unconditional Diffusion Models using Denoiser Representations [94.04631421741986]
We propose adapting pre-trained unconditional diffusion models to new conditions using the learned internal representations of the denoiser network.
We show that augmenting the Tiny ImageNet training set with synthetic images generated by our approach improves the classification accuracy of ResNet baselines by up to 8%.
arXiv Detail & Related papers (2023-06-02T20:09:57Z)
- Extracting Training Data from Diffusion Models [77.11719063152027]
We show that diffusion models memorize individual images from their training data and emit them at generation time.
With a generate-and-filter pipeline, we extract over a thousand training examples from state-of-the-art models.
We train hundreds of diffusion models in various settings to analyze how different modeling and data decisions affect privacy.
arXiv Detail & Related papers (2023-01-30T18:53:09Z)
- Meta Internal Learning [88.68276505511922]
Internal learning for single-image generation is a framework in which a generator is trained to produce novel images based on a single image.
We propose a meta-learning approach that enables training over a collection of images, in order to model the internal statistics of the sample image more effectively.
Our results show that the models obtained are as suitable as single-image GANs for many common image applications.
arXiv Detail & Related papers (2021-10-06T16:27:38Z)
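The overlap-optimization entry above describes a contrastive objective between classes of synthetic images; a rough, self-contained sketch of such a loss is given below. The random feature tensors stand in for embeddings of generated images and are assumptions for illustration, not the cited paper's code.

```python
# Sketch of a supervised-contrastive-style penalty that pulls same-class
# synthetic-image features together and pushes different-class features apart.
import torch
import torch.nn.functional as F

def class_overlap_loss(features, labels, temperature=0.1):
    feats = F.normalize(features, dim=1)
    sim = feats @ feats.t() / temperature                 # pairwise similarities
    eye = torch.eye(len(labels), dtype=torch.bool)
    logits = sim.masked_fill(eye, float("-inf"))          # ignore self-pairs
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~eye
    # Average log-probability of matching each sample to a same-class sample.
    per_sample = -(log_prob.masked_fill(~pos_mask, 0.0)).sum(1) / pos_mask.sum(1).clamp(min=1)
    return per_sample.mean()

features = torch.randn(16, 128)       # stand-in for embeddings of generated images
labels = torch.randint(0, 4, (16,))   # 4 classes; imbalanced in the real setting
print(class_overlap_loss(features, labels))
```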
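Similarly, the AlignProp entry describes backpropagating a reward gradient end to end through sampling. The toy sketch below shows that pattern with a small differentiable generator and a frozen reward model as stand-ins; both networks are assumptions for illustration, not the paper's models.

```python
# Toy reward-backpropagation loop: maximize a frozen reward model's score by
# updating the generator through the full differentiable sampling path.
import torch
import torch.nn as nn

torch.manual_seed(0)
generator = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 3 * 8 * 8))
reward_model = nn.Sequential(nn.Linear(3 * 8 * 8, 32), nn.ReLU(), nn.Linear(32, 1))
for p in reward_model.parameters():
    p.requires_grad_(False)          # the reward model stays fixed

opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
for step in range(100):
    noise = torch.randn(8, 16)       # seeds for a batch of samples
    images = generator(noise)        # differentiable sampling stand-in
    reward = reward_model(images).mean()
    loss = -reward                   # maximize reward = minimize its negative
    opt.zero_grad()
    loss.backward()                  # reward gradient flows into the generator
    opt.step()
print("mean reward after tuning:", float(reward))
```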