Diverse Diffusion: Enhancing Image Diversity in Text-to-Image Generation
- URL: http://arxiv.org/abs/2310.12583v1
- Date: Thu, 19 Oct 2023 08:48:23 GMT
- Title: Diverse Diffusion: Enhancing Image Diversity in Text-to-Image Generation
- Authors: Mariia Zameshina (LIGM), Olivier Teytaud (TAU), Laurent Najman (LIGM)
- Abstract summary: We introduce Diverse Diffusion, a method for boosting image diversity beyond gender and ethnicity.
Our approach contributes to the creation of more inclusive and representative AI-generated art.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Latent diffusion models excel at producing high-quality images from text.
Yet, concerns appear about the lack of diversity in the generated imagery. To
tackle this, we introduce Diverse Diffusion, a method for boosting image
diversity beyond gender and ethnicity, spanning into richer realms, including
color diversity. Diverse Diffusion is a general unsupervised technique that can
be applied to existing text-to-image models. Our approach focuses on finding
vectors in the Stable Diffusion latent space that are distant from each other.
We generate multiple vectors in the latent space until we find a set of vectors
that meets the desired distance requirements and the required batch size. To
evaluate the effectiveness of our diversity methods, we conduct experiments
examining various characteristics, including color diversity, LPIPS metric, and
ethnicity/gender representation in images featuring humans. The results of our
experiments emphasize the significance of diversity in generating realistic and
varied images, offering valuable insights for improving text-to-image models.
Through the enhancement of image diversity, our approach contributes to the
creation of more inclusive and representative AI-generated art.
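The abstract only sketches the selection procedure (repeatedly sample latents and keep a set whose pairwise distances are large enough). Below is a minimal rejection-sampling sketch of that idea; the function name, the Euclidean distance, and the default threshold are illustrative assumptions, not the paper's exact criterion.

```python
import torch

def sample_diverse_latents(batch_size, latent_shape=(4, 64, 64),
                           min_distance=180.0, max_tries=10_000,
                           generator=None):
    """Rejection-sample Stable Diffusion latents until `batch_size` of them
    are pairwise at least `min_distance` apart. Euclidean distance and the
    default threshold are illustrative assumptions; for standard-normal
    latents of this shape, pairwise distances concentrate near
    sqrt(2 * 4 * 64 * 64) ~= 181, so the threshold needs tuning."""
    accepted = []
    for _ in range(max_tries):
        z = torch.randn(latent_shape, generator=generator)
        # Keep the candidate only if it is far enough from every accepted latent.
        if all(torch.dist(z, prev).item() >= min_distance for prev in accepted):
            accepted.append(z)
        if len(accepted) == batch_size:
            # Shape (batch_size, 4, 64, 64) for a 512x512 Stable Diffusion model.
            return torch.stack(accepted)
    raise RuntimeError("No sufficiently diverse set found; "
                       "lower min_distance or raise max_tries.")
```

The resulting batch can then be fed to an existing pipeline (e.g. via the `latents=` argument of a diffusers `StableDiffusionPipeline` call), which is what makes the technique applicable to pre-trained text-to-image models without retraining.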
Related papers
- GRADE: Quantifying Sample Diversity in Text-to-Image Models [66.12068246962762]
We propose GRADE: Granular Attribute Diversity Evaluation, an automatic method for quantifying sample diversity.
We measure the overall diversity of 12 T2I models using 400 concept-attribute pairs, revealing that all models display limited variation.
Our work proposes a modern, semantically-driven approach to measure sample diversity and highlights the stunning homogeneity in outputs by T2I models.
arXiv Detail & Related papers (2024-10-29T23:10:28Z)
- Improving Geo-diversity of Generated Images with Contextualized Vendi Score Guidance [12.33170407159189]
State-of-the-art text-to-image generative models struggle to depict everyday objects with the true diversity of the real world.
We introduce an inference time intervention, contextualized Vendi Score Guidance (c-VSG), that guides the backwards steps of latent diffusion models to increase the diversity of a sample.
We find that c-VSG substantially increases the diversity of generated images, both for the worst performing regions and on average, while simultaneously maintaining or improving image quality and consistency.
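For reference, the plain Vendi Score that c-VSG contextualizes is the exponential of the Shannon entropy of the eigenvalues of a normalized sample-similarity matrix. A minimal sketch follows, assuming a cosine-similarity kernel over pre-extracted feature vectors; the contextualized variant and the guidance step themselves are not modeled here.

```python
import numpy as np

def vendi_score(features: np.ndarray) -> float:
    """Vendi Score of a sample set, one feature vector per row:
    exp(entropy of eigenvalues of K/n), where K is a similarity matrix
    with unit diagonal. Cosine similarity is an assumption here."""
    X = features / np.linalg.norm(features, axis=1, keepdims=True)
    K = X @ X.T                               # cosine-similarity Gram matrix
    eigvals = np.linalg.eigvalsh(K / len(X))  # eigenvalues sum to 1
    eigvals = eigvals[eigvals > 1e-12]        # drop numerical zeros
    entropy = -np.sum(eigvals * np.log(eigvals))
    return float(np.exp(entropy))             # 1 (identical) up to n (orthogonal)
```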
arXiv Detail & Related papers (2024-06-06T23:35:51Z)
- Diff-Mosaic: Augmenting Realistic Representations in Infrared Small Target Detection via Diffusion Prior [63.64088590653005]
We propose Diff-Mosaic, a data augmentation method based on the diffusion model.
We introduce an enhancement network called Pixel-Prior, which generates highly coordinated and realistic Mosaic images.
In the second stage, we propose an image enhancement strategy named Diff-Prior. This strategy utilizes diffusion priors to model images in the real-world scene.
arXiv Detail & Related papers (2024-06-02T06:23:05Z)
- Kaleido Diffusion: Improving Conditional Diffusion Models with Autoregressive Latent Modeling [49.41822427811098]
We present Kaleido, a novel approach that enhances the diversity of samples by incorporating autoregressive latent priors.
Kaleido integrates an autoregressive language model that encodes the original caption and generates latent variables.
We show that Kaleido adheres closely to the guidance provided by the generated latent variables, demonstrating its capability to effectively control and direct the image generation process.
arXiv Detail & Related papers (2024-05-31T17:41:11Z)
- Real-World Image Variation by Aligning Diffusion Inversion Chain [53.772004619296794]
A domain gap exists between generated images and real-world images, which poses a challenge in generating high-quality variations of real-world images.
We propose a novel inference pipeline called Real-world Image Variation by ALignment (RIVAL)
Our pipeline enhances the generation quality of image variations by aligning the image generation process to the source image's inversion chain.
arXiv Detail & Related papers (2023-05-30T04:09:47Z)
- Effective Data Augmentation With Diffusion Models [65.09758931804478]
We address the lack of diversity in data augmentation with image-to-image transformations parameterized by pre-trained text-to-image diffusion models.
Our method edits images to change their semantics using an off-the-shelf diffusion model, and generalizes to novel visual concepts from a few labelled examples.
We evaluate our approach on few-shot image classification tasks, and on a real-world weed recognition task, and observe an improvement in accuracy in tested domains.
arXiv Detail & Related papers (2023-02-07T20:42:28Z)
- Few-shot Image Generation via Masked Discrimination [20.998032566820907]
Few-shot image generation aims to generate images of high quality and great diversity with limited data.
It is difficult for modern GANs to avoid overfitting when trained on only a few images.
This work presents a novel approach to realize few-shot GAN adaptation via masked discrimination.
arXiv Detail & Related papers (2022-10-27T06:02:22Z)
- Random Network Distillation as a Diversity Metric for Both Image and Text Generation [62.13444904851029]
We develop a new diversity metric that can be applied to data, both synthetic and natural, of any type.
We validate and deploy this metric on both images and text.
arXiv Detail & Related papers (2020-10-13T22:03:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.