Good Seed Makes a Good Crop: Discovering Secret Seeds in Text-to-Image Diffusion Models
- URL: http://arxiv.org/abs/2405.14828v1
- Date: Thu, 23 May 2024 17:46:23 GMT
- Title: Good Seed Makes a Good Crop: Discovering Secret Seeds in Text-to-Image Diffusion Models
- Authors: Katherine Xu, Lingzhi Zhang, Jianbo Shi
- Abstract summary: We conduct a large-scale scientific study into the impact of random seeds during diffusion inference.
We find that the best 'golden' seed achieved an impressive FID of 21.60, compared to the worst 'inferior' seed's FID of 31.97.
A classifier can predict the seed number used to generate an image with over 99.9% accuracy in just a few epochs.
- Score: 13.4617544015866
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in text-to-image (T2I) diffusion models have facilitated creative and photorealistic image synthesis. By varying the random seeds, we can generate various images for a fixed text prompt. Technically, the seed controls the initial noise and, in multi-step diffusion inference, the noise used for reparameterization at intermediate timesteps in the reverse diffusion process. However, the specific impact of the random seed on the generated images remains relatively unexplored. In this work, we conduct a large-scale scientific study into the impact of random seeds during diffusion inference. Remarkably, we reveal that the best 'golden' seed achieved an impressive FID of 21.60, compared to the worst 'inferior' seed's FID of 31.97. Additionally, a classifier can predict the seed number used to generate an image with over 99.9% accuracy in just a few epochs, establishing that seeds are highly distinguishable based on generated images. Encouraged by these findings, we examined the influence of seeds on interpretable visual dimensions. We find that certain seeds consistently produce grayscale images, prominent sky regions, or image borders. Seeds also affect image composition, including object location, size, and depth. Moreover, by leveraging these 'golden' seeds, we demonstrate improved image generation such as high-fidelity inference and diversified sampling. Our investigation extends to inpainting tasks, where we uncover some seeds that tend to insert unwanted text artifacts. Overall, our extensive analyses highlight the importance of selecting good seeds and offer practical utility for image generation.
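To make the seed's role concrete, the snippet below is a minimal sketch of seeded text-to-image inference using the Hugging Face diffusers library; the model ID, prompt, and step count are illustrative assumptions, not details taken from the paper. Fixing the seed through a torch.Generator pins down the initial latent noise (and any noise injected at intermediate timesteps), so the same prompt and seed reproduce the same image, which is what makes per-seed comparisons such as the golden-versus-inferior FID gap well defined.

```python
import torch
from diffusers import StableDiffusionPipeline

# Illustrative model choice; the paper's exact pipeline may differ.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

prompt = "a photo of a red fox in a snowy forest"  # placeholder prompt

# The generator's seed determines the initial noise, so each seed
# yields a distinct but fully reproducible image for the same prompt.
for seed in [0, 42, 1234]:
    generator = torch.Generator(device="cuda").manual_seed(seed)
    image = pipe(prompt, generator=generator, num_inference_steps=50).images[0]
    image.save(f"sample_seed_{seed}.png")
```

The seed-classification result can be read as a standard supervised learning setup: treat each seed as a class label and train an off-the-shelf image classifier on images generated from those seeds. The sketch below assumes a ResNet-18 backbone and 1,024 seeds purely for illustration; the paper does not prescribe this architecture or seed count.

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_SEEDS = 1024  # assumed number of distinct seeds under study

# Off-the-shelf backbone with the head resized to one class per seed.
model = models.resnet18(weights="IMAGENET1K_V1")
model.fc = nn.Linear(model.fc.in_features, NUM_SEEDS)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

def train_epoch(loader):
    # loader yields (image batch, seed-index labels)
    model.train()
    for images, seed_labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), seed_labels)
        loss.backward()
        optimizer.step()
```

That seed numbers are recoverable at over 99.9% accuracy after only a few epochs indicates that each seed leaves a consistent, learnable signature in the generated images.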
Related papers
- Words Worth a Thousand Pictures: Measuring and Understanding Perceptual Variability in Text-to-Image Generation [58.77994391566484]
Diffusion models are the state of the art in text-to-image generation, but their perceptual variability remains understudied.
We propose W1KP, a human-calibrated measure of variability in a set of images, bootstrapped from existing image-pair perceptual distances.
Our best perceptual distance outperforms nine baselines by up to 18 points in accuracy, and our calibration matches human judgements 78% of the time.
arXiv Detail & Related papers (2024-06-12T17:59:27Z)
- The Crystal Ball Hypothesis in diffusion models: Anticipating object positions from initial noise [92.53724347718173]
Diffusion models have achieved remarkable success in text-to-image generation tasks.
We identify specific regions within the initial noise image, termed trigger patches, that play a key role in object generation in the resulting images.
arXiv Detail & Related papers (2024-06-04T05:06:00Z)
- Be Yourself: Bounded Attention for Multi-Subject Text-to-Image Generation [60.943159830780154]
We introduce Bounded Attention, a training-free method for bounding the information flow in the sampling process.
We demonstrate that our method empowers the generation of multiple subjects that better align with given prompts and layouts.
arXiv Detail & Related papers (2024-03-25T17:52:07Z)
- SatSynth: Augmenting Image-Mask Pairs through Diffusion Models for Aerial Semantic Segmentation [69.42764583465508]
We explore the potential of generative image diffusion to address the scarcity of annotated data in earth observation tasks.
To the best of our knowledge, we are the first to generate both images and corresponding masks for satellite segmentation.
arXiv Detail & Related papers (2024-03-25T10:30:22Z)
- Diffusion Facial Forgery Detection [56.69763252655695]
This paper introduces DiFF, a comprehensive dataset dedicated to face-focused diffusion-generated images.
We conduct extensive experiments on the DiFF dataset via a human test and several representative forgery detection methods.
The results demonstrate that the binary detection accuracy of both human observers and automated detectors often falls below 30%.
arXiv Detail & Related papers (2024-01-29T03:20:19Z)
- TIAM -- A Metric for Evaluating Alignment in Text-to-Image Generation [2.6890293832784566]
We propose a new metric based on prompt templates to study the alignment between the content specified in the prompt and the corresponding generated images.
Our approach additionally reveals that image quality can vary drastically depending on the noise used to seed image generation.
arXiv Detail & Related papers (2023-07-11T09:23:05Z)
- 4Weed Dataset: Annotated Imagery Weeds Dataset [1.5484595752241122]
The dataset consists of 159 Cocklebur images, 139 Foxtail images, 170 Redroot Pigweed images and 150 Giant Ragweed images.
Bounding box annotations were created for each image to prepare the dataset for training both image classification and object detection deep learning networks.
arXiv Detail & Related papers (2022-03-29T03:10:54Z)
- Seed Classification using Synthetic Image Datasets Generated from Low-Altitude UAV Imagery [0.0]
Plant breeding programs extensively monitor the evolution of seed kernels for seed certification.
The monitoring of seed kernels can be challenging due to the minuscule size of seed kernels.
The article proposes a proof-of-concept seed classification framework using three convolutional neural networks: Microsoft's ResNet-100 and Oxford's VGG-16 and VGG-19.
arXiv Detail & Related papers (2021-10-06T15:18:17Z)
- An effective and friendly tool for seed image analysis [0.0]
This work presents software that performs image analysis, through feature extraction and classification, on images containing seeds.
We propose two ImageJ plugins: one that extracts morphological, textural, and colour characteristics from seed images, and another that classifies the seeds into categories using the extracted features.
The experimental results demonstrate the correctness and validity of both the extracted features and the classification predictions.
arXiv Detail & Related papers (2021-03-31T16:56:22Z)
- Seed Phenotyping on Neural Networks using Domain Randomization and Transfer Learning [0.0]
Seed phenotyping is the idea of analyzing the morphometric characteristics of a seed to predict its behavior in terms of development, tolerance and yield.
The work focuses on the application and feasibility analysis of state-of-the-art object detection and localization networks.
arXiv Detail & Related papers (2020-12-24T14:04:28Z)
- Random Network Distillation as a Diversity Metric for Both Image and Text Generation [62.13444904851029]
We develop a new diversity metric that can be applied to data, both synthetic and natural, of any type.
We validate and deploy this metric on both images and text.
arXiv Detail & Related papers (2020-10-13T22:03:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.