Preserving Image Properties Through Initializations in Diffusion Models
- URL: http://arxiv.org/abs/2401.02097v1
- Date: Thu, 4 Jan 2024 06:55:49 GMT
- Title: Preserving Image Properties Through Initializations in Diffusion Models
- Authors: Jeffrey Zhang, Shao-Yu Chang, Kedan Li, David Forsyth
- Abstract summary: We show that Stable Diffusion methods, as currently applied, do not respect requirements of retail photography.
The usual practice of training the denoiser with a very noisy image leads to inconsistent generated images during inference.
A network trained with centered retail product images with uniform backgrounds generates images with erratic backgrounds.
Our procedure can interact well with other control-based methods to further enhance the controllability of diffusion-based methods.
- Score: 6.804700416902898
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Retail photography imposes specific requirements on images. For instance,
images may need uniform background colors, consistent model poses, centered
products, and consistent lighting. Minor deviations from these standards impact
a site's aesthetic appeal, making the images unsuitable for use. We show that
Stable Diffusion methods, as currently applied, do not respect these
requirements. The usual practice of training the denoiser with a very noisy
image and starting inference with a sample of pure noise leads to inconsistent
generated images during inference. This inconsistency occurs because it is easy
to tell the difference between samples of the training and inference
distributions. As a result, a network trained with centered retail product
images with uniform backgrounds generates images with erratic backgrounds. The
problem is easily fixed by initializing inference with samples from an
approximation of noisy images. However, in using such an approximation, the
joint distribution of text and noisy image at inference time still slightly
differs from that at training time. This discrepancy is corrected by training
the network with samples from the approximate noisy image distribution.
Extensive experiments on real application data show significant qualitative and
quantitative improvements in performance from adopting these procedures.
Finally, our procedure can interact well with other control-based methods to
further enhance the controllability of diffusion-based methods.
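The abstract's fix (start inference from a forward-diffused approximation of a real image rather than pure noise) can be sketched with the standard DDPM forward process. This is a minimal illustration, not the paper's implementation: the function name `noisy_image_init`, the linear beta schedule, and the use of a dataset-mean image as the "approximation" are all assumptions for the example.

```python
import numpy as np

def noisy_image_init(x0, t, betas, rng=None):
    """Forward-diffuse a reference image x0 to timestep t (standard
    DDPM forward process), yielding an approximate noisy image to
    use as the inference starting point instead of pure noise."""
    rng = rng or np.random.default_rng(0)
    alpha_bar = np.cumprod(1.0 - betas)[t - 1]   # cumulative signal retention
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps

# Usage: a stand-in "approximation" (e.g. a mean training image);
# here a zero image, purely for illustration.
betas = np.linspace(1e-4, 0.02, 1000)            # linear schedule (assumed)
x0_mean = np.zeros((3, 64, 64))
x_T = noisy_image_init(x0_mean, 1000, betas)
```

At large t the sample is dominated by noise, but for a structured reference image (uniform background, centered product) the residual signal term `sqrt(alpha_bar) * x0` is what nudges inference toward the training distribution.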
Related papers
- Fast constrained sampling in pre-trained diffusion models [77.21486516041391]
Diffusion models have dominated the field of large, generative image models.
We propose an algorithm for fast-constrained sampling in large pre-trained diffusion models.
arXiv Detail & Related papers (2024-10-24T14:52:38Z)
- Gradpaint: Gradient-Guided Inpainting with Diffusion Models [71.47496445507862]
Denoising Diffusion Probabilistic Models (DDPMs) have recently achieved remarkable results in conditional and unconditional image generation.
We present GradPaint, which steers the generation towards a globally coherent image.
GradPaint generalizes well to diffusion models trained on various datasets, improving upon current state-of-the-art supervised and unsupervised methods.
arXiv Detail & Related papers (2023-09-18T09:36:24Z)
- Masked Images Are Counterfactual Samples for Robust Fine-tuning [77.82348472169335]
Fine-tuning deep learning models can lead to a trade-off between in-distribution (ID) performance and out-of-distribution (OOD) robustness.
We propose a novel fine-tuning method, which uses masked images as counterfactual samples that help improve the robustness of the fine-tuning model.
arXiv Detail & Related papers (2023-03-06T11:51:28Z)
- Representing Noisy Image Without Denoising [91.73819173191076]
Fractional-order Moments in Radon space (FMR) is designed to derive robust representation directly from noisy images.
Unlike earlier integer-order methods, our work is a more generic design taking such classical methods as special cases.
arXiv Detail & Related papers (2023-01-18T10:13:29Z)
- Markup-to-Image Diffusion Models with Scheduled Sampling [111.30188533324954]
Building on recent advances in image generation, we present a data-driven approach to rendering markup into images.
The approach is based on diffusion models, which parameterize the distribution of data using a sequence of denoising operations.
We conduct experiments on four markup datasets: mathematical formulas (LaTeX), table layouts (HTML), sheet music (LilyPond), and molecular images (SMILES).
arXiv Detail & Related papers (2022-10-11T04:56:12Z)
- Low-Light Image Enhancement with Normalizing Flow [92.52290821418778]
In this paper, we investigate to model this one-to-many relationship via a proposed normalizing flow model.
An invertible network takes the low-light images/features as the condition and learns to map the distribution of normally exposed images into a Gaussian distribution.
The experimental results on the existing benchmark datasets show our method achieves better quantitative and qualitative results, obtaining better-exposed illumination, less noise and artifact, and richer colors.
arXiv Detail & Related papers (2021-09-13T12:45:08Z)
- A low-rank representation for unsupervised registration of medical images [10.499611180329804]
We propose a novel approach based on a low-rank representation, i.e., Regnet-LRR, to tackle the problem of noisy data registration scenarios.
We show that the low-rank representation can boost the ability and robustness of models as well as bring significant improvements in noisy data registration scenarios.
arXiv Detail & Related papers (2021-05-20T07:04:10Z)
- Transform consistency for learning with noisy labels [9.029861710944704]
We propose a method to identify clean samples only using one single network.
Clean samples tend to reach consistent predictions for the original images and the transformed images.
In order to mitigate the negative influence of noisy labels, we design a classification loss by using the off-line hard labels and on-line soft labels.
arXiv Detail & Related papers (2021-03-25T14:33:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.