Diffusion Brush: A Latent Diffusion Model-based Editing Tool for
AI-generated Images
- URL: http://arxiv.org/abs/2306.00219v2
- Date: Fri, 27 Oct 2023 01:51:47 GMT
- Title: Diffusion Brush: A Latent Diffusion Model-based Editing Tool for
AI-generated Images
- Authors: Peyman Gholami and Robert Xiao
- Abstract summary: Text-to-image generative models have made remarkable advancements in generating high-quality images.
Existing techniques to fine-tune generated images are time-consuming (manual editing), produce poorly-integrated results (inpainting), or result in unexpected changes across the entire image.
We present Diffusion Brush, a Latent Diffusion Model (LDM)-based tool to efficiently fine-tune desired regions within an AI-synthesized image.
- Score: 10.323260768204461
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Text-to-image generative models have made remarkable advancements in
generating high-quality images. However, generated images often contain
undesirable artifacts or other errors due to model limitations. Existing
techniques to fine-tune generated images are time-consuming (manual editing),
produce poorly-integrated results (inpainting), or result in unexpected changes
across the entire image (variation selection and prompt fine-tuning). In this
work, we present Diffusion Brush, a Latent Diffusion Model (LDM)-based tool to
efficiently fine-tune desired regions within an AI-synthesized image. Our
method introduces new random noise patterns at targeted regions during the
reverse diffusion process, enabling the model to efficiently make changes to
the specified regions while preserving the original context for the rest of the
image. We evaluate our method's usability and effectiveness through a user
study with artists, comparing our technique against other state-of-the-art
image inpainting techniques and editing software for fine-tuning AI-generated
imagery.
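The core idea from the abstract can be illustrated with a minimal sketch: during the reverse diffusion process, a fresh random noise pattern is injected only inside the user-brushed region at a chosen timestep, while latents outside the region stay pinned to the original image. The sketch below assumes a generic LDM sampler; the helpers `denoise_step` and `add_noise`, the `renoise_at` parameter, and the inpainting-style blending outside the mask are illustrative assumptions, not the authors' exact implementation.

```python
import torch

def masked_renoise_sampling(x_T, x_orig, mask, denoise_step, add_noise,
                            timesteps, renoise_at=None):
    """Region-targeted editing during reverse diffusion (illustrative sketch).

    Assumed (hypothetical) interfaces:
      denoise_step(x_t, t) -> x_{t-1}   one reverse step of an LDM sampler
      add_noise(x_0, t)    -> x_t       forward-noise clean latents to step t
      x_orig : clean latents of the original AI-generated image
      mask   : 1.0 inside the user-brushed region, 0.0 elsewhere
      renoise_at : timestep at which fresh noise is injected into the region
    """
    x_t = x_T
    for t in timesteps:  # e.g. [T-1, T-2, ..., 0]
        if renoise_at is not None and t == renoise_at:
            # Inject a new random noise pattern only inside the brushed
            # region, so the model re-synthesizes that area differently.
            fresh = torch.randn_like(x_t)
            x_t = mask * fresh + (1.0 - mask) * x_t

        x_t = denoise_step(x_t, t)  # one reverse (denoising) step

        # Outside the mask, keep the trajectory consistent with the original
        # image by substituting the original latents noised to step t-1
        # (standard latent-inpainting blending; an assumption here, not
        # necessarily the authors' exact procedure).
        keep = x_orig if t == 0 else add_noise(x_orig, t - 1)
        x_t = mask * x_t + (1.0 - mask) * keep
    return x_t
```

In practice, `denoise_step` would wrap the LDM's noise prediction plus the scheduler update, and the choice of `renoise_at` would control how strongly the brushed region departs from the original content, with earlier (larger) timesteps permitting larger changes.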
Related papers
- Stable Flow: Vital Layers for Training-Free Image Editing [74.52248787189302]
Diffusion models have revolutionized the field of content synthesis and editing.
Recent models have replaced the traditional UNet architecture with the Diffusion Transformer (DiT).
We propose an automatic method to identify "vital layers" within DiT, crucial for image formation.
Next, to enable real-image editing, we introduce an improved image inversion method for flow models.
arXiv Detail & Related papers (2024-11-21T18:59:51Z)
- Time Step Generating: A Universal Synthesized Deepfake Image Detector [0.4488895231267077]
We propose Time Step Generating (TSG), a universal synthetic image detector.
TSG does not rely on pre-trained models' reconstructing ability, specific datasets, or sampling algorithms.
We test the proposed TSG on the large-scale GenImage benchmark and it achieves significant improvements in both accuracy and generalizability.
arXiv Detail & Related papers (2024-11-17T09:39:50Z)
- DiffUHaul: A Training-Free Method for Object Dragging in Images [78.93531472479202]
We propose a training-free method, dubbed DiffUHaul, for the object dragging task.
We first apply attention masking in each denoising step to make the generation more disentangled across different objects.
In the early denoising steps, we interpolate the attention features between source and target images to smoothly fuse new layouts with the original appearance.
arXiv Detail & Related papers (2024-06-03T17:59:53Z)
- Active Generation for Image Classification [45.93535669217115]
We propose to address the efficiency of image generation by focusing on the specific needs and characteristics of the model.
With a central tenet of active learning, our method, named ActGen, takes a training-aware approach to image generation.
arXiv Detail & Related papers (2024-03-11T08:45:31Z)
- Image Inpainting via Tractable Steering of Diffusion Models [54.13818673257381]
This paper proposes to exploit the ability of Tractable Probabilistic Models (TPMs) to exactly and efficiently compute the constrained posterior.
Specifically, this paper adopts a class of expressive TPMs termed Probabilistic Circuits (PCs).
We show that our approach can consistently improve the overall quality and semantic coherence of inpainted images with only 10% additional computational overhead.
arXiv Detail & Related papers (2023-11-28T21:14:02Z)
- Steered Diffusion: A Generalized Framework for Plug-and-Play Conditional Image Synthesis [62.07413805483241]
Steered Diffusion is a framework for zero-shot conditional image generation using a diffusion model trained for unconditional generation.
We present experiments using steered diffusion on several tasks including inpainting, colorization, text-guided semantic editing, and image super-resolution.
arXiv Detail & Related papers (2023-09-30T02:03:22Z)
- GradPaint: Gradient-Guided Inpainting with Diffusion Models [71.47496445507862]
Denoising Diffusion Probabilistic Models (DDPMs) have recently achieved remarkable results in conditional and unconditional image generation.
We present GradPaint, which steers the generation towards a globally coherent image.
GradPaint generalizes well to diffusion models trained on various datasets, improving upon current state-of-the-art supervised and unsupervised methods.
arXiv Detail & Related papers (2023-09-18T09:36:24Z)
- Uncovering the Disentanglement Capability in Text-to-Image Diffusion Models [60.63556257324894]
A key desired property of image generative models is the ability to disentangle different attributes.
We propose a simple, light-weight image editing algorithm where the mixing weights of the two text embeddings are optimized for style matching and content preservation.
Experiments show that the proposed method can modify a wide range of attributes, with the performance outperforming diffusion-model-based image-editing algorithms.
arXiv Detail & Related papers (2022-12-16T19:58:52Z)
- DiffusionCLIP: Text-guided Image Manipulation Using Diffusion Models [33.79188588182528]
We present a novel DiffusionCLIP which performs text-driven image manipulation with diffusion models using Contrastive Language-Image Pre-training (CLIP) loss.
Our method achieves performance comparable to modern GAN-based image processing methods on both in-domain and out-of-domain image processing tasks.
Our method can be easily used for various novel applications, enabling image translation from an unseen domain to another unseen domain or stroke-conditioned image generation in an unseen domain.
arXiv Detail & Related papers (2021-10-06T12:59:39Z)
- Automatic Correction of Internal Units in Generative Neural Networks [15.67941936262584]
Generative Adversarial Networks (GANs) have shown satisfactory performance in synthetic image generation.
However, a number of generated images exhibit defective visual patterns, known as artifacts.
In this work, we devise a method that automatically identifies the internal units generating various types of artifact images.
arXiv Detail & Related papers (2021-04-13T11:46:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.