Differential Diffusion: Giving Each Pixel Its Strength
- URL: http://arxiv.org/abs/2306.00950v2
- Date: Wed, 28 Feb 2024 21:10:08 GMT
- Title: Differential Diffusion: Giving Each Pixel Its Strength
- Authors: Eran Levin, Ohad Fried
- Abstract summary: This paper introduces a novel framework that enables customization of the amount of change per pixel or per image region.
Our framework can be integrated into any existing diffusion model, enhancing it with this capability.
We demonstrate our method with the current open state-of-the-art models, and validate it via both quantitative and qualitative comparisons.
- Score: 10.36919027402249
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diffusion models have revolutionized image generation and editing, producing state-of-the-art results in conditioned and unconditioned image synthesis. While current techniques enable user control over the degree of change in an image edit, the controllability is limited to global changes over an entire edited region. This paper introduces a novel framework that enables customization of the amount of change per pixel or per image region. Our framework can be integrated into any existing diffusion model, enhancing it with this capability. Such granular control over the quantity of change opens up a diverse array of new editing capabilities, such as control of the extent to which individual objects are modified, or the ability to introduce gradual spatial changes. Furthermore, we showcase the framework's effectiveness in soft-inpainting -- the completion of portions of an image while subtly adjusting the surrounding areas to ensure seamless integration. Additionally, we introduce a new tool for exploring the effects of different change quantities. Our framework operates solely during inference, requiring no model training or fine-tuning. We demonstrate our method with the current open state-of-the-art models, and validate it via both quantitative and qualitative comparisons, and a user study. Our code is available at: https://github.com/exx8/differential-diffusion
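The abstract's core mechanism -- per-pixel edit strength applied purely at inference -- is compact enough to sketch. Below is a minimal, illustrative sketch of one way to realize it, assuming a diffusers-style `unet`/`scheduler` pair and a user-supplied `change_map` in [0, 1] (1.0 = regenerate freely, 0.0 = keep untouched). The names and the exact gating rule are assumptions for illustration, not the authors' implementation, which lives in the linked repository.

```python
import torch

@torch.no_grad()
def differential_sample(unet, scheduler, x0_latent, change_map, num_steps=50):
    """Illustrative sketch: gate each pixel's edit strength by when it is
    'released' from the original image during denoising.

    change_map: tensor in [0, 1], broadcastable to x0_latent's shape;
                1.0 = regenerate from scratch, 0.0 = keep unchanged.
    unet/scheduler: stand-ins for any diffusers-style model and scheduler.
    """
    scheduler.set_timesteps(num_steps)
    latents = torch.randn_like(x0_latent)  # start from pure noise
    for i, t in enumerate(scheduler.timesteps):
        # Fraction of the schedule still ahead: 1.0 at the first step,
        # approaching 0.0 at the last.
        remaining = 1.0 - i / len(scheduler.timesteps)
        # Pixels whose requested strength is below the remaining fraction are
        # clamped back to a re-noised copy of the original, so low-strength
        # pixels join the generation late and therefore change less overall.
        noised_x0 = scheduler.add_noise(x0_latent, torch.randn_like(x0_latent), t)
        keep = (change_map < remaining).float()
        latents = keep * noised_x0 + (1.0 - keep) * latents
        noise_pred = unet(latents, t).sample
        latents = scheduler.step(noise_pred, t, latents).prev_sample
    return latents
```

Under this reading, a binary map reduces to ordinary inpainting, while a smooth gradient map yields the gradual spatial changes and soft-inpainting behavior the abstract describes.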
Related papers
- PIXELS: Progressive Image Xemplar-based Editing with Latent Surgery [10.594261300488546]
We introduce a novel framework for progressive exemplar-driven editing with off-the-shelf diffusion models, dubbed PIXELS.
PIXELS provides granular control over edits, allowing adjustments at the pixel or region level.
We demonstrate that PIXELS delivers high-quality edits efficiently, leading to a notable improvement in quantitative metrics as well as human evaluation.
arXiv Detail & Related papers (2025-01-16T20:26:30Z)
- Stable Flow: Vital Layers for Training-Free Image Editing [74.52248787189302]
Diffusion models have revolutionized the field of content synthesis and editing.
Recent models have replaced the traditional UNet architecture with the Diffusion Transformer (DiT).
We propose an automatic method to identify "vital layers" within DiT that are crucial for image formation.
Next, to enable real-image editing, we introduce an improved image inversion method for flow models.
arXiv Detail & Related papers (2024-11-21T18:59:51Z)
- VASE: Object-Centric Appearance and Shape Manipulation of Real Videos [108.60416277357712]
In this work, we introduce an object-centric framework designed both to control the object's appearance and, notably, to execute precise and explicit structural modifications on the object.
We build our framework on a pre-trained image-conditioned diffusion model, integrate layers to handle the temporal dimension, and propose training strategies and architectural modifications to enable shape control.
We evaluate our method on the image-driven video editing task showing similar performance to the state-of-the-art, and showcasing novel shape-editing capabilities.
arXiv Detail & Related papers (2024-01-04T18:59:24Z)
- Iterative Multi-granular Image Editing using Diffusion Models [20.21694969555533]
We propose EMILIE: Iterative Multi-granular Image Editor.
We introduce a new benchmark dataset to evaluate our newly proposed setting.
arXiv Detail & Related papers (2023-09-01T17:59:29Z)
- Uncovering the Disentanglement Capability in Text-to-Image Diffusion Models [60.63556257324894]
A key desired property of image generative models is the ability to disentangle different attributes.
We propose a simple, lightweight image editing algorithm in which the mixing weights of the two text embeddings are optimized for style matching and content preservation.
Experiments show that the proposed method can modify a wide range of attributes, outperforming other diffusion-model-based image-editing algorithms.
arXiv Detail & Related papers (2022-12-16T19:58:52Z)
- End-to-End Visual Editing with a Generatively Pre-Trained Artist [78.5922562526874]
We consider the targeted image editing problem: blending a region in a source image with a driver image that specifies the desired change.
We propose a self-supervised approach that simulates edits by augmenting off-the-shelf images in a target domain.
We show that different blending effects can be learned by an intuitive control of the augmentation process, with no other changes required to the model architecture.
arXiv Detail & Related papers (2022-05-03T17:59:30Z)
- Enjoy Your Editing: Controllable GANs for Image Editing via Latent Space Navigation [136.53288628437355]
Controllable semantic image editing enables a user to change entire image attributes with a few clicks.
Current approaches often suffer from attribute edits that are entangled, global image identity changes, and diminished photo-realism.
We propose quantitative evaluation strategies for measuring controllable editing performance, unlike prior work which primarily focuses on qualitative evaluation.
arXiv Detail & Related papers (2021-02-01T21:38:36Z)
- Look here! A parametric learning based approach to redirect visual attention [49.609412873346386]
We introduce an automatic method to make an image region more attention-capturing via subtle image edits.
Our model predicts a distinct set of global parametric transformations to be applied to the foreground and background image regions.
Our edits enable inference at interactive rates on any image size, and easily generalize to videos.
arXiv Detail & Related papers (2020-08-12T16:08:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of this information and is not responsible for any consequences of its use.