Cross-domain Compositing with Pretrained Diffusion Models
- URL: http://arxiv.org/abs/2302.10167v2
- Date: Thu, 25 May 2023 06:30:04 GMT
- Title: Cross-domain Compositing with Pretrained Diffusion Models
- Authors: Roy Hachnochi, Mingrui Zhao, Nadav Orzech, Rinon Gal, Ali
Mahdavi-Amiri, Daniel Cohen-Or, Amit Haim Bermano
- Abstract summary: We employ a localized, iterative refinement scheme which infuses the injected objects with contextual information derived from the background scene.
Our method produces higher-quality and more realistic results without requiring any annotations or training.
- Score: 34.98199766006208
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Diffusion models have enabled high-quality, conditional image editing
capabilities. We propose to expand their arsenal, and demonstrate that
off-the-shelf diffusion models can be used for a wide range of cross-domain
compositing tasks. Among numerous others, these include image blending, object
immersion, texture-replacement and even CG2Real translation or stylization. We
employ a localized, iterative refinement scheme which infuses the injected
objects with contextual information derived from the background scene, and
enables control over the degree and types of changes the object may undergo. We
conduct a range of qualitative and quantitative comparisons to prior work, and
show that our method produces higher-quality, more realistic results without
requiring any annotations or training. Finally, we demonstrate how our method
may be used for data augmentation in downstream tasks.
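The localized, iterative refinement scheme described in the abstract can be pictured as masked blending inside the reverse-diffusion loop: the background region is repeatedly reset to a re-noised copy of the original scene, while the pasted object region is refined by the pretrained denoiser. The sketch below is illustrative only and is not the authors' implementation; `denoise_step`, `add_noise`, and the linear schedule are hypothetical stand-ins for a pretrained model's reverse step and noising process, and `t_start` plays the role of a strength parameter controlling how much the object may change.

```python
# Minimal sketch (not the paper's code) of masked iterative refinement
# for compositing a pasted object into a background scene.
import torch

def denoise_step(x_t, t):
    # Placeholder for one reverse-diffusion step x_t -> x_{t-1} of a
    # pretrained model (in practice a UNet + scheduler would go here).
    return x_t * 0.98  # dummy update so the sketch runs end to end

def add_noise(x0, t, T, noise):
    # Toy linear noising schedule, for illustration only.
    alpha = 1.0 - t / T
    return alpha * x0 + (1.0 - alpha) * noise

def composite(background, pasted, mask, T=50, t_start=25):
    """Blend a pasted object into a background.

    mask == 1 marks the object region that is allowed to change;
    a larger t_start lets the object deviate more from its input.
    """
    x0 = background * (1 - mask) + pasted * mask        # naive cut-and-paste init
    noise = torch.randn_like(x0)
    x_t = add_noise(x0, t_start, T, noise)              # start partway through the chain
    for t in range(t_start, 0, -1):
        x_t = denoise_step(x_t, t)                      # model refines the whole image
        bg_t = add_noise(background, t - 1, T, noise)   # re-noise the clean background
        x_t = bg_t * (1 - mask) + x_t * mask            # keep background pixels fixed
    return x_t

# Toy usage with random tensors standing in for images or latents.
bg, obj = torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64)
m = torch.zeros(1, 1, 64, 64); m[..., 16:48, 16:48] = 1.0
out = composite(bg, obj, m)
```

In a real setting, `denoise_step` would wrap a pretrained latent-diffusion denoiser and its scheduler, and the mask together with `t_start` would provide the kind of control over the degree and type of change that the abstract describes.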
Related papers
- Stable Flow: Vital Layers for Training-Free Image Editing [74.52248787189302]
Diffusion models have revolutionized the field of content synthesis and editing.
Recent models have replaced the traditional UNet architecture with the Diffusion Transformer (DiT).
We propose an automatic method to identify "vital layers" within DiT, crucial for image formation.
Next, to enable real-image editing, we introduce an improved image inversion method for flow models.
arXiv Detail & Related papers (2024-11-21T18:59:51Z)
- FreeCompose: Generic Zero-Shot Image Composition with Diffusion Prior [50.0535198082903]
We offer a novel approach to image composition, which integrates multiple input images into a single, coherent image.
We showcase the potential of utilizing the powerful generative prior inherent in large-scale pre-trained diffusion models to accomplish generic image composition.
arXiv Detail & Related papers (2024-07-06T03:35:43Z)
- DiffPop: Plausibility-Guided Object Placement Diffusion for Image Composition [13.341996441742374]
DiffPop is a framework that learns the scale and spatial relations among multiple objects and the corresponding scene image.
We develop a human-in-the-loop pipeline which exploits human labeling on the diffusion-generated composite images.
Our dataset and code will be released.
arXiv Detail & Related papers (2024-06-12T03:40:17Z)
- Steered Diffusion: A Generalized Framework for Plug-and-Play Conditional Image Synthesis [62.07413805483241]
Steered Diffusion is a framework for zero-shot conditional image generation using a diffusion model trained for unconditional generation.
We present experiments using steered diffusion on several tasks including inpainting, colorization, text-guided semantic editing, and image super-resolution.
arXiv Detail & Related papers (2023-09-30T02:03:22Z)
- DragDiffusion: Harnessing Diffusion Models for Interactive Point-based Image Editing [94.24479528298252]
DragGAN is an interactive point-based image editing framework that achieves impressive editing results with pixel-level precision.
By harnessing large-scale pretrained diffusion models, we greatly enhance the applicability of interactive point-based editing on both real and diffusion-generated images.
We present a challenging benchmark dataset called DragBench to evaluate the performance of interactive point-based image editing methods.
arXiv Detail & Related papers (2023-06-26T06:04:09Z)
- Training-free Diffusion Model Adaptation for Variable-Sized Text-to-Image Synthesis [45.19847146506007]
Diffusion models (DMs) have recently gained attention with state-of-the-art performance in text-to-image synthesis.
This paper focuses on adapting text-to-image diffusion models to handle varying image sizes while maintaining visual fidelity.
arXiv Detail & Related papers (2023-06-14T17:23:07Z)
- Conditional Generation from Unconditional Diffusion Models using Denoiser Representations [94.04631421741986]
We propose adapting pre-trained unconditional diffusion models to new conditions using the learned internal representations of the denoiser network.
We show that augmenting the Tiny ImageNet training set with synthetic images generated by our approach improves the classification accuracy of ResNet baselines by up to 8%.
arXiv Detail & Related papers (2023-06-02T20:09:57Z)
- Person Image Synthesis via Denoising Diffusion Model [116.34633988927429]
We show how denoising diffusion models can be applied for high-fidelity person image synthesis.
Our results on two large-scale benchmarks and a user study demonstrate the photorealism of our proposed approach under challenging scenarios.
arXiv Detail & Related papers (2022-11-22T18:59:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the accuracy of this information and is not responsible for any consequences of its use.