FreePIH: Training-Free Painterly Image Harmonization with Diffusion Model
- URL: http://arxiv.org/abs/2311.14926v1
- Date: Sat, 25 Nov 2023 04:23:49 GMT
- Title: FreePIH: Training-Free Painterly Image Harmonization with Diffusion Model
- Authors: Ruibin Li, Jingcai Guo, Song Guo, Qihua Zhou, Jie Zhang
- Abstract summary: Our FreePIH method tames the denoising process as a plug-in module for foreground image style transfer.
We make use of multi-scale features to enforce the consistency of the content and stability of the foreground objects in the latent space.
Our method can surpass representative baselines by large margins.
- Score: 19.170302996189335
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper provides an efficient training-free painterly image harmonization
(PIH) method, dubbed FreePIH, that leverages only a pre-trained diffusion model
to achieve state-of-the-art harmonization results. Unlike existing methods that
require either training auxiliary networks or fine-tuning a large pre-trained
backbone, or both, to harmonize a foreground object with a painterly-style
background image, our FreePIH tames the denoising process as a plug-in module
for foreground image style transfer. Specifically, we find that the very last
few steps of the denoising (i.e., generation) process strongly correspond to
the stylistic information of images, and based on this, we propose to augment
the latent features of both the foreground and background images with Gaussians
for a direct denoising-based harmonization. To guarantee the fidelity of the
harmonized image, we make use of multi-scale features to enforce the
consistency of the content and stability of the foreground objects in the
latent space, and meanwhile, aligning both fore-/back-grounds with the same
style. Moreover, to accommodate the generation with more structural and
textural details, we further integrate text prompts to attend to the latent
features, hence improving the generation quality. Quantitative and qualitative
evaluations on the COCO and LAION-5B datasets demonstrate that our method can
surpass representative baselines by large margins.
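The core training-free idea (composite the foreground onto the painterly background in latent space, noise the result to a late timestep, then run only the remaining denoising steps so the background's style is imposed on the foreground) can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the paper's implementation: `denoise_step` is a stand-in for the pre-trained diffusion model, and FreePIH's multi-scale consistency and text-prompt components are omitted.

```python
import numpy as np

def make_alpha_bars(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Standard linear DDPM schedule: alpha_bar_t = prod_{s<=t} (1 - beta_s)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def noise_to_step(x0, t, alpha_bars, rng):
    """Forward diffusion q(x_t | x_0): scale the latent and add Gaussian noise."""
    a = alpha_bars[t]
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(a) * x0 + np.sqrt(1.0 - a) * eps

def harmonize(fg_latent, bg_latent, mask, t_start, alpha_bars, denoise_step, rng):
    """Paste the foreground into the background in latent space, noise the
    composite to a *late* timestep t_start (where mostly stylistic detail
    remains to be generated), then run only the last denoising steps so the
    background's style bleeds into the masked foreground region.  The clean
    background latent is blended back at the end for stability."""
    composite = mask * fg_latent + (1.0 - mask) * bg_latent
    x = noise_to_step(composite, t_start, alpha_bars, rng)
    for t in range(t_start, -1, -1):
        x = denoise_step(x, t)  # placeholder for the diffusion model's step
    return mask * x + (1.0 - mask) * bg_latent
```

With an identity `denoise_step`, the background region of the output is exactly the original background latent, while the foreground region carries the injected noise that a real diffusion model would convert into style-consistent detail.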
Related papers
- FreeEnhance: Tuning-Free Image Enhancement via Content-Consistent Noising-and-Denoising Process [120.91393949012014]
FreeEnhance is a framework for content-consistent image enhancement using off-the-shelf image diffusion models.
In the noising stage, FreeEnhance adds lighter noise to regions with higher frequency content, preserving the high-frequency patterns of the original image.
In the denoising stage, we present three target properties as constraints to regularize the predicted noise, enhancing images with high acutance and high visual quality.
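The frequency-dependent noising idea can be illustrated with a toy NumPy sketch; this is not FreeEnhance's actual implementation, and the Laplacian-based frequency estimate and function names are assumptions made for illustration: estimate local frequency with a discrete Laplacian, then scale the noise down where that estimate is high.

```python
import numpy as np

def highfreq_map(img):
    """Rough per-pixel frequency estimate: magnitude of a 4-neighbour
    Laplacian (periodic boundaries via np.roll), normalised to [0, 1]."""
    lap = (np.roll(img, 1, 0) + np.roll(img, -1, 0) +
           np.roll(img, 1, 1) + np.roll(img, -1, 1) - 4.0 * img)
    m = np.abs(lap)
    return m / (m.max() + 1e-8)

def content_aware_noise(img, sigma_max=0.2, rng=None):
    """Add *lighter* Gaussian noise where the frequency map is high, so fine
    textures of the original image survive the noising stage."""
    rng = rng or np.random.default_rng()
    w = 1.0 - highfreq_map(img)  # high frequency -> small noise weight
    return img + sigma_max * w * rng.standard_normal(img.shape)
```

On a flat image the frequency map is zero and full-strength noise is applied; on a checkerboard the map saturates near one and the image passes through almost unchanged.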
arXiv Detail & Related papers (2024-09-11T17:58:50Z)
- ZePo: Zero-Shot Portrait Stylization with Faster Sampling [61.14140480095604]
This paper presents an inversion-free portrait stylization framework based on diffusion models that accomplishes content and style feature fusion in merely four sampling steps.
We propose a feature merging strategy to amalgamate redundant features in Consistency Features, thereby reducing the computational load of attention control.
arXiv Detail & Related papers (2024-08-10T08:53:41Z)
- Coherent and Multi-modality Image Inpainting via Latent Space Optimization [61.99406669027195]
PILOT (inPainting via Latent OpTimization) is an optimization approach grounded in novel semantic centralization and background preservation losses.
Our method searches latent spaces capable of generating inpainted regions that exhibit high fidelity to user-provided prompts while maintaining coherence with the background.
arXiv Detail & Related papers (2024-07-10T19:58:04Z)
- DiffHarmony: Latent Diffusion Model Meets Image Harmonization [11.500358677234939]
Diffusion models have promoted the rapid development of image-to-image translation tasks.
Training latent diffusion models from scratch or fine-tuning large pre-trained ones is computationally intensive.
In this paper, we adapt a pre-trained latent diffusion model to the image harmonization task to generate harmonious but potentially blurry initial images.
arXiv Detail & Related papers (2024-04-09T09:05:23Z)
- Coarse-to-Fine Latent Diffusion for Pose-Guided Person Image Synthesis [65.7968515029306]
We propose a novel Coarse-to-Fine Latent Diffusion (CFLD) method for Pose-Guided Person Image Synthesis (PGPIS).
A perception-refined decoder is designed to progressively refine a set of learnable queries and extract semantic understanding of person images as a coarse-grained prompt.
arXiv Detail & Related papers (2024-02-28T06:07:07Z)
- Image Harmonization with Region-wise Contrastive Learning [51.309905690367835]
We propose a novel image harmonization framework with external style fusion and region-wise contrastive learning scheme.
Our method attempts to bring together corresponding positive and negative samples by maximizing the mutual information between the foreground and background styles.
arXiv Detail & Related papers (2022-05-27T15:46:55Z)
- SCS-Co: Self-Consistent Style Contrastive Learning for Image Harmonization [29.600429707123645]
We propose a self-consistent style contrastive learning scheme (SCS-Co) for image harmonization.
By dynamically generating multiple negative samples, our SCS-Co can learn more distortion knowledge and well regularize the generated harmonized image.
In addition, we propose a background-attentional adaptive instance normalization (BAIN) to achieve an attention-weighted background feature distribution.
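The idea behind an attention-weighted background feature distribution can be sketched as an AdaIN-style normalization in which each foreground feature vector attends to the background features and borrows their weighted statistics. This NumPy sketch is an illustrative assumption of that mechanism, not the BAIN formulation from the SCS-Co paper; the function name, shapes, and scaling are choices made here.

```python
import numpy as np

def attn_weighted_adain(fg_feats, bg_feats, eps=1e-5):
    """AdaIN-style restyling with attention-weighted background statistics.
    fg_feats: (N, C) foreground vectors, bg_feats: (M, C) background vectors.
    Each foreground vector attends over all background vectors; the attention
    distribution yields a per-foreground mean and std used for renormalization."""
    logits = fg_feats @ bg_feats.T / np.sqrt(fg_feats.shape[1])
    attn = np.exp(logits - logits.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)          # (N, M) attention weights
    mu = attn @ bg_feats                             # weighted background mean
    var = attn @ (bg_feats ** 2) - mu ** 2           # weighted background variance
    sigma = np.sqrt(np.maximum(var, 0.0) + eps)
    # standardize the foreground, then re-scale with the attended statistics
    fg_norm = (fg_feats - fg_feats.mean(0)) / (fg_feats.std(0) + eps)
    return sigma * fg_norm + mu
```

As a sanity check, when every background vector is identical, the attended statistics collapse to that vector's values and the output matches the background style exactly.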
arXiv Detail & Related papers (2022-04-29T09:22:01Z)
- SSH: A Self-Supervised Framework for Image Harmonization [97.16345684998788]
We propose a novel Self-Supervised Harmonization framework (SSH) that can be trained using just "free" natural images without being edited.
Our results show that the proposed SSH outperforms previous state-of-the-art methods in terms of reference metrics, visual quality, and a subjective user study.
arXiv Detail & Related papers (2021-08-15T19:51:33Z)