Retrieval Augmented Image Harmonization
- URL: http://arxiv.org/abs/2412.13916v1
- Date: Wed, 18 Dec 2024 14:56:03 GMT
- Title: Retrieval Augmented Image Harmonization
- Authors: Haolin Wang, Ming Liu, Zifei Yan, Chao Zhou, Longan Xiao, Wangmeng Zuo
- Abstract summary: This paper presents a retrieval-augmented image harmonization (Raiha) framework.
It seeks proper reference images to reduce the ill-posedness and restricts the attention to better utilize the useful information.
With the Raiha framework, image harmonization performance is greatly boosted under both non-reference and retrieval-augmented settings.
- Score: 47.377530051943275
- License:
- Abstract: When embedding objects (foreground) into images (background), photography conditions such as illumination usually make it necessary to perform image harmonization so that the foreground object matches the background image in terms of brightness, color, etc. Although existing image harmonization methods have made continuous efforts toward visually pleasing results, they are still plagued by two main issues. First, image harmonization becomes highly ill-posed when the background contains no content similar to the foreground object, making the harmonization results unreliable. Second, even when similar content is available, the harmonization process is often disturbed by irrelevant areas, mainly owing to an insufficient understanding of image content and inaccurate attention. As a remedy, we present a retrieval-augmented image harmonization (Raiha) framework, which seeks proper reference images to reduce the ill-posedness and restricts the attention to better utilize the useful information. Specifically, an efficient retrieval method is designed to find reference images that contain objects similar to the foreground and whose illumination is consistent with the background. To train the Raiha framework to effectively utilize the reference information, a data augmentation strategy is carefully designed by leveraging existing non-reference image harmonization datasets. In addition, image content priors are introduced to ensure reasonable attention. With the presented Raiha framework, image harmonization performance is greatly boosted under both non-reference and retrieval-augmented settings. The source code and pre-trained models will be publicly available.
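The retrieval criterion in the abstract (reference images whose objects resemble the foreground and whose illumination agrees with the background) can be illustrated with a minimal sketch. This is not the authors' retrieval method, which the abstract does not detail. The feature vectors, the `alpha` weighting, and the names `cosine` and `retrieve_references` are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_references(fg_content, bg_illum, candidates, top_k=1, alpha=0.5):
    """Rank candidate references by a weighted sum of two similarities.

    Hypothetical scoring, not taken from the paper:
      - content similarity between the foreground and the candidate's object
      - illumination similarity between the background and the candidate
    `candidates` is a list of (name, content_vector, illumination_vector).
    """
    scored = []
    for name, content_vec, illum_vec in candidates:
        score = (alpha * cosine(fg_content, content_vec)
                 + (1 - alpha) * cosine(bg_illum, illum_vec))
        scored.append((score, name))
    scored.sort(reverse=True)  # highest combined score first
    return [name for _, name in scored[:top_k]]


# Toy descriptors (assumed): 2-D content and illumination vectors.
candidates = [
    ("same_object_wrong_light", [1.0, 0.0], [0.0, 1.0]),
    ("same_object_same_light", [1.0, 0.1], [1.0, 0.0]),
    ("other_object_same_light", [0.0, 1.0], [1.0, 0.0]),
]
best = retrieve_references([1.0, 0.0], [1.0, 0.0], candidates)
print(best)
```

In this toy setup, a candidate that matches on only one of the two criteria scores lower than one that matches on both, which is the behavior the abstract asks of a proper reference image.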
Related papers
- Consistent Human Image and Video Generation with Spatially Conditioned Diffusion [82.4097906779699]
Consistent human-centric image and video synthesis aims to generate images with new poses while preserving appearance consistency with a given reference image.
We frame the task as a spatially-conditioned inpainting problem, where the target image is in-painted to maintain appearance consistency with the reference.
This approach enables the reference features to guide the generation of pose-compliant targets within a unified denoising network.
arXiv Detail & Related papers (2024-12-19T05:02:30Z)
- Learning Flow Fields in Attention for Controllable Person Image Generation [59.10843756343987]
Controllable person image generation aims to generate a person image conditioned on reference images.
We propose learning flow fields in attention (Leffa), which explicitly guides the target query to attend to the correct reference key.
Leffa achieves state-of-the-art performance in controlling appearance (virtual try-on) and pose (pose transfer), significantly reducing fine-grained detail distortion.
arXiv Detail & Related papers (2024-12-11T15:51:14Z)
- Intrinsic Harmonization for Illumination-Aware Compositing [0.7366405857677227]
We introduce a self-supervised illumination harmonization approach formulated in the intrinsic image domain.
First, we estimate a simple global lighting model from mid-level vision representations to generate a rough shading for the foreground region.
A network then refines this inferred shading to generate a re-shading that aligns with the background scene.
arXiv Detail & Related papers (2023-12-06T18:59:03Z)
- Image Harmonization with Region-wise Contrastive Learning [51.309905690367835]
We propose a novel image harmonization framework with external style fusion and region-wise contrastive learning scheme.
Our method attempts to bring together corresponding positive and negative samples by maximizing the mutual information between the foreground and background styles.
arXiv Detail & Related papers (2022-05-27T15:46:55Z)
- Interactive Portrait Harmonization [99.15331091722231]
Current image harmonization methods consider the entire background as the guidance for harmonization.
We propose a new flexible framework that allows users to pick certain regions of the background image and use them to guide the harmonization.
Inspired by professional portrait harmonization users, we also introduce a new luminance matching loss to optimally match the color/luminance conditions between the composite foreground and the selected reference region.
arXiv Detail & Related papers (2022-03-15T19:30:34Z)
- SSH: A Self-Supervised Framework for Image Harmonization [97.16345684998788]
We propose a novel Self-Supervised Harmonization framework (SSH) that can be trained using just "free" natural images without being edited.
Our results show that the proposed SSH outperforms previous state-of-the-art methods in terms of reference metrics, visual quality, and subject user study.
arXiv Detail & Related papers (2021-08-15T19:51:33Z)
- Region-aware Adaptive Instance Normalization for Image Harmonization [14.77918186672189]
To acquire photo-realistic composite images, one must adjust the appearance and visual style of the foreground to be compatible with the background.
Existing deep learning methods for harmonizing composite images directly learn an image mapping network from the composite to the real one.
We propose a Region-aware Adaptive Instance Normalization (RAIN) module, which explicitly formulates the visual style from the background and adaptively applies it to the foreground.
arXiv Detail & Related papers (2021-06-05T09:57:17Z)
- BargainNet: Background-Guided Domain Translation for Image Harmonization [26.370523451625466]
An unharmonious foreground and background degrade the quality of a composite image.
Image harmonization, which adjusts the foreground to improve the consistency, is an essential yet challenging task.
We propose an image harmonization network with a novel domain code extractor and well-tailored triplet losses.
arXiv Detail & Related papers (2020-09-19T05:14:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.