Hierarchical Dynamic Image Harmonization
- URL: http://arxiv.org/abs/2211.08639v3
- Date: Sat, 6 May 2023 10:04:06 GMT
- Title: Hierarchical Dynamic Image Harmonization
- Authors: Haoxing Chen and Zhangxuan Gu and Yaohui Li and Jun Lan and Changhua
Meng and Weiqiang Wang and Huaxiong Li
- Abstract summary: We propose a hierarchical dynamic network (HDNet) to adapt features from local to global view for better feature transformation in efficient image harmonization.
The proposed HDNet significantly reduces the total model parameters by more than 80% compared to previous methods.
Notably, the HDNet achieves a 4% improvement in PSNR and a 19% reduction in MSE compared to the prior state-of-the-art methods.
- Score: 15.886047676987316
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Image harmonization is a critical task in computer vision, which aims to
adjust the foreground to make it compatible with the background. Recent works
mainly focus on using global transformations (i.e., normalization and color
curve rendering) to achieve visual consistency. However, these models ignore
local visual consistency and their huge model sizes limit their harmonization
ability on edge devices. In this paper, we propose a hierarchical dynamic
network (HDNet) to adapt features from local to global view for better feature
transformation in efficient image harmonization. Inspired by the success of
various dynamic models, local dynamic (LD) module and mask-aware global dynamic
(MGD) module are proposed in this paper. Specifically, LD matches local
representations between the foreground and background regions based on semantic
similarities, then adaptively adjusts every foreground local representation
according to the appearance of its $K$-nearest neighbor background regions. In
this way, LD can produce more realistic images at a more fine-grained level,
and simultaneously enjoy the characteristic of semantic alignment. The MGD
effectively applies distinct convolution to the foreground and background,
learning the representations of foreground and background regions as well as
their correlations to the global harmonization, facilitating local visual
consistency for the images much more efficiently. Experimental results
demonstrate that the proposed HDNet significantly reduces the total model
parameters by more than 80\% compared to previous methods, while still
attaining state-of-the-art performance on the popular iHarmony4 dataset.
Notably, the HDNet achieves a 4\% improvement in PSNR and a 19\% reduction in
MSE compared to the prior state-of-the-art methods.
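The LD module's core idea (matching each foreground local representation to its $K$-nearest background regions by semantic similarity, then adjusting its appearance toward those neighbors) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the feature shapes, cosine-similarity matching, and the fixed blending weight `alpha` are all assumptions made for clarity.

```python
import numpy as np

def local_dynamic_adjust(fg_feats, bg_feats, k=3, alpha=0.5):
    """Sketch of an LD-style step: for each foreground local feature,
    find its k most semantically similar background features (cosine
    similarity) and shift the foreground feature toward the mean
    appearance of those neighbors.

    fg_feats: (n_fg, d) foreground region features
    bg_feats: (n_bg, d) background region features
    """
    # Normalize rows so the dot product is cosine similarity.
    fg_n = fg_feats / np.linalg.norm(fg_feats, axis=1, keepdims=True)
    bg_n = bg_feats / np.linalg.norm(bg_feats, axis=1, keepdims=True)
    sim = fg_n @ bg_n.T                      # (n_fg, n_bg) similarity matrix
    idx = np.argsort(-sim, axis=1)[:, :k]    # indices of k nearest bg regions
    neighbors = bg_feats[idx]                # (n_fg, k, d) gathered neighbors
    # Blend each foreground feature with its neighbors' mean appearance.
    return (1 - alpha) * fg_feats + alpha * neighbors.mean(axis=1)
```

In the actual network the adjustment would be learned rather than a fixed blend, but the sketch shows why the module is both fine-grained (per-region) and semantically aligned (neighbors are chosen by feature similarity, not spatial proximity).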
Related papers
- DGNet: Dynamic Gradient-Guided Network for Water-Related Optics Image
Enhancement [77.0360085530701]
Underwater image enhancement (UIE) is a challenging task due to the complex degradation caused by underwater environments.
Previous methods often idealize the degradation process, and neglect the impact of medium noise and object motion on the distribution of image features.
Our approach utilizes predicted images to dynamically update pseudo-labels, adding a dynamic gradient to optimize the network's gradient space.
arXiv Detail & Related papers (2023-12-12T06:07:21Z)
- Learning Global-aware Kernel for Image Harmonization [55.614444363743765]
Image harmonization aims to solve the visual inconsistency problem in composited images by adaptively adjusting the foreground pixels with the background as references.
Existing methods employ local color transformation or region matching between foreground and background, which neglects the powerful proximity prior and distinguishes foreground and background independently, each as a whole part, for harmonization.
We propose a novel Global-aware Kernel Network (GKNet) to harmonize local regions with comprehensive consideration of long-distance background references.
arXiv Detail & Related papers (2023-05-19T13:49:02Z)
- Local-Global Transformer Enhanced Unfolding Network for Pan-sharpening [13.593522290577512]
Pan-sharpening aims to increase the spatial resolution of the low-resolution multispectral (LrMS) image with the guidance of the corresponding panchromatic (PAN) image.
Although deep learning (DL)-based pan-sharpening methods have achieved promising performance, most of them have a two-fold deficiency.
arXiv Detail & Related papers (2023-04-28T03:34:36Z)
- Efficient and Explicit Modelling of Image Hierarchies for Image Restoration [120.35246456398738]
We propose a mechanism to efficiently and explicitly model image hierarchies in the global, regional, and local range for image restoration.
Inspired by that, we propose the anchored stripe self-attention which achieves a good balance between the space and time complexity of self-attention.
Then we propose a new network architecture dubbed GRL to explicitly model image hierarchies in the Global, Regional, and Local range.
arXiv Detail & Related papers (2023-03-01T18:59:29Z)
- Intra-Source Style Augmentation for Improved Domain Generalization [21.591831983223997]
We propose an intra-source style augmentation (ISSA) method to improve domain generalization in semantic segmentation.
ISSA is model-agnostic and straightforwardly applicable with CNNs and Transformers.
It is also complementary to other domain generalization techniques, e.g., it improves the recent state-of-the-art solution RobustNet by 3% mIoU on Cityscapes to Dark Zürich.
arXiv Detail & Related papers (2022-10-18T21:33:25Z)
- FRIH: Fine-grained Region-aware Image Harmonization [49.420765789360836]
We propose a novel global-local two-stage framework for Fine-grained Region-aware Image Harmonization (FRIH).
Our algorithm achieves the best performance on iHarmony4 dataset (PSNR is 38.19 dB) with a lightweight model.
arXiv Detail & Related papers (2022-05-13T04:50:26Z)
- Low Light Image Enhancement via Global and Local Context Modeling [164.85287246243956]
We introduce a context-aware deep network for low-light image enhancement.
First, it features a global context module that models spatial correlations to find complementary cues over full spatial domain.
Second, it introduces a dense residual block that captures local context with a relatively large receptive field.
arXiv Detail & Related papers (2021-01-04T09:40:54Z)
- Image Fine-grained Inpainting [89.17316318927621]
We present a one-stage model that utilizes dense combinations of dilated convolutions to obtain larger and more effective receptive fields.
To better train this efficient generator, in addition to the frequently-used VGG feature matching loss, we design a novel self-guided regression loss.
We also employ a discriminator with local and global branches to ensure local-global contents consistency.
arXiv Detail & Related papers (2020-02-07T03:45:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences.