Retrieval-based Spatially Adaptive Normalization for Semantic Image
Synthesis
- URL: http://arxiv.org/abs/2204.02854v1
- Date: Wed, 6 Apr 2022 14:21:39 GMT
- Title: Retrieval-based Spatially Adaptive Normalization for Semantic Image
Synthesis
- Authors: Yupeng Shi, Xiao Liu, Yuxiang Wei, Zhongqin Wu and Wangmeng Zuo
- Abstract summary: We propose a novel normalization module, termed REtrieval-based Spatially AdaptIve normaLization (RESAIL).
RESAIL provides pixel level fine-grained guidance to the normalization architecture.
Experiments on several challenging datasets show that RESAIL performs favorably against state-of-the-art methods in terms of quantitative metrics, visual quality, and subjective evaluation.
- Score: 68.1281982092765
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Semantic image synthesis is a challenging task with many practical
applications. Although remarkable progress has been made in semantic image
synthesis with spatially-adaptive normalization, existing methods normalize
the feature activations under coarse-level guidance (e.g., semantic class).
However, different parts of a semantic object (e.g., wheel and window of car)
are quite different in structures and textures, making blurry synthesis results
usually inevitable due to the lack of fine-grained guidance. In this paper,
we propose a novel normalization module, termed as REtrieval-based Spatially
AdaptIve normaLization (RESAIL), for introducing pixel level fine-grained
guidance to the normalization architecture. Specifically, we first present a
retrieval paradigm by finding a content patch of the same semantic class from
the training set with the most similar shape to each test semantic mask. Then,
RESAIL is presented to use the retrieved patch for guiding the feature
normalization of corresponding region, and can provide pixel level fine-grained
guidance, thereby greatly mitigating blurry synthesis results. Moreover,
distorted ground-truth images are also utilized as alternatives to
retrieval-based guidance for feature normalization, further benefiting model
training and improving visual quality of generated images. Experiments on
several challenging datasets show that our RESAIL performs favorably against
state-of-the-art methods in terms of quantitative metrics, visual quality, and
subjective evaluation. The source code and pre-trained models will be publicly
available.
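
The retrieval and guided-normalization steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how shape similarity is measured, so intersection-over-union (IoU) between binary masks is assumed here, and the fixed mapping from the retrieved patch to per-pixel scale and shift stands in for whatever learned layers RESAIL actually uses. All function names (`mask_iou`, `retrieve_patch`, `guided_normalize`) and the toy data are hypothetical.

```python
from statistics import mean, pstdev


def mask_iou(a, b):
    """IoU of two equally sized binary masks given as lists of rows of 0/1.
    Assumption: IoU is used as the shape-similarity measure."""
    inter = sum(x & y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    union = sum(x | y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return inter / union if union else 0.0


def retrieve_patch(query_mask, bank):
    """Return the (mask, patch) pair whose mask overlaps the query most.
    `bank` is assumed to hold training entries of one semantic class."""
    return max(bank, key=lambda entry: mask_iou(query_mask, entry[0]))


def guided_normalize(feature, guide, eps=1e-5):
    """Normalize a 2D feature map, then modulate it pixel by pixel.
    Hypothetical mapping: scale = 1 + guide, shift = guide. A real module
    would predict scale and shift from the guide with learned layers."""
    flat = [v for row in feature for v in row]
    mu, sigma = mean(flat), pstdev(flat)
    return [
        [(1 + g) * (v - mu) / (sigma + eps) + g for v, g in zip(f_row, g_row)]
        for f_row, g_row in zip(feature, guide)
    ]


# Toy data: a 2x2 query mask and two candidate (mask, patch) pairs.
query = [[1, 1], [0, 0]]
patch_a = [[0.2, 0.0], [0.2, 0.0]]
patch_b = [[0.5, 0.0], [0.0, 0.0]]
bank = [
    ([[1, 0], [1, 0]], patch_a),
    ([[1, 1], [0, 1]], patch_b),
]

best_mask, best_patch = retrieve_patch(query, bank)
feature = [[1.0, 3.0], [1.0, 3.0]]
modulated = guided_normalize(feature, best_patch)
```

Because the scale and shift vary per pixel with the retrieved patch, two pixels of the same semantic class can be modulated differently, which is the fine-grained guidance the abstract contrasts with class-level guidance.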
Related papers
- PLACE: Adaptive Layout-Semantic Fusion for Semantic Image Synthesis [62.29033292210752]
Synthesizing high-quality images with consistent semantics and layout remains a challenge.
We propose the adaPtive LAyout-semantiC fusion modulE (PLACE) that harnesses pre-trained models to alleviate the aforementioned issues.
Our approach performs favorably in terms of visual quality, semantic consistency, and layout alignment.
arXiv Detail & Related papers (2024-03-04T09:03:16Z)
- Segment Anything Model Meets Image Harmonization [13.415810438244788]
Image harmonization is a crucial technique in image composition that aims to seamlessly match the background by adjusting the foreground of composite images.
Current methods adopt either global-level or pixel-level feature matching.
We propose Semantic-guided Region-aware Instance Normalization (SRIN) that can utilize the semantic segmentation maps output by a pre-trained Segment Anything Model (SAM) to guide the visual consistency learning of foreground and background features.
arXiv Detail & Related papers (2023-12-20T02:57:21Z)
- Semantic Image Synthesis via Diffusion Models [159.4285444680301]
Denoising Diffusion Probabilistic Models (DDPMs) have achieved remarkable success in various image generation tasks.
Recent work on semantic image synthesis mainly follows the de facto Generative Adversarial Nets (GANs).
arXiv Detail & Related papers (2022-06-30T18:31:51Z)
- Controllable Person Image Synthesis with Spatially-Adaptive Warped Normalization [72.65828901909708]
Controllable person image generation aims to produce realistic human images with desirable attributes.
We introduce a novel Spatially-Adaptive Warped Normalization (SAWN), which integrates a learned flow-field to warp modulation parameters.
We propose a novel self-training part replacement strategy to refine the pretrained model for the texture-transfer task.
arXiv Detail & Related papers (2021-05-31T07:07:44Z)
- Semantically Adaptive Image-to-image Translation for Domain Adaptation of Semantic Segmentation [1.8275108630751844]
We address the problem of domain adaptation for semantic segmentation of street scenes.
Many state-of-the-art approaches focus on translating the source image while imposing that the result should be semantically consistent with the input.
We advocate that the image semantics can also be exploited to guide the translation algorithm.
arXiv Detail & Related papers (2020-09-02T16:16:50Z)
- Learning to Compose Hypercolumns for Visual Correspondence [57.93635236871264]
We introduce a novel approach to visual correspondence that dynamically composes effective features by leveraging relevant layers conditioned on the images to match.
The proposed method, dubbed Dynamic Hyperpixel Flow, learns to compose hypercolumn features on the fly by selecting a small number of relevant layers from a deep convolutional neural network.
arXiv Detail & Related papers (2020-07-21T04:03:22Z)
- A Flexible Framework for Designing Trainable Priors with Adaptive Smoothing and Game Encoding [57.1077544780653]
We introduce a general framework for designing and training neural network layers whose forward passes can be interpreted as solving non-smooth convex optimization problems.
We focus on convex games, solved by local agents represented by the nodes of a graph and interacting through regularization functions.
This approach is appealing for solving imaging problems, as it allows the use of classical image priors within deep models that are trainable end to end.
arXiv Detail & Related papers (2020-06-26T08:34:54Z)
- Panoptic-based Image Synthesis [32.82903428124024]
Conditional image synthesis serves various applications, from content editing to content generation.
We propose a panoptic aware image synthesis network to generate high fidelity and photorealistic images conditioned on panoptic maps.
arXiv Detail & Related papers (2020-04-21T20:40:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.