Semantic-shape Adaptive Feature Modulation for Semantic Image Synthesis
- URL: http://arxiv.org/abs/2203.16898v1
- Date: Thu, 31 Mar 2022 09:06:04 GMT
- Title: Semantic-shape Adaptive Feature Modulation for Semantic Image Synthesis
- Authors: Zhengyao Lv, Xiaoming Li, Zhenxing Niu, Bing Cao, Wangmeng Zuo
- Abstract summary: A fine-grained part-level semantic layout benefits the generation of object details.
A Shape-aware Position Descriptor (SPD) is proposed to describe each pixel's positional feature.
A Semantic-shape Adaptive Feature Modulation (SAFM) block is proposed to combine the given semantic map and our positional features.
- Score: 71.56830815617553
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent years have witnessed substantial progress in semantic image
synthesis, yet it remains challenging to synthesize photo-realistic images with
rich details. Most previous methods focus on exploiting the given semantic map,
which captures only an object-level layout of an image. Clearly, a fine-grained
part-level semantic layout benefits the generation of object details, and it
can be roughly inferred from an object's shape. In order to exploit the
part-level layouts, we propose a Shape-aware Position Descriptor (SPD) to
describe each pixel's positional feature, where object shape is explicitly
encoded into the SPD feature. Furthermore, a Semantic-shape Adaptive Feature
Modulation (SAFM) block is proposed to combine the given semantic map and our
positional features to produce adaptively modulated features. Extensive
experiments demonstrate that the proposed SPD and SAFM significantly improve
the generation of objects with rich details. Moreover, our method performs
favorably against state-of-the-art methods in quantitative and qualitative
evaluation. The source code and model are available at
https://github.com/cszy98/SAFM.
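For intuition only, here is a minimal, hypothetical sketch of the two ideas summarized above: a toy stand-in for the Shape-aware Position Descriptor (SPD) that encodes where each pixel sits inside its object's bounding box, and a SPADE-style modulation block that predicts per-pixel scale and shift from the semantic map concatenated with those positional features. This is not the authors' implementation (the official code is at https://github.com/cszy98/SAFM), and the actual SPD formulation and SAFM architecture in the paper differ in detail.

```python
# Illustrative sketch only, assuming a SPADE-style formulation; see the
# official repository for the real SPD/SAFM implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


def shape_aware_position_descriptor(instance_mask):
    """Toy stand-in for the SPD: for one binary object mask (H, W), return
    per-pixel features giving the pixel's normalized offset inside the
    object's bounding box. The paper's SPD encodes object shape more
    explicitly than this."""
    desc = torch.zeros(2, *instance_mask.shape)
    ys, xs = torch.nonzero(instance_mask, as_tuple=True)
    if ys.numel() == 0:
        return desc
    y0, y1 = ys.min(), ys.max()
    x0, x1 = xs.min(), xs.max()
    desc[0, ys, xs] = (ys - y0).float() / (y1 - y0 + 1).float()
    desc[1, ys, xs] = (xs - x0).float() / (x1 - x0 + 1).float()
    return desc


class SAFMBlockSketch(nn.Module):
    """SPADE-like block: predict per-pixel scale and shift from the semantic
    map concatenated with positional (SPD-like) features, then modulate the
    normalized activations."""

    def __init__(self, feat_ch, label_ch, pos_ch, hidden=128):
        super().__init__()
        self.norm = nn.InstanceNorm2d(feat_ch, affine=False)
        self.shared = nn.Sequential(
            nn.Conv2d(label_ch + pos_ch, hidden, 3, padding=1),
            nn.ReLU(inplace=True),
        )
        self.gamma = nn.Conv2d(hidden, feat_ch, 3, padding=1)
        self.beta = nn.Conv2d(hidden, feat_ch, 3, padding=1)

    def forward(self, x, semantic_map, pos_feat):
        # Resize the conditioning inputs to the feature resolution, then
        # compute spatially varying modulation parameters.
        cond = torch.cat([semantic_map, pos_feat], dim=1)
        cond = F.interpolate(cond, size=x.shape[2:], mode="nearest")
        h = self.shared(cond)
        return self.norm(x) * (1 + self.gamma(h)) + self.beta(h)
```

Because the scale and shift are predicted per pixel from both the semantic map and the shape-aware positional features, the modulation can vary within an object rather than being uniform per class, which is the intuition behind the reported gains in object detail.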
Related papers
- PLACE: Adaptive Layout-Semantic Fusion for Semantic Image Synthesis [62.29033292210752]
Generating high-quality images with consistent semantics and layout remains a challenge.
We propose the adaPtive LAyout-semantiC fusion modulE (PLACE) that harnesses pre-trained models to alleviate the aforementioned issues.
Our approach performs favorably in terms of visual quality, semantic consistency, and layout alignment.
arXiv Detail & Related papers (2024-03-04T09:03:16Z)
- Semantic Lens: Instance-Centric Semantic Alignment for Video Super-Resolution [36.48329560039897]
Inter-frame alignment is a critical cue for video super-resolution (VSR).
We introduce a novel paradigm for VSR named Semantic Lens.
Video is modeled as instances, events, and scenes via a Semantic Extractor.
arXiv Detail & Related papers (2023-12-13T01:16:50Z)
- SSMG: Spatial-Semantic Map Guided Diffusion Model for Free-form Layout-to-Image Generation [68.42476385214785]
We propose a novel Spatial-Semantic Map Guided (SSMG) diffusion model that adopts the feature map, derived from the layout, as guidance.
SSMG achieves superior generation quality with sufficient spatial and semantic controllability compared to previous works.
We also propose the Relation-Sensitive Attention (RSA) and Location-Sensitive Attention (LSA) mechanisms.
arXiv Detail & Related papers (2023-08-20T04:09:12Z)
- Inferring and Leveraging Parts from Object Shape for Improving Semantic Image Synthesis [64.05076727277431]
This paper proposes to infer Parts from Object ShapE (iPOSE) and leverage them to improve semantic image synthesis.
We learn a PartNet for predicting the object part map with the guidance of pre-defined support part maps.
Experiments show that our iPOSE not only generates objects with rich part details but also enables flexible control of image synthesis.
arXiv Detail & Related papers (2023-05-31T04:27:47Z)
- Retrieval-based Spatially Adaptive Normalization for Semantic Image Synthesis [68.1281982092765]
We propose a novel normalization module, termed REtrieval-based Spatially AdaptIve normaLization (RESAIL).
RESAIL provides pixel-level, fine-grained guidance to the normalization architecture.
Experiments on several challenging datasets show that RESAIL performs favorably against state-of-the-art methods in terms of quantitative metrics, visual quality, and subjective evaluation.
arXiv Detail & Related papers (2022-04-06T14:21:39Z)
- Controllable Image Synthesis via SegVAE [89.04391680233493]
A semantic map is a commonly used intermediate representation for conditional image generation.
In this work, we specifically target generating semantic maps given a label set consisting of desired categories.
The proposed framework, SegVAE, synthesizes semantic maps in an iterative manner using a conditional variational autoencoder.
arXiv Detail & Related papers (2020-07-16T15:18:53Z)
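As a rough illustration of the SegVAE idea above, the following is a minimal sketch of a conditional VAE that decodes semantic-map logits from a latent code and a label-set condition. The class name, tensor sizes, and the single-shot decoding are assumptions for illustration; the actual SegVAE generates the map iteratively, class by class, and its architecture differs.

```python
# Hypothetical, minimal conditional-VAE sketch for label-set-conditioned
# semantic map generation; NOT the SegVAE architecture from the paper.
import torch
import torch.nn as nn


class CondSemanticMapVAE(nn.Module):
    def __init__(self, num_classes=20, latent_dim=64, map_size=64):
        super().__init__()
        self.num_classes, self.map_size = num_classes, map_size
        in_dim = num_classes * map_size * map_size + num_classes  # one-hot map + label-set
        self.encoder = nn.Sequential(nn.Linear(in_dim, 512), nn.ReLU())
        self.to_mu = nn.Linear(512, latent_dim)
        self.to_logvar = nn.Linear(512, latent_dim)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim + num_classes, 512), nn.ReLU(),
            nn.Linear(512, num_classes * map_size * map_size),
        )

    def forward(self, semantic_map, label_set):
        # semantic_map: (B, C, H, W) one-hot; label_set: (B, C) multi-hot condition.
        b = semantic_map.size(0)
        h = self.encoder(torch.cat([semantic_map.flatten(1), label_set], dim=1))
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        logits = self.decoder(torch.cat([z, label_set], dim=1))
        return logits.view(b, self.num_classes, self.map_size, self.map_size), mu, logvar
```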
This list is automatically generated from the titles and abstracts of the papers on this site.