SemanticStyleGAN: Learning Compositional Generative Priors for
Controllable Image Synthesis and Editing
- URL: http://arxiv.org/abs/2112.02236v2
- Date: Tue, 7 Dec 2021 09:38:43 GMT
- Title: SemanticStyleGAN: Learning Compositional Generative Priors for
Controllable Image Synthesis and Editing
- Authors: Yichun Shi, Xiao Yang, Yangyue Wan, Xiaohui Shen
- Abstract summary: StyleGANs provide promising prior models for downstream tasks on image synthesis and editing.
We present SemanticStyleGAN, where a generator is trained to model local semantic parts separately and synthesizes images in a compositional way.
- Score: 35.02841064647306
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent studies have shown that StyleGANs provide promising prior models for
downstream tasks on image synthesis and editing. However, since the latent
codes of StyleGANs are designed to control global styles, it is hard to achieve
a fine-grained control over synthesized images. We present SemanticStyleGAN,
where a generator is trained to model local semantic parts separately and
synthesizes images in a compositional way. The structure and texture of
different local parts are controlled by corresponding latent codes.
Experimental results demonstrate that our model provides a strong
disentanglement between different spatial areas. When combined with editing
methods designed for StyleGANs, it can achieve a more fine-grained control to
edit synthesized or real images. The model can also be extended to other
domains via transfer learning. Thus, as a generic prior model with built-in
disentanglement, it could facilitate the development of GAN-based applications
and enable more potential downstream tasks.
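The compositional idea described in the abstract can be illustrated with a minimal, purely hypothetical sketch: each semantic part has its own latent code and a small "local generator" producing a feature map plus an alpha (mask) logit map, and the parts are fused pixel-wise with a softmax over the alpha logits. The random linear generators, dimensions, and fusion rule below are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
N_PARTS, LATENT_DIM, H, W = 3, 8, 16, 16

# Hypothetical per-part "local generators": fixed random linear maps from a
# part-specific latent code to a feature map and an alpha (mask) logit map.
W_feat = rng.normal(size=(N_PARTS, LATENT_DIM, H * W))
W_alpha = rng.normal(size=(N_PARTS, LATENT_DIM, H * W))

def local_generator(part, z):
    """Map one part's latent code to a (feature map, alpha logit map) pair."""
    feat = (z @ W_feat[part]).reshape(H, W)
    alpha = (z @ W_alpha[part]).reshape(H, W)
    return feat, alpha

def compose(latents):
    """Fuse per-part outputs with a pixel-wise softmax over alpha logits."""
    feats, alphas = zip(*(local_generator(k, z) for k, z in enumerate(latents)))
    feats, alphas = np.stack(feats), np.stack(alphas)   # (N_PARTS, H, W)
    masks = np.exp(alphas - alphas.max(axis=0))
    masks /= masks.sum(axis=0)                          # soft part assignment per pixel
    return (masks * feats).sum(axis=0), masks

latents = [rng.normal(size=LATENT_DIM) for _ in range(N_PARTS)]
img, masks = compose(latents)

# Resampling only part 0's latent changes the image mainly where its mask dominates,
# which is the kind of spatial disentanglement the abstract claims.
edited = list(latents)
edited[0] = rng.normal(size=LATENT_DIM)
img_edited, _ = compose(edited)
delta = np.abs(img_edited - img)
```

Because every pixel's value is a mask-weighted mixture of per-part features, swapping a single part's latent code is a localized edit, which is the mechanism behind the "fine-grained control" claim.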
Related papers
- In-Domain GAN Inversion for Faithful Reconstruction and Editability [132.68255553099834]
We propose in-domain GAN inversion, which consists of a domain-guided encoder and domain-regularized optimization to keep the inverted code in the native latent space of the pre-trained GAN model.
We make comprehensive analyses on the effects of the encoder structure, the starting inversion point, as well as the inversion parameter space, and observe the trade-off between the reconstruction quality and the editing property.
arXiv Detail & Related papers (2023-09-25T08:42:06Z) - Spatial Steerability of GANs via Self-Supervision from Discriminator [123.27117057804732]
We propose a self-supervised approach to improve the spatial steerability of GANs without searching for steerable directions in the latent space.
Specifically, we design randomly sampled Gaussian heatmaps to be encoded into the intermediate layers of generative models as spatial inductive bias.
During inference, users can interact with the spatial heatmaps in an intuitive manner, enabling them to edit the output image by adjusting the scene layout, moving, or removing objects.
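The "randomly sampled Gaussian heatmaps" used as a spatial inductive bias can be sketched as follows; the function and parameters here are an illustrative assumption about what such a heatmap looks like, not the paper's implementation. Shifting the heatmap's center is the intuitive interaction the blurb describes: the generator is conditioned on the map, so moving the bump moves the corresponding content.

```python
import numpy as np

def gaussian_heatmap(h, w, cy, cx, sigma):
    """A 2D Gaussian bump centered at (cy, cx): the spatial prior a
    generator's intermediate layer could be conditioned on (illustrative)."""
    ys, xs = np.mgrid[0:h, 0:w]
    return np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2 * sigma ** 2))

# A randomly placed heatmap, as might be sampled during training; at inference,
# a user would instead set (cy, cx) to position an object.
rng = np.random.default_rng(1)
cy, cx = rng.integers(0, 32, size=2)
hm = gaussian_heatmap(32, 32, cy=cy, cx=cx, sigma=4.0)
```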
arXiv Detail & Related papers (2023-01-20T07:36:29Z) - LinkGAN: Linking GAN Latents to Pixels for Controllable Image Synthesis [104.26279487968839]
This work presents an easy-to-use regularizer for GAN training.
It helps explicitly link some axes of the latent space to a set of pixels in the synthesized image.
arXiv Detail & Related papers (2023-01-11T17:56:36Z) - Discovering Class-Specific GAN Controls for Semantic Image Synthesis [73.91655061467988]
We propose a novel method for finding spatially disentangled class-specific directions in the latent space of pretrained SIS models.
We show that the latent directions found by our method can effectively control the local appearance of semantic classes.
arXiv Detail & Related papers (2022-12-02T21:39:26Z) - SIAN: Style-Guided Instance-Adaptive Normalization for Multi-Organ
Histopathology Image Synthesis [63.845552349914186]
We propose a style-guided instance-adaptive normalization (SIAN) to synthesize realistic color distributions and textures for different organs.
Four phases (semantization, stylization, instantiation, and modulation) work together and are integrated into a generative network to embed image semantics, style, and instance-level boundaries.
arXiv Detail & Related papers (2022-09-02T16:45:46Z) - Delta-GAN-Encoder: Encoding Semantic Changes for Explicit Image Editing,
using Few Synthetic Samples [2.348633570886661]
We propose a novel method for learning to control any desired attribute in a pre-trained GAN's latent space.
We perform Sim2Real learning, relying on minimal samples to achieve an unlimited number of continuous, precise edits.
arXiv Detail & Related papers (2021-11-16T12:42:04Z) - StyleFusion: A Generative Model for Disentangling Spatial Segments [41.35834479560669]
We present StyleFusion, a new mapping architecture for StyleGAN.
StyleFusion takes as input a number of latent codes and fuses them into a single style code.
It provides fine-grained control over each region of the generated image.
arXiv Detail & Related papers (2021-07-15T16:35:21Z) - Decorating Your Own Bedroom: Locally Controlling Image Generation with
Generative Adversarial Networks [15.253043666814413]
We propose an effective approach, termed LoGAN, to support local editing of the output image.
We are able to seamlessly remove, insert, shift, and rotate the individual objects inside a room.
Our method can completely clear out a room and then refurnish it with customized furniture and styles.
arXiv Detail & Related papers (2021-05-18T01:31:49Z) - Navigating the GAN Parameter Space for Semantic Image Editing [35.622710993417456]
Generative Adversarial Networks (GANs) are an indispensable tool for visual editing.
In this paper, we significantly expand the range of visual effects achievable with the state-of-the-art models, like StyleGAN2.
arXiv Detail & Related papers (2020-11-27T15:38:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences of its use.