SD-GAN: Semantic Decomposition for Face Image Synthesis with Discrete
Attribute
- URL: http://arxiv.org/abs/2207.05300v1
- Date: Tue, 12 Jul 2022 04:23:38 GMT
- Title: SD-GAN: Semantic Decomposition for Face Image Synthesis with Discrete
Attribute
- Authors: Zhou Kangneng, Zhu Xiaobin, Gao Daiheng, Lee Kai, Li Xinjie, Yin
Xu-Cheng
- Abstract summary: We propose an innovative framework, dubbed SD-GAN, to tackle challenging facial discrete attribute synthesis via semantic decomposition.
The fusion network integrates 3D embedding for better identity preservation and discrete attribute synthesis.
We construct a large and valuable dataset, MEGN, to compensate for the lack of discrete attributes in existing datasets.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Manipulating latent codes in generative adversarial networks (GANs) for facial
image synthesis has mainly focused on continuous attribute synthesis (e.g., age,
pose and emotion), while discrete attribute synthesis (such as face masks and
eyeglasses) has received less attention. Directly applying existing methods to
discrete facial attributes can produce inaccurate results. In this work, we
propose an innovative framework, dubbed SD-GAN, to tackle challenging facial
discrete attribute synthesis via semantic decomposition. Concretely, we
explicitly decompose the discrete attribute representation into two components,
i.e., the semantic prior basis and the offset latent representation. The
semantic prior basis provides an initial direction for manipulating the face
representation in the latent space, while the offset latent representation,
obtained by a 3D-aware semantic fusion network, adjusts the prior basis. In
addition, the fusion network integrates 3D embedding for better identity
preservation and discrete attribute synthesis. The combination of the prior
basis and the offset latent representation enables our method to synthesize
photo-realistic face images with discrete attributes. Notably, we construct a
large and valuable dataset, MEGN (Face Mask and Eyeglasses images crawled from
Google and Naver), to compensate for the lack of discrete attributes in
existing datasets. Extensive qualitative
and quantitative experiments demonstrate the state-of-the-art performance of
our method. Our code is available at: https://github.com/MontaEllis/SD-GAN.
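The decomposition described in the abstract lends itself to a short sketch. The following is a minimal illustration under our own assumptions, not the authors' implementation: prior_basis, offset_net, and edit_latent are hypothetical stand-ins for the semantic prior basis, the 3D-aware semantic fusion network, and the resulting latent edit; a pretrained generator would consume the edited code.

```python
import numpy as np

# Minimal sketch (hypothetical names, not the SD-GAN code): a discrete
# attribute edit is modeled as a fixed semantic prior basis plus a
# per-image offset in the generator's latent space.

LATENT_DIM = 512
rng = np.random.default_rng(0)

# Semantic prior basis: a unit direction that initializes the manipulation
# (e.g., "add eyeglasses") in latent space.
prior_basis = rng.standard_normal(LATENT_DIM)
prior_basis /= np.linalg.norm(prior_basis)

def offset_net(w: np.ndarray) -> np.ndarray:
    """Stand-in for the 3D-aware semantic fusion network: predicts a
    per-image offset that adapts the shared prior basis to this face."""
    # A real network would condition on the image's 3D embedding; here we
    # return a small identity-dependent perturbation for illustration.
    return 0.1 * np.tanh(w)

def edit_latent(w: np.ndarray, strength: float = 1.0) -> np.ndarray:
    """Combine prior basis and offset into the edited latent code."""
    return w + strength * (prior_basis + offset_net(w))

w = rng.standard_normal(LATENT_DIM)      # latent code of a source face
w_edited = edit_latent(w, strength=1.5)  # code carrying the discrete attribute
# A pretrained generator G would then synthesize: image = G(w_edited)
```

The split mirrors the abstract's reasoning: the shared basis supplies a reliable global direction for the attribute, while the learned offset adapts that direction to each identity.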
Related papers
- Analyzing the Feature Extractor Networks for Face Image Synthesis [0.0]
This study investigates the behavior of diverse feature extractors -- InceptionV3, CLIP, DINOv2, and ArcFace -- across a variety of metrics -- FID, KID, and Precision & Recall.
Experiments include an in-depth analysis of the features: $L_2$ normalization, model attention during extraction, and domain distributions in the feature space.
arXiv Detail & Related papers (2024-06-04T09:41:40Z)
- When StyleGAN Meets Stable Diffusion: a $\mathscr{W}_+$ Adapter for Personalized Image Generation [60.305112612629465]
Text-to-image diffusion models have excelled in producing diverse, high-quality, and photo-realistic images.
We present a novel use of the extended StyleGAN embedding space $\mathcal{W}_+$ to achieve enhanced identity preservation and disentanglement for diffusion models.
Our method adeptly generates personalized text-to-image outputs that are not only compatible with prompt descriptions but also amenable to common StyleGAN editing directions.
arXiv Detail & Related papers (2023-11-29T09:05:14Z)
- Extracting Semantic Knowledge from GANs with Unsupervised Learning [65.32631025780631]
Generative Adversarial Networks (GANs) encode semantics in feature maps in a linearly separable form.
We propose a novel clustering algorithm, named KLiSH, which leverages the linear separability to cluster GAN's features.
KLiSH succeeds in extracting fine-grained semantics of GANs trained on datasets of various objects.
arXiv Detail & Related papers (2022-11-30T03:18:16Z)
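Since the KLiSH algorithm itself is not spelled out in the summary above, here is a generic stand-in (plain spherical k-means, explicitly not KLiSH) showing the underlying idea: when semantics are linearly separable in feature maps, clustering per-pixel feature vectors already yields an unsupervised part segmentation.

```python
import numpy as np

def cluster_gan_features(feats: np.ndarray, k: int = 6, iters: int = 20,
                         seed: int = 0) -> np.ndarray:
    """Generic stand-in for KLiSH (plain spherical k-means, not the actual
    algorithm): cluster per-pixel GAN feature vectors so each cluster acts
    as an unsupervised semantic segment.

    feats: (H, W, C) feature map from an intermediate generator layer.
    Returns an (H, W) integer label map."""
    h, w, c = feats.shape
    x = feats.reshape(-1, c)
    # Unit-normalize features; linearly separable semantics then separate
    # well under cosine similarity.
    x = x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-8)
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        labels = np.argmax(x @ centers.T, axis=1)  # nearest center by cosine
        for j in range(k):                         # recompute unit centers
            members = x[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
                centers[j] /= np.linalg.norm(centers[j]) + 1e-8
    return labels.reshape(h, w)

# Example with fake features standing in for a real GAN layer output:
demo = np.random.default_rng(1).standard_normal((16, 16, 64))
segmentation = cluster_gan_features(demo, k=4)
```

- One-Shot Synthesis of Images and Segmentation Masks [28.119303696418882]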
Joint synthesis of images and segmentation masks with generative adversarial networks (GANs) is promising to reduce the effort needed for collecting image data with pixel-wise annotations.
To learn high-fidelity image-mask synthesis, existing GAN approaches first need a pre-training phase requiring large amounts of image data.
We introduce our OSMIS model which enables the synthesis of segmentation masks that are precisely aligned to the generated images in the one-shot regime.
arXiv Detail & Related papers (2022-09-15T18:00:55Z)
- Semantic Image Synthesis via Diffusion Models [159.4285444680301]
Denoising Diffusion Probabilistic Models (DDPMs) have achieved remarkable success in various image generation tasks.
Recent work on semantic image synthesis mainly follows the de facto Generative Adversarial Nets (GANs).
arXiv Detail & Related papers (2022-06-30T18:31:51Z)
- USIS: Unsupervised Semantic Image Synthesis [9.613134538472801]
We propose a new Unsupervised paradigm for Semantic Image Synthesis (USIS).
USIS learns to output images with visually separable semantic classes using a self-supervised segmentation loss.
In order to match the color and texture distribution of real images without losing high-frequency information, we propose to use whole image wavelet-based discrimination.
arXiv Detail & Related papers (2021-09-29T20:48:41Z)
- Learned Spatial Representations for Few-shot Talking-Head Synthesis [68.3787368024951]
We propose a novel approach for few-shot talking-head synthesis.
We show that this disentangled representation leads to a significant improvement over previous methods.
arXiv Detail & Related papers (2021-04-29T17:59:42Z)
- You Only Need Adversarial Supervision for Semantic Image Synthesis [84.83711654797342]
We propose a novel, simplified GAN model, which needs only adversarial supervision to achieve high quality results.
We show that images synthesized by our model are more diverse and follow the color and texture of real images more closely.
arXiv Detail & Related papers (2020-12-08T23:00:48Z)
- InterFaceGAN: Interpreting the Disentangled Face Representation Learned by GANs [73.27299786083424]
We propose a framework called InterFaceGAN to interpret the disentangled face representation learned by state-of-the-art GAN models.
We first find that GANs learn various semantics in some linear subspaces of the latent space.
We then conduct a detailed study on the correlation between different semantics and manage to better disentangle them via subspace projection.
arXiv Detail & Related papers (2020-05-18T18:01:22Z)
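The subspace projection mentioned in the InterFaceGAN summary has a compact form. Below is our hedged reconstruction with hypothetical variable names (see the paper for the exact formulation): to edit one semantic without disturbing a correlated one, remove from the primal editing direction its component along the conditioned attribute's hyperplane normal.

```python
import numpy as np

def project_away(primal: np.ndarray, condition: np.ndarray) -> np.ndarray:
    """Subspace projection in the spirit of InterFaceGAN: strip from the
    primal editing direction its component along a correlated attribute's
    direction, so the edit leaves that attribute's hyperplane score fixed."""
    condition = condition / np.linalg.norm(condition)
    decorrelated = primal - np.dot(primal, condition) * condition
    return decorrelated / np.linalg.norm(decorrelated)

rng = np.random.default_rng(0)
n_age = rng.standard_normal(512)      # normal of a hypothetical "age" hyperplane
n_glasses = rng.standard_normal(512)  # normal of an "eyeglasses" hyperplane

# Edit age while keeping eyeglasses untouched: w' = w + alpha * n_age_only
n_age_only = project_away(n_age, n_glasses)
assert abs(np.dot(n_age_only, n_glasses / np.linalg.norm(n_glasses))) < 1e-8
```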
This list is automatically generated from the titles and abstracts of the papers on this site.