SeCGAN: Parallel Conditional Generative Adversarial Networks for Face
Editing via Semantic Consistency
- URL: http://arxiv.org/abs/2111.09298v1
- Date: Wed, 17 Nov 2021 18:54:58 GMT
- Authors: Jiaze Sun, Binod Bhattarai, Zhixiang Chen, Tae-Kyun Kim
- Abstract summary: We propose a label-guided cGAN for editing face images utilising semantic information without the need to specify target semantic masks.
SeCGAN has two branches of generators and discriminators operating in parallel, with one trained to translate RGB images and the other for semantic masks.
Our results on CelebA and CelebA-HQ demonstrate that our approach is able to generate facial images with more accurate attributes.
- Score: 50.04141606856168
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semantically guided conditional Generative Adversarial Networks (cGANs) have
become a popular approach for face editing in recent years. However, most
existing methods introduce semantic masks as direct conditional inputs to the
generator and often require the target masks to perform the corresponding
translation in the RGB space. We propose SeCGAN, a novel label-guided cGAN for
editing face images utilising semantic information without the need to specify
target semantic masks. During training, SeCGAN has two branches of generators
and discriminators operating in parallel, with one trained to translate RGB
images and the other for semantic masks. To bridge the two branches in a
mutually beneficial manner, we introduce a semantic consistency loss which
constrains both branches to have consistent semantic outputs. Whilst both
branches are required during training, the RGB branch is our primary network
and the semantic branch is not needed for inference. Our results on CelebA and
CelebA-HQ demonstrate that our approach is able to generate facial images with
more accurate attributes, outperforming competitive baselines in terms of
Target Attribute Recognition Rate whilst maintaining quality metrics such as
self-supervised Fréchet Inception Distance and Inception Score.
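The semantic consistency loss described in the abstract can be illustrated with a minimal sketch. Note the assumptions: the abstract does not give the exact formulation, so a pixel-wise L1 distance between the two branches' soft semantic maps is used here purely for illustration, and the function name is hypothetical.

```python
import numpy as np

def semantic_consistency_loss(sem_from_rgb: np.ndarray,
                              sem_from_mask_branch: np.ndarray) -> float:
    """Toy semantic consistency loss: mean pixel-wise L1 distance between
    the semantic map derived from the RGB branch's output and the map
    produced by the semantic branch (soft label maps, shape H x W x C).
    The L1 metric is an assumption, not the paper's stated formulation."""
    assert sem_from_rgb.shape == sem_from_mask_branch.shape
    return float(np.mean(np.abs(sem_from_rgb - sem_from_mask_branch)))

# Identical semantic outputs incur no penalty; disagreement is penalised,
# which is what couples the two branches during training.
a = np.full((4, 4, 3), 1.0 / 3.0)   # uniform soft labels over 3 classes
b = a.copy()
b[0, 0] = [1.0, 0.0, 0.0]           # one pixel disagrees
```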
Related papers
- Contrastive Grouping with Transformer for Referring Image Segmentation [23.276636282894582]
We propose a mask classification framework, the Contrastive Grouping with Transformer network (CGFormer).
CGFormer explicitly captures object-level information via token-based querying and grouping strategy.
Experimental results demonstrate that CGFormer outperforms state-of-the-art methods in both segmentation and generalization settings consistently and significantly.
arXiv Detail & Related papers (2023-09-02T20:53:42Z)
- Object Segmentation by Mining Cross-Modal Semantics [68.88086621181628]
We propose a novel approach by mining the Cross-Modal Semantics to guide the fusion and decoding of multimodal features.
Specifically, we propose a novel network, termed XMSNet, consisting of (1) all-round attentive fusion (AF), (2) coarse-to-fine decoder (CFD), and (3) cross-layer self-supervision.
arXiv Detail & Related papers (2023-05-17T14:30:11Z)
- Wavelet-based Unsupervised Label-to-Image Translation [9.339522647331334]
We propose a new unsupervised paradigm for semantic image synthesis (USIS) that makes use of a self-supervised segmentation loss and whole-image wavelet-based discrimination.
We test our methodology on 3 challenging datasets and demonstrate its ability to bridge the performance gap between paired and unpaired models.
arXiv Detail & Related papers (2023-05-16T17:48:44Z) - Complementary Random Masking for RGB-Thermal Semantic Segmentation [63.93784265195356]
RGB-thermal semantic segmentation is a potential solution to achieve reliable semantic scene understanding in adverse weather and lighting conditions.
This paper proposes 1) a complementary random masking strategy of RGB-T images and 2) self-distillation loss between clean and masked input modalities.
We achieve state-of-the-art performance over three RGB-T semantic segmentation benchmarks.
arXiv Detail & Related papers (2023-03-30T13:57:21Z) - Side Adapter Network for Open-Vocabulary Semantic Segmentation [69.18441687386733]
This paper presents a new framework for open-vocabulary semantic segmentation with a pre-trained vision-language model, named the Side Adapter Network (SAN).
A side network is attached to a frozen CLIP model with two branches: one for predicting mask proposals, and the other for predicting attention bias.
Our approach significantly outperforms other counterparts, with up to 18 times fewer trainable parameters and 19 times faster inference speed.
arXiv Detail & Related papers (2023-02-23T18:58:28Z) - A Unified Architecture of Semantic Segmentation and Hierarchical
Generative Adversarial Networks for Expression Manipulation [52.911307452212256]
We develop a unified architecture of semantic segmentation and hierarchical GANs.
A unique advantage of our framework is that, on the forward pass, the semantic segmentation network conditions the generative model.
We evaluate our method on two challenging facial expression translation benchmarks, AffectNet and RaFD, and a semantic segmentation benchmark, CelebAMask-HQ.
arXiv Detail & Related papers (2021-12-08T22:06:31Z)
- GANSeg: Learning to Segment by Unsupervised Hierarchical Image Generation [16.900404701997502]
We propose a GAN-based approach that generates images conditioned on latent masks.
We show that such mask-conditioned image generation can be learned faithfully when conditioning the masks in a hierarchical manner.
It also lets us generate image-mask pairs for training a segmentation network, which outperforms the state-of-the-art unsupervised segmentation methods on established benchmarks.
arXiv Detail & Related papers (2021-12-02T07:57:56Z)
- Cross-domain Speech Recognition with Unsupervised Character-level Distribution Matching [60.8427677151492]
We propose CMatch, a Character-level distribution matching method to perform fine-grained adaptation between each character in two domains.
Experiments on the Libri-Adapt dataset show that our proposed approach achieves 14.39% and 16.50% relative Word Error Rate (WER) reduction on cross-device and cross-environment ASR, respectively.
arXiv Detail & Related papers (2021-04-15T14:36:54Z)
- Instance Semantic Segmentation Benefits from Generative Adversarial Networks [13.295723883560122]
We frame the problem of predicting masks as a GAN game.
A segmentation network generates the masks, and a discriminator network decides on the quality of the masks.
We report results on benchmarks spanning cellphone recycling, autonomous driving, large-scale object detection, and medical gland segmentation.
arXiv Detail & Related papers (2020-10-26T17:47:30Z)
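The GAN framing of mask prediction in the entry above can be sketched as a minimal toy. This is an assumption-laden illustration, not the paper's code: the discriminator is replaced here by a simple confidence proxy that rewards decisive (near-binary) masks, and the segmentation network ("generator") would minimise the negated score.

```python
import numpy as np

def discriminator_score(mask: np.ndarray) -> float:
    """Hypothetical mask-quality proxy standing in for a learned
    discriminator: 1.0 for a fully binary mask, 0.0 for a maximally
    uncertain all-0.5 mask (4*p*(1-p) peaks at p = 0.5)."""
    return float(np.mean(1.0 - 4.0 * mask * (1.0 - mask)))

def generator_loss(mask: np.ndarray) -> float:
    # The segmentation network plays the generator: it is trained to
    # maximise the discriminator's score, i.e. minimise its negation.
    return -discriminator_score(mask)

binary = np.array([[0.0, 1.0], [1.0, 0.0]])   # confident mask
uncertain = np.full((2, 2), 0.5)              # maximally uncertain mask
```

In the actual adversarial setup both networks are learned jointly; the fixed proxy above only mimics the direction of the training signal.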
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the accuracy of the information presented and is not responsible for any consequences arising from its use.