A Unified Architecture of Semantic Segmentation and Hierarchical
Generative Adversarial Networks for Expression Manipulation
- URL: http://arxiv.org/abs/2112.04603v1
- Date: Wed, 8 Dec 2021 22:06:31 GMT
- Title: A Unified Architecture of Semantic Segmentation and Hierarchical
Generative Adversarial Networks for Expression Manipulation
- Authors: Rumeysa Bodur, Binod Bhattarai, Tae-Kyun Kim
- Abstract summary: We develop a unified architecture of semantic segmentation and hierarchical GANs.
A unique advantage of our framework is that on forward pass the semantic segmentation network conditions the generative model.
We evaluate our method on two challenging facial expression translation benchmarks, AffectNet and RaFD, and a semantic segmentation benchmark, CelebAMask-HQ.
- Score: 52.911307452212256
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Editing facial expressions by only changing what we want is a long-standing
research problem in Generative Adversarial Networks (GANs) for image
manipulation. Most of the existing methods that rely only on a global generator
usually suffer from changing unwanted attributes along with the target
attributes. Recently, hierarchical networks that consist of both a global
network dealing with the whole image and multiple local networks focusing on
local parts are showing success. However, these methods extract local regions
by bounding boxes centred around the sparse facial key points which are
non-differentiable, inaccurate and unrealistic. Hence, the solution becomes
sub-optimal, introduces unwanted artefacts degrading the overall quality of the
synthetic images. Moreover, a recent study has shown strong correlation between
facial attributes and local semantic regions. To exploit this relationship, we
designed a unified architecture of semantic segmentation and hierarchical GANs.
A unique advantage of our framework is that on forward pass the semantic
segmentation network conditions the generative model, and on backward pass
gradients from hierarchical GANs are propagated to the semantic segmentation
network, which makes our framework an end-to-end differentiable architecture.
This allows both architectures to benefit from each other. To demonstrate its
advantages, we evaluate our method on two challenging facial expression
translation benchmarks, AffectNet and RaFD, and a semantic segmentation
benchmark, CelebAMask-HQ across two popular architectures, BiSeNet and UNet.
Our extensive quantitative and qualitative evaluations on both face semantic
segmentation and face expression manipulation tasks validate the effectiveness
of our work over existing state-of-the-art methods.
Related papers
- Generalize or Detect? Towards Robust Semantic Segmentation Under Multiple Distribution Shifts [56.57141696245328]
In open-world scenarios, where both novel classes and domains may exist, an ideal segmentation model should detect anomaly classes for safety.
Existing methods often struggle to distinguish between domain-level and semantic-level distribution shifts.
arXiv Detail & Related papers (2024-11-06T11:03:02Z) - Prompting Diffusion Representations for Cross-Domain Semantic
Segmentation [101.04326113360342]
diffusion-pretraining achieves extraordinary domain generalization results for semantic segmentation.
We introduce a scene prompt and a prompt randomization strategy to help further disentangle the domain-invariant information when training the segmentation head.
arXiv Detail & Related papers (2023-07-05T09:28:25Z) - CLIP the Gap: A Single Domain Generalization Approach for Object
Detection [60.20931827772482]
Single Domain Generalization tackles the problem of training a model on a single source domain so that it generalizes to any unseen target domain.
We propose to leverage a pre-trained vision-language model to introduce semantic domain concepts via textual prompts.
We achieve this via a semantic augmentation strategy acting on the features extracted by the detector backbone, as well as a text-based classification loss.
arXiv Detail & Related papers (2023-01-13T12:01:18Z) - Unsupervised Domain Adaptation for Semantic Segmentation using One-shot
Image-to-Image Translation via Latent Representation Mixing [9.118706387430883]
We propose a new unsupervised domain adaptation method for the semantic segmentation of very high resolution images.
An image-to-image translation paradigm is proposed, based on an encoder-decoder principle where latent content representations are mixed across domains.
Cross-city comparative experiments have shown that the proposed method outperforms state-of-the-art domain adaptation methods.
arXiv Detail & Related papers (2022-12-07T18:16:17Z) - Framework-agnostic Semantically-aware Global Reasoning for Segmentation [29.69187816377079]
We propose a component that learns to project image features into latent representations and reason between them.
Our design encourages the latent regions to represent semantic concepts by ensuring that the activated regions are spatially disjoint.
Our latent tokens are semantically interpretable and diverse and provide a rich set of features that can be transferred to downstream tasks.
arXiv Detail & Related papers (2022-12-06T21:42:05Z) - Region-Based Semantic Factorization in GANs [67.90498535507106]
We present a highly efficient algorithm to factorize the latent semantics learned by Generative Adversarial Networks (GANs) concerning an arbitrary image region.
Through an appropriately defined generalized Rayleigh quotient, we solve such a problem without any annotations or training.
Experimental results on various state-of-the-art GAN models demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2022-02-19T17:46:02Z) - More Separable and Easier to Segment: A Cluster Alignment Method for
Cross-Domain Semantic Segmentation [41.81843755299211]
We propose a new UDA semantic segmentation approach based on domain assumption closeness to alleviate the above problems.
Specifically, a prototype clustering strategy is applied to cluster pixels with the same semantic, which will better maintain associations among target domain pixels.
Experiments conducted on GTA5 and SYNTHIA proved the effectiveness of our method.
arXiv Detail & Related papers (2021-05-07T10:24:18Z) - Affinity Space Adaptation for Semantic Segmentation Across Domains [57.31113934195595]
In this paper, we address the problem of unsupervised domain adaptation (UDA) in semantic segmentation.
Motivated by the fact that source and target domain have invariant semantic structures, we propose to exploit such invariance across domains.
We develop two affinity space adaptation strategies: affinity space cleaning and adversarial affinity space alignment.
arXiv Detail & Related papers (2020-09-26T10:28:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.