Biphasic Face Photo-Sketch Synthesis via Semantic-Driven Generative
Adversarial Network with Graph Representation Learning
- URL: http://arxiv.org/abs/2201.01592v2
- Date: Sun, 3 Dec 2023 07:15:16 GMT
- Title: Biphasic Face Photo-Sketch Synthesis via Semantic-Driven Generative
Adversarial Network with Graph Representation Learning
- Authors: Xingqun Qi, Muyi Sun, Zijian Wang, Jiaming Liu, Qi Li, Fang Zhao,
Shanghang Zhang, Caifeng Shan
- Abstract summary: We propose a novel Semantic-Driven Generative Adversarial Network to address the above issues.
Considering that human faces have distinct spatial structures, we first inject class-wise semantic layouts into the generator.
We construct two types of representational graphs via semantic parsing maps upon input faces, dubbed the IntrA-class Semantic Graph (IASG) and the InteR-class Structure Graph (IRSG).
- Score: 40.544844623958426
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Biphasic face photo-sketch synthesis has significant practical value in
wide-ranging fields such as digital entertainment and law enforcement. Previous
approaches directly generate the photo or sketch in a global view; they therefore suffer
from low sketch quality and complex photo variations, leading to unnatural and
low-fidelity results. In this paper, we propose a novel Semantic-Driven Generative
Adversarial Network, cooperating with graph representation learning, to address
these issues. Considering that human faces
have distinct spatial structures, we first inject class-wise semantic layouts
into the generator to provide style-based spatial information for synthesized
face photos and sketches. Additionally, to enhance the authenticity of details
in generated faces, we construct two types of representational graphs via
semantic parsing maps upon input faces, dubbed the IntrA-class Semantic Graph
(IASG) and the InteR-class Structure Graph (IRSG). Specifically, the IASG
effectively models the intra-class semantic correlations of each facial
semantic component, thus producing realistic facial details. To keep the
generated faces structurally coordinated, the IRSG models the inter-class
structural relations among all facial components via graph representation
learning. To further enhance the perceptual quality of synthesized images, we
present a biphasic interactive cycle training strategy by fully taking
advantage of the multi-level feature consistency between the photo and sketch.
Extensive experiments demonstrate that our method outperforms the
state-of-the-art competitors on the CUFS and CUFSF datasets.
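To make the graph-construction step concrete, below is a minimal, illustrative sketch (not the authors' released code) of how intra-class and inter-class graphs could be built from a face parsing map: node features are sampled or pooled per semantic class from a generator feature map, and pairwise affinities between them form the adjacency matrices. The tensor names (`feat`, `parsing`), the pixel-sampling scheme, and the softmax affinities are assumptions made only for illustration.

```python
# Illustrative sketch only, not the authors' implementation. Assumed inputs:
#   feat    : torch.Tensor of shape (C, H, W), a generator feature map
#   parsing : torch.LongTensor of shape (H, W), per-pixel class labels 0..K-1
import torch
import torch.nn.functional as F


def region_descriptors(feat, parsing, num_classes):
    """Masked average pooling: one C-dim descriptor per semantic class (K x C)."""
    C, H, W = feat.shape
    flat = feat.reshape(C, H * W)          # C x N
    labels = parsing.reshape(H * W)        # N
    desc = torch.zeros(num_classes, C)
    for k in range(num_classes):
        mask = labels == k
        if mask.any():
            desc[k] = flat[:, mask].mean(dim=1)
    return desc


def intra_class_graph(feat, parsing, cls_id, num_nodes=32):
    """IASG-style sketch: affinity graph over pixels sampled inside one class region."""
    C, H, W = feat.shape
    flat = feat.reshape(C, H * W)
    idx = (parsing.reshape(H * W) == cls_id).nonzero(as_tuple=True)[0]
    if idx.numel() == 0:
        return None, None
    sel = idx[torch.randperm(idx.numel())[:num_nodes]]   # sample node pixels
    nodes = F.normalize(flat[:, sel].t(), dim=1)          # num_nodes x C
    adj = torch.softmax(nodes @ nodes.t(), dim=1)         # row-normalized affinity
    return nodes, adj


def inter_class_graph(feat, parsing, num_classes):
    """IRSG-style sketch: one node per facial component, affinities from descriptors."""
    desc = F.normalize(region_descriptors(feat, parsing, num_classes), dim=1)
    adj = torch.softmax(desc @ desc.t(), dim=1)
    return desc, adj


# Toy usage: a 64-channel feature map and a 7-class parsing map
# (e.g., skin, brows, eyes, nose, lips, hair, background).
feat = torch.randn(64, 128, 128)
parsing = torch.randint(0, 7, (128, 128))
nodes, a_intra = intra_class_graph(feat, parsing, cls_id=1)
desc, a_inter = inter_class_graph(feat, parsing, num_classes=7)
```

In the paper, graph-convolution-style propagation over such adjacencies would then refine the node features before they are fused back into the generator; the abstract does not specify the exact layers, so they are omitted here.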
Related papers
- Dual Advancement of Representation Learning and Clustering for Sparse and Noisy Images [14.836487514037994]
Sparse and noisy images (SNIs) pose significant challenges for effective representation learning and clustering.
We propose Dual Advancement of Representation Learning and Clustering (DARLC) to enhance the representations derived from masked image modeling.
Our framework offers a comprehensive approach that improves the learning of representations by enhancing their local perceptibility, distinctiveness, and the understanding of relational semantics.
arXiv Detail & Related papers (2024-09-03T10:52:27Z)
- Masked Contrastive Graph Representation Learning for Age Estimation [44.96502862249276]
This paper exploits the ability of graph representation learning to handle redundant image information.
We propose a novel Masked Contrastive Graph Representation Learning (MCGRL) method for age estimation.
Experimental results on real-world face image datasets demonstrate the superiority of our proposed method over other state-of-the-art age estimation approaches.
arXiv Detail & Related papers (2023-06-16T15:53:21Z)
- GM-NeRF: Learning Generalizable Model-based Neural Radiance Fields from Multi-view Images [79.39247661907397]
We introduce an effective framework, Generalizable Model-based Neural Radiance Fields, to synthesize free-viewpoint images.
Specifically, we propose a geometry-guided attention mechanism to register the appearance code from multi-view 2D images to a geometry proxy.
arXiv Detail & Related papers (2023-03-24T03:32:02Z)
- General Facial Representation Learning in a Visual-Linguistic Manner [45.92447707178299]
We introduce a framework, called FaRL, for general Facial Representation Learning in a visual-linguistic manner.
We show that FaRL achieves better transfer performance compared with previous pre-trained models.
Our model surpasses the state-of-the-art methods on face analysis tasks including face parsing and face alignment.
arXiv Detail & Related papers (2021-12-06T15:22:05Z)
- Enhancing Social Relation Inference with Concise Interaction Graph and Discriminative Scene Representation [56.25878966006678]
We propose an approach for PRactical Inference in Social rElation (PRISE).
It concisely learns interactive features of persons and discriminative features of holistic scenes.
PRISE achieves a 6.8% improvement for domain classification on the PIPA dataset.
arXiv Detail & Related papers (2021-07-30T04:20:13Z)
- Face Sketch Synthesis via Semantic-Driven Generative Adversarial Network [10.226808267718523]
We propose a novel Semantic-Driven Generative Adversarial Network (SDGAN) which embeds global structure-level style injection and local class-level knowledge re-weighting.
Specifically, we conduct facial saliency detection on the input face photos to provide overall facial texture structure.
In addition, we exploit face parsing layouts as a semantic-level spatial prior to enforce global structure-level style injection in the generator of SDGAN.
arXiv Detail & Related papers (2021-06-29T07:03:56Z)
- InterFaceGAN: Interpreting the Disentangled Face Representation Learned by GANs [73.27299786083424]
We propose a framework called InterFaceGAN to interpret the disentangled face representation learned by state-of-the-art GAN models.
We first find that GANs learn various semantics in some linear subspaces of the latent space.
We then conduct a detailed study on the correlation between different semantics and manage to better disentangle them via subspace projection.
arXiv Detail & Related papers (2020-05-18T18:01:22Z)
- Joint Deep Learning of Facial Expression Synthesis and Recognition [97.19528464266824]
We propose a novel joint deep learning of facial expression synthesis and recognition method for effective FER.
The proposed method involves a two-stage learning procedure. Firstly, a facial expression synthesis generative adversarial network (FESGAN) is pre-trained to generate facial images with different facial expressions.
In order to alleviate the problem of data bias between the real images and the synthetic images, we propose an intra-class loss with a novel real data-guided back-propagation (RDBP) algorithm.
arXiv Detail & Related papers (2020-02-06T10:56:00Z)
- Hierarchy Composition GAN for High-fidelity Image Synthesis [57.32311953820988]
This paper presents an innovative Hierarchical Composition GAN (HIC-GAN).
HIC-GAN incorporates image synthesis in geometry and appearance domains into an end-to-end trainable network.
Experiments on scene text image synthesis, portrait editing and indoor rendering tasks show that the proposed HIC-GAN achieves superior synthesis performance qualitatively and quantitatively.
arXiv Detail & Related papers (2019-05-12T11:11:24Z)