Identity-Guided Face Generation with Multi-modal Contour Conditions
- URL: http://arxiv.org/abs/2110.04854v1
- Date: Sun, 10 Oct 2021 17:08:22 GMT
- Title: Identity-Guided Face Generation with Multi-modal Contour Conditions
- Authors: Qingyan Bai, Weihao Xia, Fei Yin, Yujiu Yang
- Abstract summary: We propose a framework that takes the contour and an extra image specifying the identity as the inputs.
An identity encoder extracts the identity-related feature, accompanied by a main encoder to obtain the rough contour information.
Our method can produce photo-realistic results with 1024$\times$1024 resolution.
- Score: 15.84849740726513
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent face generation methods have tried to synthesize faces based on the
given contour condition, like a low-resolution image or a sketch. However, the
problem of identity ambiguity remains unsolved, which usually occurs when the
contour is too vague to provide reliable identity information (e.g., when its
resolution is extremely low). In this work, we propose a framework that takes
the contour and an extra image specifying the identity as the inputs, where the
contour can be of various modalities, including the low-resolution image,
sketch, and semantic label map. This task especially suits scenarios such as
tracking known criminals or making creative works for entertainment.
Concretely, we propose a novel dual-encoder architecture, in which an identity
encoder extracts the identity-related feature, accompanied by a main encoder to
obtain the rough contour information and further fuse all the information
together. The encoder output is iteratively fed into a pre-trained StyleGAN
generator until a satisfying result is obtained. To the best of our knowledge, this
is the first work that achieves identity-guided face generation conditioned on
multi-modal contour images. Moreover, our method can produce photo-realistic
results with 1024$\times$1024 resolution. Code will be available at
https://git.io/Jo4yh.
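As a rough illustration of the dual-encoder design described in the abstract, the sketch below pairs an identity encoder with a main (contour) encoder, fuses their features into a W+ style code, and repeatedly feeds the result through a frozen StyleGAN generator. All modules, dimensions, and the ReStyle-like refinement loop (re-encoding the current output each round) are assumptions for illustration, not the authors' implementation (which is at the repository linked above).

```python
# Hypothetical sketch of the dual-encoder idea; every architectural choice
# here is an illustrative assumption, not the paper's code.
import torch
import torch.nn as nn

class DualEncoder(nn.Module):
    def __init__(self, latent_dim=512, num_ws=18):
        super().__init__()
        # Identity branch: extracts identity-related features from the ID image.
        self.id_encoder = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, latent_dim),
        )
        # Main branch: extracts rough structure from the contour condition
        # (low-res image, sketch, or semantic label map).
        self.main_encoder = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, latent_dim),
        )
        # Fusion head: merges both features into a W+ code (num_ws x latent_dim).
        self.fuse = nn.Linear(2 * latent_dim, num_ws * latent_dim)
        self.num_ws, self.latent_dim = num_ws, latent_dim

    def forward(self, contour, id_image):
        f = torch.cat([self.main_encoder(contour), self.id_encoder(id_image)], dim=1)
        return self.fuse(f).view(-1, self.num_ws, self.latent_dim)

def iterative_generate(encoder, generator, contour, id_image, steps=5):
    """Feed the encoder output into a frozen StyleGAN generator, re-encoding the
    current result as the contour input for the next round (assumed refinement)."""
    current = contour
    for _ in range(steps):
        w_plus = encoder(current, id_image)
        current = generator(w_plus)  # assumed: generator maps W+ codes to images
    return current
```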
Related papers
- Fusion is all you need: Face Fusion for Customized Identity-Preserving Image Synthesis [7.099258248662009]
Text-to-image (T2I) models have significantly advanced the development of artificial intelligence.
However, existing T2I-based methods often struggle to accurately reproduce the appearance of individuals from a reference image.
We leverage the pre-trained UNet from Stable Diffusion to incorporate the target face image directly into the generation process.
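A minimal sketch of the general mechanism such diffusion-based methods rely on: injecting reference-face features as the key/value context of a cross-attention layer so a denoising UNet can attend to the target identity. This is plain PyTorch with assumed shapes; it is not the paper's Stable Diffusion code.

```python
# Generic identity injection via cross-attention; dimensions are assumptions.
import torch
import torch.nn as nn

class IdentityCrossAttention(nn.Module):
    def __init__(self, dim=320, ctx_dim=512, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, kdim=ctx_dim, vdim=ctx_dim,
                                          batch_first=True)

    def forward(self, unet_tokens, face_tokens):
        # unet_tokens: (B, N, dim) spatial features from a UNet block
        # face_tokens: (B, M, ctx_dim) features of the reference face image
        out, _ = self.attn(unet_tokens, face_tokens, face_tokens)
        return unet_tokens + out  # residual injection

x = torch.randn(1, 64, 320)     # dummy UNet features
face = torch.randn(1, 16, 512)  # dummy face-image embedding
print(IdentityCrossAttention()(x, face).shape)  # torch.Size([1, 64, 320])
```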
arXiv Detail & Related papers (2024-09-27T19:31:04Z)
- G2Face: High-Fidelity Reversible Face Anonymization via Generative and Geometric Priors [71.69161292330504]
Reversible face anonymization seeks to replace sensitive identity information in facial images with synthesized alternatives.
This paper introduces G²Face, which leverages both generative and geometric priors to enhance identity manipulation.
Our method outperforms existing state-of-the-art techniques in face anonymization and recovery, while preserving high data utility.
arXiv Detail & Related papers (2024-08-18T12:36:47Z)
- StableIdentity: Inserting Anybody into Anywhere at First Sight [57.99693188913382]
We propose StableIdentity, which allows identity-consistent recontextualization with just one face image.
We are the first to directly inject the identity learned from a single image into video/3D generation without finetuning.
arXiv Detail & Related papers (2024-01-29T09:06:15Z)
- HFORD: High-Fidelity and Occlusion-Robust De-identification for Face Privacy Protection [60.63915939982923]
Face de-identification is a practical way to solve the identity protection problem.
Existing face de-identification methods suffer from several problems.
We present a High-Fidelity and Occlusion-Robust De-identification (HFORD) method to deal with these issues.
arXiv Detail & Related papers (2023-11-15T08:59:02Z)
- Semantics-Guided Object Removal for Facial Images: with Broad Applicability and Robust Style Preservation [29.162655333387452]
Object removal and image inpainting in facial images is a task in which objects that occlude a facial image are specifically targeted, removed, and replaced by a properly reconstructed facial image.
Two approaches, one based on a U-Net and the other on a modulated generator, have been widely adopted for this task; each has unique advantages but also innate disadvantages.
Here, we propose Semantics-Guided Inpainting Network (SGIN) which itself is a modification of the modulated generator, aiming to take advantage of its advanced generative capability and preserve the high-fidelity details of the original image.
arXiv Detail & Related papers (2022-09-29T00:09:12Z)
- High-resolution Face Swapping via Latent Semantics Disentanglement [50.23624681222619]
We present a novel high-resolution face swapping method using the inherent prior knowledge of a pre-trained GAN model.
We explicitly disentangle the latent semantics by utilizing the progressive nature of the generator.
We extend our method to video face swapping by enforcing two spatio-temporal constraints on the latent space and the image space.
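A hedged sketch of the kind of latent-space mixing this line of work builds on: copy the coarse (structure/identity) W+ codes from the source face, keep the fine (appearance) codes of the target, and decode with a pre-trained generator. The split index and the use of an inversion encoder to obtain the codes are illustrative assumptions.

```python
# Style-code swapping in an assumed (B, 18, 512) W+ space.
import torch

def swap_latents(w_source, w_target, split=8):
    """w_*: W+ codes from a GAN inversion of each face (assumed available)."""
    w_mix = w_target.clone()
    w_mix[:, :split] = w_source[:, :split]  # structure/identity layers from source
    return w_mix

w_src, w_tgt = torch.randn(1, 18, 512), torch.randn(1, 18, 512)
swapped = swap_latents(w_src, w_tgt)  # would be decoded by a pre-trained StyleGAN
```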
arXiv Detail & Related papers (2022-03-30T00:33:08Z)
- Learning Disentangled Representation for One-shot Progressive Face Swapping [65.98684203654908]
We present FaceSwapper, a simple yet efficient method for one-shot face swapping based on Generative Adversarial Networks.
Our method consists of a disentangled representation module and a semantic-guided fusion module.
Our results show that our method achieves state-of-the-art results on benchmark datasets with fewer training samples.
arXiv Detail & Related papers (2022-03-24T11:19:04Z)
- ShapeEditer: a StyleGAN Encoder for Face Swapping [6.848723869850855]
We propose a novel encoder, called ShapeEditor, for high-resolution, realistic and high-fidelity face exchange.
Our key idea is to use an advanced pre-trained, high-quality random face image generator, i.e., StyleGAN, as the backbone.
To learn a mapping into the latent space of StyleGAN, we propose a set of self-supervised loss functions.
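A minimal sketch of what such self-supervised inversion losses can look like, assuming a pixel reconstruction term plus an identity term computed with a frozen face-recognition embedder; the weights and composition are illustrative, not the paper's exact objective.

```python
# Assumed loss composition for learning an encoder E with frozen generator G.
import torch
import torch.nn.functional as F

def inversion_loss(x, x_hat, id_embed, w_pix=1.0, w_id=0.5):
    """x: input face, x_hat: G(E(x)) reconstruction, id_embed: frozen ID network."""
    loss_pix = F.mse_loss(x_hat, x)
    loss_id = 1.0 - F.cosine_similarity(id_embed(x_hat), id_embed(x)).mean()
    return w_pix * loss_pix + w_id * loss_id
```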
arXiv Detail & Related papers (2021-06-26T09:38:45Z)
- Realistic Face Reenactment via Self-Supervised Disentangling of Identity and Pose [23.211318473026243]
We propose a self-supervised hybrid model (DAE-GAN) that learns how to reenact face naturally given large amounts of unlabeled videos.
Our approach combines two deforming autoencoders with the latest advances in conditional generation.
Experiment results demonstrate the superior quality of reenacted images and the flexibility of transferring facial movements between identities.
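For intuition, a loose sketch of the deforming-autoencoder ingredient: a decoder predicts a dense offset field from a pose code and warps an appearance image via grid sampling, separating deformation (pose) from appearance. Shapes and architecture are assumptions.

```python
# Assumed warp-based decoder; not the DAE-GAN implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DeformingDecoder(nn.Module):
    def __init__(self, latent_dim=128, size=64):
        super().__init__()
        self.size = size
        self.to_offsets = nn.Linear(latent_dim, size * size * 2)

    def forward(self, appearance_img, pose_code):
        B = appearance_img.shape[0]
        offsets = self.to_offsets(pose_code).view(B, self.size, self.size, 2)
        # Base identity grid in [-1, 1], perturbed by small predicted offsets.
        ys, xs = torch.meshgrid(torch.linspace(-1, 1, self.size),
                                torch.linspace(-1, 1, self.size), indexing="ij")
        grid = torch.stack([xs, ys], dim=-1).unsqueeze(0) + 0.1 * torch.tanh(offsets)
        return F.grid_sample(appearance_img, grid, align_corners=True)

out = DeformingDecoder()(torch.randn(1, 3, 64, 64), torch.randn(1, 128))
print(out.shape)  # torch.Size([1, 3, 64, 64])
```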
arXiv Detail & Related papers (2020-03-29T06:45:17Z)
- Fine-grained Image-to-Image Transformation towards Visual Recognition [102.51124181873101]
We aim at transforming an image with a fine-grained category to synthesize new images that preserve the identity of the input image.
We adopt a model based on generative adversarial networks to disentangle the identity related and unrelated factors of an image.
Experiments on the CompCars and Multi-PIE datasets demonstrate that our model preserves the identity of the generated images much better than the state-of-the-art image-to-image transformation models.
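An illustrative sketch (names and shapes assumed) of the disentangling idea: encode an image into identity-related and identity-unrelated codes, then synthesize by pairing the identity code of one image with the unrelated (e.g., pose) code of another.

```python
# Assumed factor-split encoder; the real model's architecture may differ.
import torch
import torch.nn as nn

class FactorizedEncoder(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, 2, 1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, 2, 1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, 2 * dim),
        )
        self.dim = dim

    def forward(self, x):
        z = self.features(x)
        return z[:, :self.dim], z[:, self.dim:]  # identity code, unrelated code

enc = FactorizedEncoder()
id_a, _ = enc(torch.randn(1, 3, 64, 64))    # identity from image A
_, pose_b = enc(torch.randn(1, 3, 64, 64))  # pose/viewpoint from image B
z_mix = torch.cat([id_a, pose_b], dim=1)    # would be decoded by the generator
```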
arXiv Detail & Related papers (2020-01-12T05:26:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.