CONFIG: Controllable Neural Face Image Generation
- URL: http://arxiv.org/abs/2005.02671v3
- Date: Mon, 19 Oct 2020 10:13:56 GMT
- Title: CONFIG: Controllable Neural Face Image Generation
- Authors: Marek Kowalski, Stephan J. Garbin, Virginia Estellers, Tadas Baltrušaitis, Matthew Johnson, Jamie Shotton
- Abstract summary: ConfigNet is a neural face model that allows for controlling individual aspects of output images in meaningful ways.
Our novel method uses synthetic data to factorize the latent space into elements that correspond to the inputs of a traditional rendering pipeline.
- Score: 10.443563719622645
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Our ability to sample realistic natural images, particularly faces, has
advanced by leaps and bounds in recent years, yet our ability to exert
fine-tuned control over the generative process has lagged behind. If this new
technology is to find practical uses, we need to achieve a level of control
over generative networks which, without sacrificing realism, is on par with
that seen in computer graphics and character animation. To this end we propose
ConfigNet, a neural face model that allows for controlling individual aspects
of output images in semantically meaningful ways and that is a significant step
on the path towards finely-controllable neural rendering. ConfigNet is trained
on real face images as well as synthetic face renders. Our novel method uses
synthetic data to factorize the latent space into elements that correspond to
the inputs of a traditional rendering pipeline, separating aspects such as head
pose, facial expression, hair style, illumination, and many others which are
very hard to annotate in real data. The real images, which are presented to the
network without labels, extend the variety of the generated images and
encourage realism. Finally, we propose an evaluation criterion using an
attribute detection network combined with a user study and demonstrate
state-of-the-art individual control over attributes in the output images.
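To make the factorization idea concrete, here is a minimal sketch (not the authors' implementation) of a latent space partitioned into named factors, where controlling one attribute amounts to swapping a single sub-vector between two latent codes. The factor names and sizes below are illustrative assumptions, not ConfigNet's actual layout.

```python
# Sketch: a latent vector built as a concatenation of named sub-vectors,
# each tied to one input of a traditional rendering pipeline.
import torch

# Hypothetical factor layout; the real model's factors and sizes differ.
FACTORS = {"head_pose": 6, "expression": 32, "hair_style": 64,
           "illumination": 16, "identity": 128}
OFFSETS, _start = {}, 0
for name, size in FACTORS.items():
    OFFSETS[name] = (_start, _start + size)
    _start += size
LATENT_DIM = _start

def swap_factor(z_target: torch.Tensor, z_source: torch.Tensor,
                factor: str) -> torch.Tensor:
    """Copy one semantic factor from z_source into z_target."""
    lo, hi = OFFSETS[factor]
    z_out = z_target.clone()
    z_out[:, lo:hi] = z_source[:, lo:hi]
    return z_out

if __name__ == "__main__":
    z_a = torch.randn(1, LATENT_DIM)   # latent of image A
    z_b = torch.randn(1, LATENT_DIM)   # latent of image B
    # Give image A the illumination of image B; all other factors unchanged.
    z_mix = swap_factor(z_a, z_b, "illumination")
    assert torch.equal(z_mix[:, :OFFSETS["illumination"][0]],
                       z_a[:, :OFFSETS["illumination"][0]])
```

Decoding the mixed code would then change only the chosen attribute, which is the kind of individual control the evaluation above measures.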
Related papers
- PUG: Photorealistic and Semantically Controllable Synthetic Data for Representation Learning [31.81199165450692]
We present a new generation of interactive environments for representation learning research that offer both controllability and realism.
We use the Unreal Engine, a powerful game engine well known in the entertainment industry, to produce PUG environments and datasets for representation learning.
arXiv Detail & Related papers (2023-08-08T01:33:13Z)
- SARGAN: Spatial Attention-based Residuals for Facial Expression Manipulation [1.7056768055368383]
We present a novel method named SARGAN that addresses the limitations of prior facial expression manipulation methods from three perspectives.
We exploit a symmetric encoder-decoder network to attend to facial features at multiple scales.
Our proposed model performs significantly better than state-of-the-art methods.
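As an illustration only, the sketch below shows one plausible reading of a symmetric encoder-decoder with spatial attention on multi-scale skip connections and a residual output; the block design and channel sizes are assumptions, not SARGAN's actual architecture.

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """Per-pixel gate over a feature map (one possible reading of
    'spatial attention' on a skip connection; not the paper's exact block)."""
    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Sequential(nn.Conv2d(channels, 1, 1), nn.Sigmoid())

    def forward(self, x):
        return x * self.gate(x)

class AttentionEncoderDecoder(nn.Module):
    """Symmetric encoder-decoder with attended skips at two scales."""
    def __init__(self):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(3, 32, 3, 2, 1), nn.ReLU())
        self.enc2 = nn.Sequential(nn.Conv2d(32, 64, 3, 2, 1), nn.ReLU())
        self.att1, self.att2 = SpatialAttention(32), SpatialAttention(64)
        self.dec2 = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.ReLU())
        self.dec1 = nn.ConvTranspose2d(64, 3, 4, 2, 1)  # 32 + 32 skip channels

    def forward(self, x):
        e1 = self.enc1(x)                 # 1/2 resolution
        e2 = self.enc2(e1)                # 1/4 resolution
        d2 = self.dec2(self.att2(e2))     # attended deep features
        d1 = torch.cat([d2, self.att1(e1)], dim=1)  # attended shallow skip
        return torch.tanh(self.dec1(d1)) + x        # residual output

if __name__ == "__main__":
    out = AttentionEncoderDecoder()(torch.randn(1, 3, 64, 64))
    print(out.shape)  # torch.Size([1, 3, 64, 64])
```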
arXiv Detail & Related papers (2023-03-30T08:15:18Z)
- ImaginaryNet: Learning Object Detectors without Real Images and Annotations [66.30908705345973]
We propose a framework to synthesize images by combining pretrained language model and text-to-image model.
With the synthesized images and class labels, weakly supervised object detection can then be leveraged to accomplish Imaginary-Supervised Object Detection (ISOD).
Experiments show that ImaginaryNet can obtain about 70% of the performance in ISOD compared with the weakly supervised counterpart of the same backbone trained on real data.
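The pipeline can be sketched with stand-in functions. The stubs below (language_model_prompt, text_to_image) are hypothetical placeholders for the pretrained models and only illustrate how class labels flow through prompting, synthesis, and image-level labeling.

```python
# High-level sketch of an ImaginaryNet-style data pipeline (stubs only).
import random

CLASSES = ["dog", "bicycle", "car"]  # hypothetical target categories

def language_model_prompt(label: str) -> str:
    # Stand-in for a pretrained language model that expands a class label
    # into a full scene description.
    templates = ["a photo of a {} in the street", "a {} on the grass"]
    return random.choice(templates).format(label)

def text_to_image(prompt: str) -> str:
    # Stand-in for a pretrained text-to-image model; returns a fake "image".
    return f"<image generated from: {prompt!r}>"

def build_imaginary_dataset(n_per_class: int = 2) -> list:
    # Each synthesized image inherits the class label used to prompt it,
    # giving image-level labels for weakly supervised object detection.
    dataset = []
    for label in CLASSES:
        for _ in range(n_per_class):
            prompt = language_model_prompt(label)
            dataset.append({"image": text_to_image(prompt), "label": label})
    return dataset

if __name__ == "__main__":
    for sample in build_imaginary_dataset(1):
        print(sample["label"], "->", sample["image"])
```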
arXiv Detail & Related papers (2022-10-13T10:25:22Z)
- 3DMM-RF: Convolutional Radiance Fields for 3D Face Modeling [111.98096975078158]
We introduce a style-based generative network that synthesizes in one pass all and only the required rendering samples of a neural radiance field.
We show that this model can accurately be fit to "in-the-wild" facial images of arbitrary pose and illumination, extract the facial characteristics, and be used to re-render the face in controllable conditions.
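Fitting such a model to an in-the-wild image typically follows a generic analysis-by-synthesis loop: optimize a latent code so the decoded image matches the target. The toy decoder below is a stand-in for the paper's radiance-field generator, not its actual model.

```python
# Generic latent-fitting sketch (analysis by synthesis), with toy sizes.
import torch
import torch.nn as nn

decoder = nn.Sequential(nn.Linear(64, 256), nn.ReLU(),
                        nn.Linear(256, 3 * 16 * 16))  # toy "renderer"
target = torch.rand(3 * 16 * 16)                      # toy target image

z = torch.zeros(64, requires_grad=True)               # latent to fit
optim = torch.optim.Adam([z], lr=1e-2)
for step in range(200):
    optim.zero_grad()
    loss = torch.nn.functional.mse_loss(decoder(z), target)
    loss.backward()
    optim.step()
# After fitting, z encodes the target's characteristics and can be edited
# and re-decoded under new conditions (pose, illumination, etc.).
```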
arXiv Detail & Related papers (2022-09-15T15:28:45Z)
- Realistic Full-Body Anonymization with Surface-Guided GANs [7.37907896341367]
We propose a new anonymization method that generates realistic humans for in-the-wild images.
A key part of our design is to guide adversarial nets by dense pixel-to-surface correspondences between an image and a canonical 3D surface.
We demonstrate that surface guidance significantly improves image quality and diversity of samples, yielding a highly practical generator.
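One simple way to realize surface guidance, sketched below under the assumption of a 2-channel UV correspondence map (e.g. from DensePose), is to concatenate the correspondences to the generator input so every pixel knows its location on the canonical surface. This is a simplification for illustration, not the paper's architecture.

```python
import torch
import torch.nn as nn

class SurfaceGuidedGenerator(nn.Module):
    """Toy generator conditioned on dense pixel-to-surface correspondences,
    here given as a 2-channel UV map per pixel."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 + 2, 32, 3, 1, 1), nn.ReLU(),  # image + UV channels
            nn.Conv2d(32, 3, 3, 1, 1), nn.Tanh())

    def forward(self, masked_image, uv_map):
        # The UV map tells the network where each pixel lies on the canonical
        # 3D body surface, guiding synthesis of the anonymized region.
        return self.net(torch.cat([masked_image, uv_map], dim=1))

if __name__ == "__main__":
    g = SurfaceGuidedGenerator()
    img = torch.randn(1, 3, 64, 64)   # person region masked out upstream
    uv = torch.rand(1, 2, 64, 64)     # dense correspondences per pixel
    print(g(img, uv).shape)           # torch.Size([1, 3, 64, 64])
```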
arXiv Detail & Related papers (2022-01-06T18:57:59Z)
- VariTex: Variational Neural Face Textures [0.0]
VariTex is a method that learns a variational latent feature space of neural face textures.
To generate images of complete human heads, we propose an additive decoder that generates plausible additional details such as hair.
The resulting method can generate geometrically consistent images of novel identities allowing fine-grained control over head pose, face shape, and facial expressions.
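A toy version of the two ideas, a variational texture code plus an additive detail head, might look like the following; the dimensions and head design are illustrative assumptions, not VariTex's actual decoder.

```python
import torch
import torch.nn as nn

class AdditiveTextureDecoder(nn.Module):
    """Toy variational texture space with an additive decoder: one head
    decodes the base face texture, a second head adds further details
    (e.g. hair); the outputs are summed."""
    def __init__(self, z_dim: int = 32, out: int = 3 * 16 * 16):
        super().__init__()
        self.mu = nn.Linear(128, z_dim)
        self.logvar = nn.Linear(128, z_dim)
        self.face_head = nn.Linear(z_dim, out)
        self.detail_head = nn.Linear(z_dim, out)  # additive details

    def forward(self, h):
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparam.
        return self.face_head(z) + self.detail_head(z), mu, logvar

if __name__ == "__main__":
    tex, mu, logvar = AdditiveTextureDecoder()(torch.randn(1, 128))
    print(tex.shape)  # torch.Size([1, 768])
```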
arXiv Detail & Related papers (2021-04-13T07:47:53Z)
- Style and Pose Control for Image Synthesis of Humans from a Single Monocular View [78.6284090004218]
StylePoseGAN augments a non-controllable generator to accept conditioning of pose and appearance separately.
Our network can be trained in a fully supervised way with human images to disentangle pose, appearance and body parts.
StylePoseGAN achieves state-of-the-art image generation fidelity on common perceptual metrics.
arXiv Detail & Related papers (2021-02-22T18:50:47Z)
- Generating Person Images with Appearance-aware Pose Stylizer [66.44220388377596]
We present a novel end-to-end framework to generate realistic person images based on given person poses and appearances.
The core of our framework is a novel generator called Appearance-aware Pose Stylizer (APS) which generates human images by coupling the target pose with the conditioned person appearance progressively.
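Progressive coupling of pose and appearance can be sketched as per-scale feature modulation: upsample pose features and inject appearance through a predicted scale and shift (AdaIN-style). This is one plausible reading for illustration, not the paper's exact APS block.

```python
import torch
import torch.nn as nn

class AppearanceModulatedBlock(nn.Module):
    """One progressive stage: upsample pose features, then modulate them
    with scale/shift predicted from the appearance code."""
    def __init__(self, ch: int, app_dim: int):
        super().__init__()
        self.up = nn.ConvTranspose2d(ch, ch, 4, 2, 1)
        self.to_scale = nn.Linear(app_dim, ch)
        self.to_shift = nn.Linear(app_dim, ch)

    def forward(self, x, app):
        x = torch.relu(self.up(x))
        s = self.to_scale(app)[:, :, None, None]
        b = self.to_shift(app)[:, :, None, None]
        return x * (1 + s) + b  # couple pose features with appearance

if __name__ == "__main__":
    pose_feat = torch.randn(1, 16, 8, 8)   # encoded target pose
    app_code = torch.randn(1, 64)          # encoded person appearance
    block = AppearanceModulatedBlock(16, 64)
    out = block(pose_feat, app_code)       # repeated at every resolution
    print(out.shape)  # torch.Size([1, 16, 16, 16])
```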
arXiv Detail & Related papers (2020-07-17T15:58:05Z)
- Intrinsic Autoencoders for Joint Neural Rendering and Intrinsic Image Decomposition [67.9464567157846]
We propose an autoencoder for joint generation of realistic images from synthetic 3D models while simultaneously decomposing real images into their intrinsic shape and appearance properties.
Our experiments confirm that a joint treatment of rendering and decomposition is indeed beneficial and that our approach outperforms state-of-the-art image-to-image translation baselines both qualitatively and quantitatively.
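The joint treatment can be sketched as two mappings trained together, an image-to-intrinsics decomposer and an intrinsics-to-image renderer, with a reconstruction loss on real images and a cycle loss on synthetic intrinsics; the linear layers below are toy stand-ins for the actual networks.

```python
import torch
import torch.nn as nn

D = 3 * 16 * 16  # toy flattened image size

decompose = nn.Linear(D, 64)      # image -> intrinsic code (shape+appearance)
render = nn.Linear(64, D)         # intrinsic code -> image

real = torch.rand(8, D)           # toy batch of real images
synth_code = torch.randn(8, 64)   # intrinsics from synthetic 3D models

opt = torch.optim.Adam(list(decompose.parameters()) +
                       list(render.parameters()), lr=1e-3)
mse = nn.functional.mse_loss

# One joint step: (a) autoencode real images through the intrinsic space,
# (b) cycle synthetic intrinsics through rendering and decomposition.
opt.zero_grad()
loss = mse(render(decompose(real)), real) \
     + mse(decompose(render(synth_code)), synth_code)
loss.backward()
opt.step()
```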
arXiv Detail & Related papers (2020-06-29T12:53:58Z)
- Learning Inverse Rendering of Faces from Real-world Videos [52.313931830408386]
Existing methods decompose a face image into three components (albedo, normal, and illumination) by supervised training on synthetic data.
We propose a weakly supervised training approach to train our model on real face videos, based on the assumption of consistency of albedo and normal.
Our network is trained on both real and synthetic data, benefiting from both.
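The consistency assumption translates directly into a loss: intrinsics predicted for different frames of the same video should agree, since the face's albedo and normals do not change while pose and lighting do. A minimal sketch of such a loss:

```python
import torch

def albedo_consistency_loss(albedo_t: torch.Tensor,
                            albedo_t1: torch.Tensor) -> torch.Tensor:
    """Penalize differences between albedo maps predicted for two frames
    of the same video; the same idea applies to normals."""
    return torch.nn.functional.l1_loss(albedo_t, albedo_t1)

if __name__ == "__main__":
    a0 = torch.rand(1, 3, 32, 32)            # predicted albedo, frame t
    a1 = a0 + 0.01 * torch.randn_like(a0)    # predicted albedo, frame t+1
    print(albedo_consistency_loss(a0, a1).item())
```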
arXiv Detail & Related papers (2020-03-26T17:26:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.