High Resolution Zero-Shot Domain Adaptation of Synthetically Rendered Face Images
- URL: http://arxiv.org/abs/2006.15031v1
- Date: Fri, 26 Jun 2020 15:00:04 GMT
- Title: High Resolution Zero-Shot Domain Adaptation of Synthetically Rendered Face Images
- Authors: Stephan J. Garbin, Marek Kowalski, Matthew Johnson, and Jamie Shotton
- Abstract summary: We propose an algorithm that matches a non-photorealistic, synthetically generated image to a latent vector of a pretrained StyleGAN2 model.
In contrast to most previous work, we require no synthetic training data.
This is the first algorithm of its kind to work at a resolution of 1K and represents a significant leap forward in visual realism.
- Score: 10.03187850132035
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generating photorealistic images of human faces at scale remains a
prohibitively difficult task using computer graphics approaches. This is
because such approaches require the simulation of light to be
photorealistic, which in turn requires physically accurate modelling of
geometry, materials, and light sources, for both the head and the
surrounding scene. Non-photorealistic renders, however, are increasingly
easy to produce. In contrast to computer
graphics approaches, generative models learned from more readily available 2D
image data have been shown to produce samples of human faces that are hard to
distinguish from real data. The process of learning usually corresponds to a
loss of control over the shape and appearance of the generated images. For
instance, even a simple disentangling task such as modifying the hair
independently of the face, which is trivial to accomplish in a computer
graphics approach, remains an open research question. In this work, we propose
an algorithm that matches a non-photorealistic, synthetically generated image
to a latent vector of a pretrained StyleGAN2 model which, in turn, maps the
vector to a photorealistic image of a person with the same pose, expression,
hair, and lighting. In contrast to most previous work, we require no synthetic
training data. To the best of our knowledge, this is the first algorithm of its
kind to work at a resolution of 1K and represents a significant leap forward in
visual realism.
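The core idea is a form of GAN inversion: the pretrained generator is held fixed while a latent vector is optimized so that the generated image matches a target render. Below is a minimal sketch of that general latent-matching loop, assuming a stand-in generator and a plain pixel loss; it is not the authors' actual algorithm, which targets a pretrained StyleGAN2 at 1K resolution with more sophisticated matching losses.

```python
# Minimal sketch of latent-space matching (GAN inversion) against a frozen
# generator. Illustrative only: ToyGenerator is a stand-in for a pretrained
# StyleGAN2, and plain MSE replaces the paper's matching losses.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyGenerator(nn.Module):
    """Placeholder generator mapping a latent vector to an RGB image."""
    def __init__(self, latent_dim=512, image_size=64):
        super().__init__()
        self.image_size = image_size
        self.net = nn.Sequential(
            nn.Linear(latent_dim, 3 * image_size * image_size),
            nn.Tanh(),
        )

    def forward(self, w):
        img = self.net(w)
        return img.view(-1, 3, self.image_size, self.image_size)

generator = ToyGenerator().eval()
for p in generator.parameters():
    p.requires_grad_(False)  # generator stays frozen; only the latent moves

# The non-photorealistic render to match (random stand-in data here).
target = torch.rand(1, 3, 64, 64) * 2 - 1

# Optimize a latent vector so the generated image matches the target.
w = torch.randn(1, 512, requires_grad=True)
optimizer = torch.optim.Adam([w], lr=0.05)

for step in range(200):
    optimizer.zero_grad()
    image = generator(w)
    loss = F.mse_loss(image, target)  # a perceptual loss would be used in practice
    loss.backward()
    optimizer.step()

# After optimization, the frozen generator maps `w` to an image resembling
# the target; with a real StyleGAN2 that image would be photorealistic.
```

Swapping in a real pretrained StyleGAN2 and a perceptual loss is what would make the matched image photorealistic while preserving pose, expression, hair, and lighting.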
Related papers
- Toward Human Understanding with Controllable Synthesis [3.6579002555961915]
Training robust 3D human pose and shape estimation methods requires diverse training images with accurate ground truth.
While BEDLAM demonstrates the potential of traditional procedural graphics to generate such data, the training images are clearly synthetic.
In contrast, generative image models produce highly realistic images but without ground truth.
arXiv Detail & Related papers (2024-11-13T14:54:47Z)
- GaussianHeads: End-to-End Learning of Drivable Gaussian Head Avatars from Coarse-to-fine Representations [54.94362657501809]
We propose a new method to generate highly dynamic and deformable human head avatars from multi-view imagery in real time.
At the core of our method is a hierarchical representation of head models that captures the complex dynamics of facial expressions and head movements.
We train this coarse-to-fine facial avatar model along with the head pose as a learnable parameter in an end-to-end framework.
arXiv Detail & Related papers (2024-09-18T13:05:43Z)
- Single-Shot Implicit Morphable Faces with Consistent Texture Parameterization [91.52882218901627]
We propose a novel method for constructing implicit 3D morphable face models that are both generalizable and intuitive for editing.
Our method improves on state-of-the-art methods in photo-realism, geometry, and expression accuracy.
arXiv Detail & Related papers (2023-05-04T17:58:40Z)
- Person Image Synthesis via Denoising Diffusion Model [116.34633988927429]
We show how denoising diffusion models can be applied to high-fidelity person image synthesis.
Our results on two large-scale benchmarks and a user study demonstrate the photorealism of our proposed approach under challenging scenarios.
arXiv Detail & Related papers (2022-11-22T18:59:50Z)
- Photorealism in Driving Simulations: Blending Generative Adversarial Image Synthesis with Rendering [0.0]
We introduce a hybrid generative neural graphics pipeline for improving the visual fidelity of driving simulations.
We form 2D semantic images from 3D scenery consisting of simple object models without textures.
These semantic images are then converted into photorealistic RGB images with a state-of-the-art Generative Adversarial Network (GAN) trained on real-world driving scenes; a minimal sketch of this semantic-to-RGB idea appears after this list.
arXiv Detail & Related papers (2020-07-31T03:25:17Z)
- Intrinsic Autoencoders for Joint Neural Rendering and Intrinsic Image Decomposition [67.9464567157846]
We propose an autoencoder for joint generation of realistic images from synthetic 3D models while simultaneously decomposing real images into their intrinsic shape and appearance properties.
Our experiments confirm that a joint treatment of rendering and decomposition is indeed beneficial and that our approach outperforms state-of-the-art image-to-image translation baselines both qualitatively and quantitatively.
arXiv Detail & Related papers (2020-06-29T12:53:58Z)
- Learning Neural Light Transport [28.9247002210861]
We present an approach for learning light transport in static and dynamic 3D scenes using a neural network.
We find that our model is able to produce photorealistic renderings of static and dynamic scenes.
arXiv Detail & Related papers (2020-06-05T13:26:05Z)
- State of the Art on Neural Rendering [141.22760314536438]
We focus on approaches that combine classic computer graphics techniques with deep generative models to obtain controllable and photo-realistic outputs.
This report is focused on the many important use cases for the described algorithms such as novel view synthesis, semantic photo manipulation, facial and body reenactment, relighting, free-viewpoint video, and the creation of photo-realistic avatars for virtual and augmented reality telepresence.
arXiv Detail & Related papers (2020-04-08T04:36:31Z)
- Learning Inverse Rendering of Faces from Real-world Videos [52.313931830408386]
Existing methods decompose a face image into three components (albedo, normal, and illumination) by supervised training on synthetic data.
We propose a weakly supervised training approach to train our model on real face videos, based on the assumption that albedo and normals remain consistent across frames.
Our network is trained on both real and synthetic data, benefiting from both.
arXiv Detail & Related papers (2020-03-26T17:26:40Z)
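The driving-simulation entry above describes a two-stage pipeline: render cheap 2D semantic label maps from untextured 3D scenery, then translate them into photorealistic RGB with a GAN trained on real driving scenes. The sketch below illustrates only the translation interface; the class count, names, and architecture are illustrative assumptions, not the paper's actual model.

```python
# Hedged sketch of semantic-map-to-RGB translation. The tiny convolutional
# generator here only demonstrates the input/output contract; a trained
# GAN generator would be needed for photorealistic results.
import torch
import torch.nn as nn

NUM_CLASSES = 8  # assumed number of semantic classes (road, car, sky, ...)

class SemanticToRGB(nn.Module):
    """Maps a one-hot semantic map (B, C, H, W) to an RGB image (B, 3, H, W)."""
    def __init__(self, num_classes=NUM_CLASSES):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(num_classes, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 3, kernel_size=3, padding=1), nn.Tanh(),
        )

    def forward(self, semantic_map):
        return self.net(semantic_map)

# Usage: a random label map stands in for one rendered from untextured 3D scenery.
labels = torch.randint(0, NUM_CLASSES, (1, 128, 128))
one_hot = nn.functional.one_hot(labels, NUM_CLASSES).permute(0, 3, 1, 2).float()
rgb = SemanticToRGB()(one_hot)  # (1, 3, 128, 128) image in [-1, 1]
```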
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.