GAN-Control: Explicitly Controllable GANs
- URL: http://arxiv.org/abs/2101.02477v1
- Date: Thu, 7 Jan 2021 10:54:17 GMT
- Title: GAN-Control: Explicitly Controllable GANs
- Authors: Alon Shoshan, Nadav Bhonker, Igor Kviatkovsky, Gerard Medioni
- Abstract summary: We present a framework for training GANs with explicit control over generated images.
We are able to control the generated image by setting exact attributes such as age, pose, expression, etc.
- Score: 4.014524824655106
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We present a framework for training GANs with explicit control over generated
images. We are able to control the generated image by setting exact attributes
such as age, pose, expression, etc. Most approaches for editing GAN-generated
images achieve partial control by leveraging the latent space disentanglement
properties, obtained implicitly after standard GAN training. Such methods are
able to change the relative intensity of certain attributes, but not explicitly
set their values. Recently proposed methods, designed for explicit control over
human faces, harness morphable 3D face models to allow fine-grained control
capabilities in GANs. Unlike these methods, our control is not constrained to
morphable 3D face model parameters and is extendable beyond the domain of human
faces. Using contrastive learning, we obtain GANs with an explicitly
disentangled latent space. This disentanglement is utilized to train
control-encoders mapping human-interpretable inputs to suitable latent vectors,
thus allowing explicit control. In the domain of human faces we demonstrate
control over identity, age, pose, expression, hair color and illumination. We
also demonstrate control capabilities of our framework in the domains of
painted portraits and dog image generation. We demonstrate that our approach
achieves state-of-the-art performance both qualitatively and quantitatively.
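The abstract's two-stage idea can be sketched minimally: partition the latent vector into per-attribute sub-spaces, train with a contrastive loss that pulls together latents sharing an attribute and pushes apart those that differ, then fit a control-encoder mapping a human-interpretable value into its sub-space. The sub-space names, sizes, and loss form below are illustrative assumptions, not the paper's actual configuration.

```python
import numpy as np

# Hypothetical sub-space layout; names and dimensions are assumptions
# for illustration, not the paper's real settings.
SUBSPACES = {"identity": 8, "age": 4, "pose": 4}

def split_latent(w):
    """Split one latent vector into named, disjoint sub-vectors,
    one per controlled attribute."""
    parts, start = {}, 0
    for name, size in SUBSPACES.items():
        parts[name] = w[start:start + size]
        start += size
    return parts

def contrastive_loss(za, zb, same, margin=1.0):
    """Pair loss on two sub-latents: pull them together when the
    controlled attribute matches (same=True), otherwise push them
    at least `margin` apart."""
    d = np.linalg.norm(za - zb)
    return d ** 2 if same else max(0.0, margin - d) ** 2

def control_encoder(value, W, b):
    """Toy linear control-encoder: maps a human-interpretable scalar
    (e.g. age in years) to a sub-latent vector."""
    return W @ np.atleast_1d(np.asarray(value, dtype=float)) + b
```

In this reading, disentanglement comes from applying the contrastive loss independently per sub-space during GAN training; the control-encoders are fit afterwards, which is why the same recipe extends beyond faces to portraits and dogs.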
Related papers
- PerLDiff: Controllable Street View Synthesis Using Perspective-Layout Diffusion Models [55.080748327139176]
We introduce PerLDiff, a method for effective street view image generation that fully leverages perspective 3D geometric information.
Our results show that PerLDiff markedly enhances the precision of generation on the NuScenes and KITTI datasets.
arXiv Detail & Related papers (2024-07-08T16:46:47Z)
- XAGen: 3D Expressive Human Avatars Generation [76.69560679209171]
XAGen is the first 3D generative model for human avatars capable of expressive control over body, face, and hands.
We propose a multi-part rendering technique that disentangles the synthesis of body, face, and hands.
Experiments show that XAGen surpasses state-of-the-art methods in terms of realism, diversity, and expressive control abilities.
arXiv Detail & Related papers (2023-11-22T18:30:42Z)
- Spatial Steerability of GANs via Self-Supervision from Discriminator [123.27117057804732]
We propose a self-supervised approach to improve the spatial steerability of GANs without searching for steerable directions in the latent space.
Specifically, we design randomly sampled Gaussian heatmaps to be encoded into the intermediate layers of generative models as spatial inductive bias.
During inference, users can interact with the spatial heatmaps in an intuitive manner, enabling them to edit the output image by adjusting the scene layout, moving, or removing objects.
arXiv Detail & Related papers (2023-01-20T07:36:29Z)
- Text and Image Guided 3D Avatar Generation and Manipulation [0.0]
We propose a novel 3D manipulation method that can manipulate both the shape and texture of the model using text- or image-based prompts, such as 'a young face' or 'a surprised face'.
Our method requires only 5 minutes per manipulation, and we demonstrate the effectiveness of our approach with extensive results and comparisons.
arXiv Detail & Related papers (2022-02-12T14:37:29Z)
- MOST-GAN: 3D Morphable StyleGAN for Disentangled Face Image Manipulation [69.35523133292389]
We propose a framework that a priori models physical attributes of the face explicitly, thus providing disentanglement by design.
Our method, MOST-GAN, integrates the expressive power and photorealism of style-based GANs with the physical disentanglement and flexibility of nonlinear 3D morphable models.
It achieves photorealistic manipulation of portrait images with fully disentangled 3D control over their physical attributes, enabling extreme manipulation of lighting, facial expression, and pose variations up to full profile view.
arXiv Detail & Related papers (2021-11-01T15:53:36Z)
- GIF: Generative Interpretable Faces [16.573491296543196]
3D face modeling methods provide parametric control but generate unrealistic images.
Recent methods gain partial control, either by attempting to disentangle different factors in an unsupervised manner, or by adding control post hoc to a pre-trained model.
We condition our generative model on pre-defined control parameters to encourage disentanglement in the generation process.
arXiv Detail & Related papers (2020-08-31T23:40:26Z)
- Towards a Neural Graphics Pipeline for Controllable Image Generation [96.11791992084551]
We present Neural Graphics Pipeline (NGP), a hybrid generative model that brings together neural and traditional image formation models.
NGP decomposes the image into a set of interpretable appearance feature maps, uncovering direct control handles for controllable image generation.
We demonstrate the effectiveness of our approach on controllable image generation of single-object scenes.
arXiv Detail & Related papers (2020-06-18T14:22:54Z)
- InterFaceGAN: Interpreting the Disentangled Face Representation Learned by GANs [73.27299786083424]
We propose a framework called InterFaceGAN to interpret the disentangled face representation learned by state-of-the-art GAN models.
We first find that GANs learn various semantics in some linear subspaces of the latent space.
We then conduct a detailed study on the correlation between different semantics and manage to better disentangle them via subspace projection.
arXiv Detail & Related papers (2020-05-18T18:01:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the accuracy of the information presented and is not responsible for any consequences arising from its use.