Controlling the Output of a Generative Model by Latent Feature Vector Shifting
- URL: http://arxiv.org/abs/2311.08850v2
- Date: Mon, 26 Feb 2024 19:34:51 GMT
- Title: Controlling the Output of a Generative Model by Latent Feature Vector Shifting
- Authors: Róbert Belanec, Peter Lacko, Kristína Malinovská
- Abstract summary: We present a novel method for latent vector shifting that enables controlled modification of the output image.
In our approach we use a pre-trained StyleGAN3 model that generates images of realistic human faces.
Our latent feature shifter is a neural network trained to shift the latent vectors of a generative model in a specified feature direction.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: State-of-the-art generative models (e.g. StyleGAN3 (Karras et al., 2021)) often generate photorealistic images based on vectors sampled from their latent space. However, the ability to control the output is limited. Here we present a novel method for latent vector shifting that enables controlled modification of the output image using semantic features of the generated images. In our approach we use a pre-trained StyleGAN3 model that generates images of realistic human faces at relatively high resolution. We complement the generative model with a convolutional neural network classifier, namely ResNet34, trained to classify the generated images according to binary facial attributes from the CelebA dataset. Our latent feature shifter is a neural network trained to shift the latent vectors of a generative model in a specified feature direction. We trained the latent feature shifter for multiple facial features and outperformed our baseline method in the number of generated images exhibiting the desired feature. To train the latent feature shifter network, we designed a dataset of pairs of latent vectors with and without a given feature. Based on the evaluation, we conclude that our latent feature shifter approach successfully controls the output of the StyleGAN3 generator.
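To make the pipeline concrete, the following is a minimal, hypothetical PyTorch sketch of the latent-feature-shifter idea described in the abstract; it is not the authors' code. The 512-dimensional latent size, the MLP architecture, the residual parameterization, and the names (LatentFeatureShifter, train_shifter) are assumptions for illustration; the paper's dataset of latent-vector pairs with and without a feature is modeled here as batches of (w_without, w_with) tensors.

```python
# Hypothetical sketch of a latent feature shifter (not the authors' released code).
# Assumptions: 512-dim latents (StyleGAN3's default), an MLP with a residual
# parameterization, and training pairs (w_without, w_with) for one binary feature.
import torch
import torch.nn as nn

LATENT_DIM = 512  # assumed latent size


class LatentFeatureShifter(nn.Module):
    """Maps a latent vector to a shifted latent that should carry the target feature."""

    def __init__(self, latent_dim: int = LATENT_DIM, hidden_dim: int = 1024):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, latent_dim),
        )

    def forward(self, w: torch.Tensor) -> torch.Tensor:
        # Predict a residual shift, so the identity mapping is trivial to learn.
        return w + self.net(w)


def train_shifter(shifter, pairs, epochs=10, lr=1e-4):
    """`pairs` yields batches (w_without, w_with) of latent vectors for one feature."""
    opt = torch.optim.Adam(shifter.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for w_without, w_with in pairs:
            opt.zero_grad()
            loss = loss_fn(shifter(w_without), w_with)
            loss.backward()
            opt.step()
    return shifter
```

At inference time, one would pass a sampled latent through the trained shifter and feed the result to the frozen StyleGAN3 generator; per the abstract, the ResNet34 attribute classifier can then check whether the generated image exhibits the target feature.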
Related papers
- How to Trace Latent Generative Model Generated Images without Artificial Watermark? [88.04880564539836]
Concerns have arisen regarding potential misuse related to images generated by latent generative models.
We propose a latent inversion based method called LatentTracer to trace the generated images of the inspected model.
Our experiments show that our method can distinguish images generated by the inspected model from other images with high accuracy and efficiency.
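The summary above names latent inversion as the core mechanism; the following is a generic, hypothetical sketch of that idea rather than the LatentTracer implementation. It assumes a callable generator mapping a latent tensor to an image tensor, and all names and hyperparameters (invert_and_score, latent_dim, steps) are illustrative.

```python
# Generic latent-inversion sketch (hypothetical; not the LatentTracer implementation).
# Assumption: `generator` is a callable mapping a latent tensor to an image tensor.
import torch


def invert_and_score(generator, image, latent_dim=512, steps=500, lr=0.05):
    """Optimize a latent to reconstruct `image`; a low final error suggests the
    image lies on the inspected generator's output manifold."""
    z = torch.randn(1, latent_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    loss = torch.tensor(float("inf"))
    for _ in range(steps):
        opt.zero_grad()
        loss = torch.mean((generator(z) - image) ** 2)
        loss.backward()
        opt.step()
    return loss.item()  # compare against a threshold to decide provenance
```

The intuition is that images produced by the inspected generator can be reconstructed almost exactly from some latent, so a low final reconstruction error points to that model as the source.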
arXiv Detail & Related papers (2024-05-22T05:33:47Z) - SARGAN: Spatial Attention-based Residuals for Facial Expression Manipulation [1.7056768055368383]
We present a novel method named SARGAN that addresses the limitations of prior facial expression manipulation methods from three perspectives.
We exploit a symmetric encoder-decoder network to attend to facial features at multiple scales.
Our proposed model performs significantly better than state-of-the-art methods.
arXiv Detail & Related papers (2023-03-30T08:15:18Z) - 3D Generative Model Latent Disentanglement via Local Eigenprojection [13.713373496487012]
We introduce a novel loss function grounded in spectral geometry for different neural-network-based generative models of 3D head and body meshes.
Experimental results show that our local eigenprojection disentangled (LED) models offer improved disentanglement with respect to the state-of-the-art.
arXiv Detail & Related papers (2023-02-24T18:19:49Z) - 3DShape2VecSet: A 3D Shape Representation for Neural Fields and Generative Diffusion Models [42.928400751670935]
We introduce 3DShape2VecSet, a novel shape representation for neural fields designed for generative diffusion models.
Our results show improved performance in 3D shape encoding and 3D shape generative modeling tasks.
arXiv Detail & Related papers (2023-01-26T22:23:03Z) - Generative Visual Prompt: Unifying Distributional Control of Pre-Trained Generative Models [77.47505141269035]
Generative Visual Prompt (PromptGen) is a framework for distributional control over pre-trained generative models.
PromptGen approximates an energy-based model (EBM) and samples images in a feed-forward manner.
Code is available at https://github.com/ChenWu98/Generative-Visual-Prompt.
arXiv Detail & Related papers (2022-09-14T22:55:18Z) - Training and Tuning Generative Neural Radiance Fields for Attribute-Conditional 3D-Aware Face Generation [66.21121745446345]
We propose a conditional GNeRF model that integrates specific attribute labels as input, thus amplifying the controllability and disentanglement capabilities of 3D-aware generative models.
Our approach builds upon a pre-trained 3D-aware face model, and we introduce a Training as Init and fidelity for Tuning (TRIOT) method to train a conditional normalizing flow module.
Our experiments substantiate the efficacy of our model, showcasing its ability to generate high-quality edits with enhanced view consistency.
arXiv Detail & Related papers (2022-08-26T10:05:39Z) - Free-HeadGAN: Neural Talking Head Synthesis with Explicit Gaze Control [54.079327030892244]
Free-HeadGAN is a person-generic neural talking head synthesis system.
We show that modeling faces with sparse 3D facial landmarks is sufficient for achieving state-of-the-art generative performance.
arXiv Detail & Related papers (2022-08-03T16:46:08Z) - NeuralReshaper: Single-image Human-body Retouching with Deep Neural Networks [50.40798258968408]
We present NeuralReshaper, a novel method for semantic reshaping of human bodies in single images using deep generative networks.
Our approach follows a fit-then-reshape pipeline, which first fits a parametric 3D human model to a source human image.
To deal with the lack of paired training data, we introduce a novel self-supervised strategy to train our network.
arXiv Detail & Related papers (2022-03-20T09:02:13Z) - Towards a Neural Graphics Pipeline for Controllable Image Generation [96.11791992084551]
We present Neural Graphics Pipeline (NGP), a hybrid generative model that brings together neural and traditional image formation models.
NGP decomposes the image into a set of interpretable appearance feature maps, uncovering direct control handles for controllable image generation.
We demonstrate the effectiveness of our approach on controllable image generation of single-object scenes.
arXiv Detail & Related papers (2020-06-18T14:22:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.