Pose Manipulation with Identity Preservation
- URL: http://arxiv.org/abs/2004.09169v1
- Date: Mon, 20 Apr 2020 09:51:31 GMT
- Title: Pose Manipulation with Identity Preservation
- Authors: Andrei-Timotei Ardelean, Lucian Mircea Sasu
- Abstract summary: We introduce Character Adaptive Identity Normalization GAN (CainGAN) which uses spatial characteristic features extracted by an embedder and combined across source images.
CainGAN receives face images of a given individual and produces new ones while preserving the person's identity.
Experimental results show that the quality of generated images scales with the size of the input set used during inference.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper describes a new model which generates images in novel poses, e.g. by altering facial expression and orientation, from just a few instances of a
human subject. Unlike previous approaches which require large datasets of a
specific person for training, our approach may start from a scarce set of
images, even from a single image. To this end, we introduce Character Adaptive
Identity Normalization GAN (CainGAN) which uses spatial characteristic features
extracted by an embedder and combined across source images. The identity
information is propagated throughout the network by applying conditional
normalization. After extensive adversarial training, CainGAN receives face images of a given individual and produces new ones while preserving the
person's identity. Experimental results show that the quality of generated
images scales with the size of the input set used during inference.
Furthermore, quantitative measurements indicate that CainGAN performs better
compared to other methods when training data is limited.
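To make the conditional-normalization idea concrete, below is a minimal PyTorch sketch. It illustrates the general mechanism only; the module names, tensor shapes, and the simple mean over source embeddings are illustrative assumptions, not CainGAN's actual architecture.

```python
# Minimal sketch of identity-conditioned normalization (assumptions, not CainGAN's code):
# an embedder summarizes K reference images into one identity vector, and that
# vector predicts per-channel scale/shift applied after instance normalization.
import torch
import torch.nn as nn

class IdentityConditionalNorm(nn.Module):
    """Instance norm whose affine parameters come from an identity embedding."""
    def __init__(self, num_channels: int, embed_dim: int):
        super().__init__()
        self.norm = nn.InstanceNorm2d(num_channels, affine=False)
        self.to_gamma = nn.Linear(embed_dim, num_channels)
        self.to_beta = nn.Linear(embed_dim, num_channels)

    def forward(self, x: torch.Tensor, identity: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) feature map; identity: (B, embed_dim)
        gamma = self.to_gamma(identity).unsqueeze(-1).unsqueeze(-1)  # (B, C, 1, 1)
        beta = self.to_beta(identity).unsqueeze(-1).unsqueeze(-1)    # (B, C, 1, 1)
        return gamma * self.norm(x) + beta

def combine_source_embeddings(embedder: nn.Module, sources: torch.Tensor) -> torch.Tensor:
    """Embed each source image and combine across the source set by averaging.

    sources: (B, K, 3, H, W) -- K reference images of the same person.
    Averaging is an assumed combination rule for this sketch.
    """
    b, k = sources.shape[:2]
    feats = embedder(sources.flatten(0, 1))   # (B*K, embed_dim)
    return feats.view(b, k, -1).mean(dim=1)   # (B, embed_dim)
```

Under this reading, enlarging the reference set only refines the averaged identity embedding, which is consistent with the reported scaling of output quality with the size of the input set.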
Related papers
- Stellar: Systematic Evaluation of Human-Centric Personalized Text-to-Image Methods
We focus on text-to-image systems that take a single image of an individual as input and ground the generation process on it, together with text describing the desired visual context.
We introduce a standardized dataset (Stellar) that contains personalized prompts coupled with images of individuals; it is an order of magnitude larger than existing relevant datasets and comes with rich semantic ground-truth annotations.
We derive a simple yet efficient personalized text-to-image baseline that does not require test-time fine-tuning for each subject and sets a new SoTA both quantitatively and in human trials.
arXiv Detail & Related papers (2023-12-11T04:47:39Z)
- When StyleGAN Meets Stable Diffusion: a $\mathscr{W}_+$ Adapter for Personalized Image Generation
Text-to-image diffusion models have excelled in producing diverse, high-quality, and photo-realistic images.
We present a novel use of the extended StyleGAN embedding space $\mathcal{W}_+$ to achieve enhanced identity preservation and disentanglement for diffusion models.
Our method adeptly generates personalized text-to-image outputs that are not only compatible with prompt descriptions but also amenable to common StyleGAN editing directions.
arXiv Detail & Related papers (2023-11-29T09:05:14Z)
- CleftGAN: Adapting A Style-Based Generative Adversarial Network To Create Images Depicting Cleft Lip Deformity
We have built a deep learning-based cleft lip generator designed to produce an almost unlimited number of artificial images exhibiting high-fidelity facsimiles of cleft lip.
We undertook a transfer learning protocol testing different versions of StyleGAN-ADA.
Training images depicting a variety of cleft deformities were pre-processed with rotation and scale normalization, color adjustment, and background blurring.
arXiv Detail & Related papers (2023-10-12T01:25:21Z)
- Improving Diversity in Zero-Shot GAN Adaptation with Semantic Variations
Zero-shot GAN adaptation aims to reuse well-trained generators to synthesize images of an unseen target domain.
With only a single representative text feature instead of real images, the synthesized images gradually lose diversity.
We propose a novel method to find semantic variations of the target text in the CLIP space.
arXiv Detail & Related papers (2023-08-21T08:12:28Z)
- Identity Encoder for Personalized Diffusion
We propose an encoder-based approach for personalization.
We learn an identity encoder which can extract an identity representation from a set of reference images of a subject.
We show that our approach consistently outperforms existing fine-tuning-based approaches in both image generation and reconstruction.
arXiv Detail & Related papers (2023-04-14T23:32:24Z)
- Attribute-preserving Face Dataset Anonymization via Latent Code Optimization
We present a task-agnostic anonymization procedure that directly optimizes the images' latent representation in the latent space of a pre-trained GAN.
We demonstrate through a series of experiments that our method is capable of anonymizing the identity of the images while, crucially, better preserving the facial attributes.
arXiv Detail & Related papers (2023-03-20T17:34:05Z)
- Few-shot Image Generation via Masked Discrimination
Few-shot image generation aims to generate images of high quality and great diversity with limited data.
It is difficult for modern GANs to avoid overfitting when trained on only a few images.
This work presents a novel approach to realize few-shot GAN adaptation via masked discrimination.
arXiv Detail & Related papers (2022-10-27T06:02:22Z)
- MorphGAN: One-Shot Face Synthesis GAN for Detecting Recognition Bias
We describe a simulator that applies specific head pose and facial expression adjustments to images of previously unseen people.
We show that augmenting small datasets of faces with new poses and expressions improves recognition performance by up to 9%, depending on the augmentation and data scarcity.
arXiv Detail & Related papers (2020-12-09T18:43:03Z)
- Person image generation with semantic attention network for person re-identification
We propose a novel person pose-guided image generation method, which is called the semantic attention network.
The network consists of several semantic attention blocks, where each block attends to preserve and update the pose code and the clothing textures.
Compared with other methods, our network can better characterize body shape while simultaneously preserving clothing attributes.
arXiv Detail & Related papers (2020-08-18T12:18:51Z)
- Adversarial Semantic Data Augmentation for Human Pose Estimation
We propose Semantic Data Augmentation (SDA), a method that augments images by pasting segmented body parts with varying semantic granularity.
We also propose Adversarial Semantic Data Augmentation (ASDA), which exploits a generative network to dynamically predict a tailored pasting configuration.
State-of-the-art results are achieved on challenging benchmarks; a generic pasting sketch follows this list.
arXiv Detail & Related papers (2020-08-03T07:56:04Z)
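As referenced above, here is a minimal sketch of the general paste-based augmentation idea behind SDA, assuming pre-segmented body-part patches are available as RGBA images. ASDA's adversarially predicted pasting configuration is not reproduced here; placement and scale are simply randomized.

```python
# Generic paste-based semantic augmentation sketch (an assumption-laden
# illustration, not the ASDA implementation): paste a pre-segmented RGBA
# body-part patch onto a training image at a random position and scale.
import random
from PIL import Image

def paste_part(image: Image.Image, part: Image.Image,
               scale_range=(0.5, 1.5)) -> Image.Image:
    """Return a copy of `image` with `part` pasted at a random location."""
    out = image.copy()
    s = random.uniform(*scale_range)
    part = part.resize((max(1, int(part.width * s)),
                        max(1, int(part.height * s))))
    x = random.randint(0, max(0, out.width - part.width))
    y = random.randint(0, max(0, out.height - part.height))
    out.paste(part, (x, y), part)  # the patch's alpha channel acts as the paste mask
    return out
```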