Expanding the Latent Space of StyleGAN for Real Face Editing
- URL: http://arxiv.org/abs/2204.12530v1
- Date: Tue, 26 Apr 2022 18:27:53 GMT
- Title: Expanding the Latent Space of StyleGAN for Real Face Editing
- Authors: Yu Yin, Kamran Ghasedi, HsiangTao Wu, Jiaolong Yang, Xi Tong, Yun Fu
- Abstract summary: A surge of face editing techniques has been proposed to employ the pretrained StyleGAN for semantic manipulation.
To successfully edit a real image, one must first convert the input image into StyleGAN's latent variables.
We present a method to expand the latent space of StyleGAN with additional content features to break down the trade-off between low-distortion and high-editability.
- Score: 4.1715767752637145
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, a surge of face editing techniques has been proposed to employ the
pretrained StyleGAN for semantic manipulation. To successfully edit a real
image, one must first convert the input image into StyleGAN's latent variables.
However, it remains challenging to find latent variables that both preserve the
appearance of the input subject (e.g., identity, lighting, hairstyle) and
enable meaningful manipulations. In this
paper, we present a method to expand the latent space of StyleGAN with
additional content features to break down the trade-off between low-distortion
and high-editability. Specifically, we propose a two-branch model, where the
style branch first tackles the entanglement issue by the sparse manipulation of
latent codes, and the content branch then mitigates the distortion issue by
leveraging the content and appearance details from the input image. We confirm
the effectiveness of our method using extensive qualitative and quantitative
experiments on real face editing and reconstruction tasks.
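The two-branch idea above can be illustrated with a minimal sketch. All names, shapes, and the top-k sparsity rule here are illustrative assumptions, not the authors' actual architecture: a style branch that edits only a sparse subset of latent dimensions (limiting entanglement), and a content branch that appends content features from the input image to the latent code (limiting distortion).

```python
import numpy as np

def style_branch_edit(w, direction, strength, k=8):
    """Sparse manipulation: move w only along the k latent dimensions
    where the edit direction is strongest, to limit entanglement."""
    idx = np.argsort(np.abs(direction))[-k:]       # top-k dimensions
    sparse_dir = np.zeros_like(direction)
    sparse_dir[idx] = direction[idx]
    return w + strength * sparse_dir

def content_branch(w_edited, content_features, alpha=0.5):
    """Expand the latent code with content/appearance features taken
    from the input image, to reduce reconstruction distortion."""
    return np.concatenate([w_edited, alpha * content_features])

rng = np.random.default_rng(0)
w = rng.normal(size=512)              # StyleGAN-style latent code
direction = rng.normal(size=512)      # a semantic edit direction
feats = rng.normal(size=256)          # hypothetical encoder features

w_edit = style_branch_edit(w, direction, strength=2.0)
expanded = content_branch(w_edit, feats)

changed = np.count_nonzero(w_edit != w)
print(changed, expanded.shape)        # only 8 dims change; shape (768,)
```

The point of the sketch is the shape of the trade-off: the sparse edit touches few latent coordinates, while the concatenated content features carry image detail that a pure latent code would lose.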
Related papers
- Warping the Residuals for Image Editing with StyleGAN [5.733811543584874]
StyleGAN models show editing capabilities via their semantically interpretable latent organizations.
Many works have been proposed for inverting images into StyleGAN's latent space.
We present a novel image inversion architecture that extracts high-rate latent features and includes a flow estimation module.
arXiv Detail & Related papers (2023-12-18T18:24:18Z)
- When StyleGAN Meets Stable Diffusion: a $\mathscr{W}_+$ Adapter for Personalized Image Generation [60.305112612629465]
Text-to-image diffusion models have excelled in producing diverse, high-quality, and photo-realistic images.
We present a novel use of the extended StyleGAN embedding space $\mathcal{W}_+$ to achieve enhanced identity preservation and disentanglement for diffusion models.
Our method adeptly generates personalized text-to-image outputs that are not only compatible with prompt descriptions but also amenable to common StyleGAN editing directions.
arXiv Detail & Related papers (2023-11-29T09:05:14Z)
- StyleGANEX: StyleGAN-Based Manipulation Beyond Cropped Aligned Faces [103.54337984566877]
We use dilated convolutions to rescale the receptive fields of shallow layers in StyleGAN without altering any model parameters.
This allows fixed-size small features at shallow layers to be extended into larger ones that can accommodate variable resolutions.
We validate our method using unaligned face inputs of various resolutions in a diverse set of face manipulation tasks.
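The dilated-convolution idea in the StyleGANEX entry above can be made concrete with a small receptive-field calculation. This is a generic sketch of why dilation enlarges shallow-layer receptive fields without adding parameters, not code from that paper; the layer configuration is a made-up example.

```python
def receptive_field(layers):
    """Receptive field of a conv stack given (kernel, stride, dilation)
    per layer, using rf += (k - 1) * dilation * jump; jump *= stride."""
    rf, jump = 1, 1
    for k, s, d in layers:
        rf += (k - 1) * d * jump
        jump *= s
    return rf

plain   = receptive_field([(3, 1, 1)] * 4)   # four standard 3x3 convs
dilated = receptive_field([(3, 1, 2)] * 4)   # same convs, dilation 2

print(plain, dilated)   # 9 17
```

Doubling the dilation roughly doubles the receptive field of the stack while the kernel weights, and hence the parameter count, stay identical.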
arXiv Detail & Related papers (2023-03-10T18:59:33Z)
- Towards Counterfactual Image Manipulation via CLIP [106.94502632502194]
Existing methods can achieve realistic editing of different visual attributes such as age and gender of facial images.
We investigate this problem in a text-driven manner using Contrastive Language-Image Pretraining (CLIP).
We design a novel contrastive loss that exploits predefined CLIP-space directions to guide the editing toward desired directions from different perspectives.
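A contrastive loss over CLIP-space directions, as described in the entry above, can be sketched in miniature. This is an InfoNCE-style stand-in with toy 3-d vectors, not the paper's actual loss; the function names and the temperature value are assumptions for illustration.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def contrastive_direction_loss(img_dir, target_dir, negative_dirs, tau=0.1):
    """Align the image edit direction with the target text direction in
    embedding space, contrasted against a set of negative directions."""
    pos = np.exp(cosine(img_dir, target_dir) / tau)
    negs = sum(np.exp(cosine(img_dir, n) / tau) for n in negative_dirs)
    return -np.log(pos / (pos + negs))

target = np.array([1.0, 0.0, 0.0])                     # desired edit direction
negs = [np.array([0.0, 1.0, 0.0]),
        np.array([0.0, 0.0, 1.0])]                     # competing directions

aligned = contrastive_direction_loss(np.array([0.9, 0.1, 0.0]), target, negs)
off     = contrastive_direction_loss(np.array([0.1, 0.9, 0.0]), target, negs)
print(aligned < off)   # True: aligned edits are penalized less
```

The contrastive framing is what gives the "from different perspectives" guidance: an edit is rewarded for matching the target direction and simultaneously penalized for drifting toward the predefined negatives.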
arXiv Detail & Related papers (2022-07-06T17:02:25Z)
- Overparameterization Improves StyleGAN Inversion [66.8300251627992]
Existing inversion approaches obtain promising yet imperfect results.
We show that this allows us to obtain near-perfect image reconstruction without the need for encoders.
Our approach also retains editability, which we demonstrate by realistically interpolating between images.
arXiv Detail & Related papers (2022-05-12T18:42:43Z)
- Pivotal Tuning for Latent-based Editing of Real Images [40.22151052441958]
A surge of advanced facial editing techniques has been proposed that leverage the generative power of a pre-trained StyleGAN.
To successfully edit an image this way, one must first project (or invert) the image into the pre-trained generator's domain.
This means it is still challenging to apply ID-preserving facial latent-space editing to faces which are out of the generator's domain.
arXiv Detail & Related papers (2021-06-10T13:47:59Z)
- Designing an Encoder for StyleGAN Image Manipulation [38.909059126878354]
We study the latent space of StyleGAN, the state-of-the-art unconditional generator.
We identify and analyze the existence of a distortion-editability tradeoff and a distortion-perception tradeoff within the StyleGAN latent space.
We present an encoder based on our two principles that is specifically designed for facilitating editing on real images.
arXiv Detail & Related papers (2021-02-04T17:52:38Z)
- Enjoy Your Editing: Controllable GANs for Image Editing via Latent Space Navigation [136.53288628437355]
Controllable semantic image editing enables a user to change entire image attributes with a few clicks.
Current approaches often suffer from attribute edits that are entangled, global image identity changes, and diminished photo-realism.
We propose quantitative evaluation strategies for measuring controllable editing performance, unlike prior work which primarily focuses on qualitative evaluation.
arXiv Detail & Related papers (2021-02-01T21:38:36Z)
- PIE: Portrait Image Embedding for Semantic Control [82.69061225574774]
We present the first approach for embedding real portrait images in the latent space of StyleGAN.
We use StyleRig, a pretrained neural network that maps the control space of a 3D morphable face model to the latent space of the GAN.
An identity preservation energy term allows spatially coherent edits while maintaining facial integrity.
arXiv Detail & Related papers (2020-09-20T17:53:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.