Enjoy Your Editing: Controllable GANs for Image Editing via Latent Space Navigation
- URL: http://arxiv.org/abs/2102.01187v2
- Date: Wed, 3 Feb 2021 07:21:18 GMT
- Title: Enjoy Your Editing: Controllable GANs for Image Editing via Latent Space Navigation
- Authors: Peiye Zhuang, Oluwasanmi Koyejo, Alexander G. Schwing
- Abstract summary: Controllable semantic image editing enables a user to change entire image attributes with a few clicks.
Current approaches often suffer from attribute edits that are entangled, global image identity changes, and diminished photo-realism.
We propose quantitative evaluation strategies for measuring controllable editing performance, unlike prior work which primarily focuses on qualitative evaluation.
- Score: 136.53288628437355
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Controllable semantic image editing enables a user to change entire image attributes with a few clicks, e.g., gradually making a summer scene look like it was taken in winter. Classic approaches for this task use a Generative Adversarial Net (GAN) to learn a latent space and suitable latent-space transformations. However, current approaches often suffer from attribute edits that are entangled, global image identity changes, and diminished photo-realism. To address these concerns, we learn multiple attribute transformations simultaneously, integrate attribute regression into the training of the transformation functions, and apply a content loss and an adversarial loss that encourage the preservation of image identity and photo-realism. Unlike prior work, which primarily focuses on qualitative evaluation, we propose quantitative evaluation strategies for measuring controllable editing performance. Our model permits better control for both single- and multiple-attribute editing, while also preserving image identity and realism during transformation. We provide empirical results for both real and synthetic images, highlighting that our model achieves state-of-the-art performance for targeted image manipulation.
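As a rough illustration of the training objective described in the abstract (not the authors' released code), the PyTorch-style sketch below shows how one might jointly learn latent-space attribute directions with an attribute-regression loss, a content loss, and an adversarial loss. All network names (`G`, `D`, `R`, `feat`) are hypothetical placeholders for a pretrained generator, discriminator, attribute regressor, and feature extractor.

```python
import torch
import torch.nn.functional as F

def editing_losses(G, D, R, feat, directions, z, eps):
    """Losses for one training step of learnable attribute directions.

    G: pretrained generator, latent -> image          (placeholder)
    D: discriminator scoring realism                  (placeholder)
    R: attribute regressor, image -> (batch, k)       (placeholder)
    feat: feature extractor for the content loss      (placeholder)
    directions: learnable (k, latent_dim) tensor, one row per attribute
    z: (batch, latent_dim) latent codes
    eps: (batch, k) sampled edit strengths, e.g. uniform in [-1, 1]
    """
    z_edit = z + eps @ directions  # walk in the latent space
    x, x_edit = G(z), G(z_edit)

    # Attribute regression: predicted attributes should shift by exactly eps.
    attr_loss = F.mse_loss(R(x_edit), R(x).detach() + eps)

    # Content loss: the edited image keeps the identity of the original.
    content_loss = F.mse_loss(feat(x_edit), feat(x).detach())

    # Non-saturating adversarial loss: edits stay photo-realistic.
    adv_loss = F.softplus(-D(x_edit)).mean()

    return attr_loss + content_loss + adv_loss
```

One appeal of learning all k directions in a single matrix, as sketched here, is that the regression target R(x) + eps asks untouched attributes (those with eps_j = 0) to stay fixed, which directly discourages entangled edits.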
Related papers
- A Compact and Semantic Latent Space for Disentangled and Controllable Image Editing [4.8201607588546]
We propose an auto-encoder which re-organizes the latent space of StyleGAN, so that each attribute which we wish to edit corresponds to an axis of the new latent space.
We show that our approach has greater disentanglement than competing methods, while maintaining fidelity to the original image with respect to identity.
arXiv Detail & Related papers (2023-12-13T16:18:45Z)
- VecGAN: Image-to-Image Translation with Interpretable Latent Directions [4.7590051176368915]
VecGAN is an image-to-image translation framework for facial attribute editing with interpretable latent directions.
VecGAN achieves significant improvements over the state of the art for both local and global edits.
arXiv Detail & Related papers (2022-07-07T16:31:05Z)
- End-to-End Visual Editing with a Generatively Pre-Trained Artist [78.5922562526874]
We consider the targeted image editing problem: blending a region in a source image with a driver image that specifies the desired change.
We propose a self-supervised approach that simulates edits by augmenting off-the-shelf images in a target domain.
We show that different blending effects can be learned by an intuitive control of the augmentation process, with no other changes required to the model architecture.
arXiv Detail & Related papers (2022-05-03T17:59:30Z)
- Expanding the Latent Space of StyleGAN for Real Face Editing [4.1715767752637145]
A surge of face editing techniques has been proposed that employ a pretrained StyleGAN for semantic manipulation.
To successfully edit a real image, one must first invert the input image into StyleGAN's latent variables (a minimal inversion sketch appears after this list).
We present a method to expand the latent space of StyleGAN with additional content features to break down the trade-off between low distortion and high editability.
arXiv Detail & Related papers (2022-04-26T18:27:53Z)
- One-shot domain adaptation for semantic face editing of real world images using StyleALAE [7.541747299649292]
StyleALAE is a latent-space-based autoencoder that can generate photo-realistic images of high quality.
Our work ensures that the identity of the reconstructed image is the same as that of the given input image.
We further generate semantic modifications of the reconstructed image by using the latent space of the pre-trained StyleALAE model.
arXiv Detail & Related papers (2021-08-31T14:32:18Z)
- PIE: Portrait Image Embedding for Semantic Control [82.69061225574774]
We present the first approach for embedding real portrait images in the latent space of StyleGAN.
We use StyleRig, a pretrained neural network that maps the control space of a 3D morphable face model to the latent space of the GAN.
An identity preservation energy term allows spatially coherent edits while maintaining facial integrity.
arXiv Detail & Related papers (2020-09-20T17:53:51Z)
- Look here! A parametric learning based approach to redirect visual attention [49.609412873346386]
We introduce an automatic method to make an image region more attention-capturing via subtle image edits.
Our model predicts a distinct set of global parametric transformations to be applied to the foreground and background image regions.
Our edits enable inference at interactive rates on any image size, and easily generalize to videos.
arXiv Detail & Related papers (2020-08-12T16:08:36Z)
- Semantic Photo Manipulation with a Generative Image Prior [86.01714863596347]
GANs are able to synthesize images conditioned on inputs such as a user sketch, text, or semantic labels.
However, it is hard for GANs to precisely reproduce a given input image.
In this paper, we address these issues by adapting the image prior learned by GANs to image statistics of an individual image.
Our method can accurately reconstruct the input image and synthesize new content, consistent with the appearance of the input image.
arXiv Detail & Related papers (2020-05-15T18:22:05Z)
- Fine-grained Image-to-Image Transformation towards Visual Recognition [102.51124181873101]
We aim to transform an image of a fine-grained category in order to synthesize new images that preserve the identity of the input image.
We adopt a model based on generative adversarial networks to disentangle the identity related and unrelated factors of an image.
Experiments on the CompCars and Multi-PIE datasets demonstrate that our model preserves the identity of the generated images much better than the state-of-the-art image-to-image transformation models.
arXiv Detail & Related papers (2020-01-12T05:26:47Z)
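Several of the papers above ("Expanding the Latent Space of StyleGAN", "One-shot domain adaptation", "PIE") rely on first embedding a real image into a pretrained generator's latent space before editing. As promised in the "Expanding the Latent Space" entry, here is a minimal optimization-based inversion sketch; `G` (with a hypothetical `latent_dim` attribute) and the perceptual extractor `feat` are assumed to be pretrained, frozen placeholders, not any specific paper's implementation.

```python
import torch
import torch.nn.functional as F

def invert(G, feat, target, steps=500, lr=0.05):
    """Optimize a latent code w so that G(w) reconstructs `target` (1, C, H, W)."""
    w = torch.zeros(1, G.latent_dim, requires_grad=True)  # start from a neutral latent
    opt = torch.optim.Adam([w], lr=lr)  # only w is optimized; G and feat stay frozen
    for _ in range(steps):
        opt.zero_grad()
        x = G(w)
        # Pixel loss keeps colors and layout; perceptual loss keeps identity.
        loss = F.mse_loss(x, target) + F.mse_loss(feat(x), feat(target))
        loss.backward()
        opt.step()
    return w.detach()  # the recovered code can then be moved along attribute directions
```

The trade-off these papers discuss follows directly from this setup: codes pushed far outside the generator's training distribution reconstruct the input more faithfully (low distortion) but respond worse to latent edits (low editability), and vice versa.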
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.