Semantic and Geometric Unfolding of StyleGAN Latent Space
- URL: http://arxiv.org/abs/2107.04481v1
- Date: Fri, 9 Jul 2021 15:12:55 GMT
- Title: Semantic and Geometric Unfolding of StyleGAN Latent Space
- Authors: Mustafa Shukor, Xu Yao, Bharath Bhushan Damodaran, Pierre Hellier
- Abstract summary: Generative adversarial networks (GANs) have proven to be surprisingly efficient for image editing.
In this paper, we identify two geometric limitations of such latent space.
We propose a new method to learn a proxy latent representation using normalizing flows to remedy these limitations.
- Score: 2.7910505923792646
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative adversarial networks (GANs) have proven to be surprisingly
efficient for image editing by inverting and manipulating the latent code
corresponding to a natural image. This property emerges from the disentangled
nature of the latent space. In this paper, we identify two geometric
limitations of such latent space: (a) Euclidean distances differ from image
perceptual distance, and (b) disentanglement is not optimal and facial
attribute separation using a linear model is a limiting hypothesis. We thus
propose a new method to learn a proxy latent representation using normalizing
flows to remedy these limitations, and show that this leads to a more efficient
space for face image editing.
Related papers
- IMPUS: Image Morphing with Perceptually-Uniform Sampling Using Diffusion Models [24.382275473592046]
We present a diffusion-based image morphing approach with perceptually-uniform sampling (IMPUS).
IMPUS produces smooth, direct and realistic adaptations given an image pair.
arXiv Detail & Related papers (2023-11-12T10:03:32Z)
- Parallax-Tolerant Unsupervised Deep Image Stitching [57.76737888499145]
We propose UDIS++, a parallax-tolerant unsupervised deep image stitching technique.
First, we propose a robust and flexible warp to model the image registration from global homography to local thin-plate spline motion.
To further eliminate the parallax artifacts, we propose to composite the stitched image seamlessly by unsupervised learning for seam-driven composition masks.
arXiv Detail & Related papers (2023-02-16T10:40:55Z)
- Semantic Unfolding of StyleGAN Latent Space [0.7646713951724012]
Generative adversarial networks (GANs) have proven to be surprisingly efficient for image editing by inverting and manipulating the latent code corresponding to an input real image.
This editing property emerges from the disentangled nature of the latent space.
In this paper, we identify that the facial attribute disentanglement is not optimal, thus facial editing relying on linear attribute separation is flawed.
arXiv Detail & Related papers (2022-06-29T20:22:10Z)
- Maximum Spatial Perturbation Consistency for Unpaired Image-to-Image Translation [56.44946660061753]
This paper proposes a universal regularization technique called maximum spatial perturbation consistency (MSPC).
MSPC enforces a spatial perturbation function (T) and the translation operator (G) to be commutative (i.e., TG = GT).
Our method outperforms the state-of-the-art methods on most I2I benchmarks.
arXiv Detail & Related papers (2022-03-23T19:59:04Z)
- Rayleigh EigenDirections (REDs): GAN latent space traversals for multidimensional features [20.11085769303415]
We present a method for finding paths in a deep generative model's latent space.
We can manipulate multidimensional features of an image such as facial identity and pixels within a region.
Our work suggests that a wealth of opportunities lies in the local analysis of the geometry and semantics of latent spaces.
arXiv Detail & Related papers (2022-01-25T16:11:33Z)
- Learning Discriminative Shrinkage Deep Networks for Image Deconvolution [122.79108159874426]
We propose an effective non-blind deconvolution approach by learning discriminative shrinkage functions to implicitly model these terms.
Experimental results show that the proposed method performs favorably against the state-of-the-art ones in terms of efficiency and accuracy.
arXiv Detail & Related papers (2021-11-27T12:12:57Z)
- High Resolution Face Editing with Masked GAN Latent Code Optimization [0.0]
Face editing is a popular research topic in the computer vision community.
Recently proposed methods are based either on training a conditional encoder-decoder Generative Adversarial Network (GAN) in an end-to-end fashion or on defining an operation in the latent space of a pre-trained vanilla GAN generator model.
We propose a GAN embedding optimization procedure with spatial and semantic constraints.
arXiv Detail & Related papers (2021-03-20T08:39:41Z)
- The Geometry of Deep Generative Image Models and its Applications [0.0]
Generative adversarial networks (GANs) have emerged as a powerful unsupervised method to model the statistical patterns of real-world data sets.
These networks are trained to map random inputs in their latent space to new samples representative of the learned data.
The structure of the latent space is hard to intuit due to its high dimensionality and the non-linearity of the generator.
arXiv Detail & Related papers (2021-01-15T07:57:33Z)
- Joint Estimation of Image Representations and their Lie Invariants [57.3768308075675]
Images encode both the state of the world and its content.
The automatic extraction of this information is challenging because of the high-dimensionality and entangled encoding inherent to the image representation.
This article introduces two theoretical approaches aimed at the resolution of these challenges.
arXiv Detail & Related papers (2020-12-05T00:07:41Z)
- Image-to-image Mapping with Many Domains by Sparse Attribute Transfer [71.28847881318013]
Unsupervised image-to-image translation consists of learning a pair of mappings between two domains without known pairwise correspondences between points.
Current convention is to approach this task with cycle-consistent GANs.
We propose an alternate approach that directly restricts the generator to performing a simple sparse transformation in a latent layer.
arXiv Detail & Related papers (2020-06-23T19:52:23Z)
- Image Fine-grained Inpainting [89.17316318927621]
We present a one-stage model that utilizes dense combinations of dilated convolutions to obtain larger and more effective receptive fields.
To better train this efficient generator, in addition to the frequently used VGG feature matching loss, we design a novel self-guided regression loss.
We also employ a discriminator with local and global branches to ensure local-global contents consistency.
arXiv Detail & Related papers (2020-02-07T03:45:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.