The Geometry of Deep Generative Image Models and its Applications
- URL: http://arxiv.org/abs/2101.06006v2
- Date: Thu, 18 Mar 2021 08:24:26 GMT
- Title: The Geometry of Deep Generative Image Models and its Applications
- Authors: Binxu Wang, Carlos R. Ponce
- Abstract summary: Generative adversarial networks (GANs) have emerged as a powerful unsupervised method to model the statistical patterns of real-world data sets.
These networks are trained to map random inputs in their latent space to new samples representative of the learned data.
The structure of the latent space is hard to intuit due to its high dimensionality and the non-linearity of the generator.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Generative adversarial networks (GANs) have emerged as a powerful
unsupervised method to model the statistical patterns of real-world data sets,
such as natural images. These networks are trained to map random inputs in
their latent space to new samples representative of the learned data. However,
the structure of the latent space is hard to intuit due to its high
dimensionality and the non-linearity of the generator, which limits the
usefulness of the models. Understanding the latent space requires a way to
identify input codes for existing real-world images (inversion), and a way to
identify directions with known image transformations (interpretability). Here,
we use a geometric framework to address both issues simultaneously. We develop
an architecture-agnostic method to compute the Riemannian metric of the image
manifold created by GANs. The eigen-decomposition of the metric isolates axes
that account for different levels of image variability. An empirical analysis
of several pretrained GANs shows that image variation around each position is
concentrated along surprisingly few major axes (the space is highly
anisotropic) and the directions that create this large variation are similar at
different positions in the space (the space is homogeneous). We show that many
of the top eigenvectors correspond to interpretable transforms in the image
space, with a substantial part of the eigenspace corresponding to minor transforms
which could be compressed out. This geometric understanding unifies key
previous results related to GAN interpretability. We show that the use of this
metric allows for more efficient optimization in the latent space (e.g. GAN
inversion) and facilitates unsupervised discovery of interpretable axes. Our
results illustrate that defining the geometry of the GAN image manifold can
serve as a general framework for understanding GANs.
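As a concrete illustration of the core computation, the sketch below builds the pullback metric H = JᵀJ of a generator at a latent code z, where J = ∂G/∂z is the generator Jacobian, and eigendecomposes it to expose the major axes of image variability. This is a minimal PyTorch sketch under simplifying assumptions (a small, differentiable toy generator `G` stands in for a pretrained GAN); the paper's method is architecture-agnostic, and this direct-Jacobian version is only practical for small models.

```python
import torch

def metric_eigendecomposition(G, z):
    """Eigendecompose the pullback metric H = J^T J of generator G at
    latent code z, where J = dG/dz maps latent moves to image changes.
    Minimal sketch: explicit Jacobians only scale to small models."""
    flatten_G = lambda w: G(w).reshape(-1)                # image as a vector
    J = torch.autograd.functional.jacobian(flatten_G, z)  # (n_pixels, d)
    H = J.T @ J                                           # metric tensor, (d, d)
    eigvals, eigvecs = torch.linalg.eigh(H)               # ascending order
    return eigvals.flip(0), eigvecs.flip(1)               # descending order

# Hypothetical toy generator standing in for a pretrained GAN.
G = torch.nn.Sequential(torch.nn.Linear(8, 64), torch.nn.Tanh())
z = torch.randn(8)
vals, vecs = metric_eigendecomposition(G, z)
# Anisotropy check: a large top-to-bottom eigenvalue ratio means image
# variation around z is concentrated along a few major axes.
print(vals[0] / vals[-1])
# Moving z along the top eigenvector changes the image far more than
# moving it along the bottom one by the same step size.
img_major = G(z + 0.5 * vecs[:, 0])
img_minor = G(z + 0.5 * vecs[:, -1])
```

The same metric suggests one reading of the efficiency claim: latent-space gradient descent (e.g., for GAN inversion) can be preconditioned by the eigenstructure of H, so that step sizes are scaled to the local sensitivity of the image to each latent direction.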
Related papers
- Neural Isometries: Taming Transformations for Equivariant ML [8.203292895010748]
We introduce Neural Isometries, an autoencoder framework which learns to map the observation space to a general-purpose latent space.
We show that a simple off-the-shelf equivariant network operating in the pre-trained latent space can achieve results on par with meticulously engineered, handcrafted networks.
arXiv Detail & Related papers (2024-05-29T17:24:25Z)
- GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image [94.56927147492738]
We introduce GeoWizard, a new generative foundation model designed for estimating geometric attributes from single images.
We show that leveraging diffusion priors can markedly improve generalization, detail preservation, and efficiency in resource usage.
We propose a simple yet effective strategy to segregate the complex data distribution of various scenes into distinct sub-distributions.
arXiv Detail & Related papers (2024-03-18T17:50:41Z)
- VTAE: Variational Transformer Autoencoder with Manifolds Learning [144.0546653941249]
Deep generative models have demonstrated successful applications in learning non-linear data distributions through a number of latent variables.
The nonlinearity of the generator implies that the latent space yields an unsatisfactory projection of the data space, which results in poor representation learning.
We show that geodesics and accurate computation can substantially improve the performance of deep generative models.
arXiv Detail & Related papers (2023-04-03T13:13:19Z)
- Geometric Scattering on Measure Spaces [12.0756034112778]
We introduce a general, unified model for geometric scattering on measure spaces.
We consider finite measure spaces that are obtained from randomly sampling an unknown manifold.
We propose two methods for constructing a data-driven graph on which the associated graph scattering transform approximates the scattering transform on the underlying manifold.
arXiv Detail & Related papers (2022-08-17T22:40:09Z)
- Leveraging Equivariant Features for Absolute Pose Regression [9.30597356471664]
We show that a translation and rotation equivariant Convolutional Neural Network directly induces representations of camera motions into the feature space.
We then show that this geometric property allows for implicitly augmenting the training data under a whole group of image plane-preserving transformations.
arXiv Detail & Related papers (2022-04-05T12:44:20Z)
- Rayleigh EigenDirections (REDs): GAN latent space traversals for multidimensional features [20.11085769303415]
We present a method for finding paths in a deep generative model's latent space.
We can manipulate multidimensional features of an image such as facial identity and pixels within a region.
Our work suggests that a wealth of opportunities lies in the local analysis of the geometry and semantics of latent spaces.
arXiv Detail & Related papers (2022-01-25T16:11:33Z)
- Low-Rank Subspaces in GANs [101.48350547067628]
This work introduces low-rank subspaces that enable more precise control of GAN generation.
LowRankGAN is able to find a low-dimensional representation of the attribute manifold.
Experiments on state-of-the-art GAN models (including StyleGAN2 and BigGAN) trained on various datasets demonstrate the effectiveness of our LowRankGAN.
arXiv Detail & Related papers (2021-06-08T16:16:32Z)
- Manifold Topology Divergence: a Framework for Comparing Data Manifolds [109.0784952256104]
We develop a framework for comparing data manifolds, aimed at the evaluation of deep generative models.
Based on the Cross-Barcode, we introduce the Manifold Topology Divergence score (MTop-Divergence).
We demonstrate that the MTop-Divergence accurately detects various degrees of mode-dropping, intra-mode collapse, mode invention, and image disturbance.
arXiv Detail & Related papers (2021-06-08T00:30:43Z)
- Diamond in the rough: Improving image realism by traversing the GAN latent space [0.0]
We present an unsupervised method to find a direction in the latent space that aligns with improved photo-realism.
Our approach leaves the network unchanged while enhancing the fidelity of the generated image.
We use a simple generator inversion to find the direction in the latent space that results in the smallest change in the image space.
arXiv Detail & Related papers (2021-04-12T14:45:29Z)
- Unsupervised Discovery of Disentangled Manifolds in GANs [74.24771216154105]
An interpretable generation process is beneficial to various image editing applications.
We propose a framework to discover interpretable directions in the latent space given arbitrary pre-trained generative adversarial networks.
arXiv Detail & Related papers (2020-11-24T02:18:08Z)
- Closed-Form Factorization of Latent Semantics in GANs [65.42778970898534]
A rich set of interpretable dimensions has been shown to emerge in the latent space of Generative Adversarial Networks (GANs) trained for synthesizing images.
In this work, we examine the internal representation learned by GANs to reveal the underlying variation factors in an unsupervised manner.
We propose a closed-form factorization algorithm for latent semantic discovery by directly decomposing the pre-trained weights (a minimal sketch follows below).
arXiv Detail & Related papers (2020-07-13T18:05:36Z)
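A hedged sketch of the closed-form idea above, assuming the generator's first transformation is a plain weight matrix A acting on the latent code: unit-norm directions n that maximize ‖An‖ are the top eigenvectors of AᵀA, so candidate semantic directions can be read off the pretrained weights with no data or sampling. The matrix A below is a random stand-in, not an actual pretrained weight, and this is an illustration of the idea rather than that paper's exact code.

```python
import torch

def closed_form_directions(A, k=5):
    # Directions n maximizing ||A n|| over unit-norm n are the top
    # eigenvectors of A^T A; no training data or sampling is needed.
    eigvals, eigvecs = torch.linalg.eigh(A.T @ A)  # ascending eigenvalues
    return eigvecs[:, -k:].flip(1)                 # top-k, descending

# Random stand-in for a pretrained first-layer weight (out_dim, latent_dim).
A = torch.randn(512, 128)
directions = closed_form_directions(A, k=3)        # (128, 3), one axis per column
```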
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.