Complexity Matters: Rethinking the Latent Space for Generative Modeling
- URL: http://arxiv.org/abs/2307.08283v2
- Date: Sun, 29 Oct 2023 13:13:00 GMT
- Title: Complexity Matters: Rethinking the Latent Space for Generative Modeling
- Authors: Tianyang Hu, Fei Chen, Haonan Wang, Jiawei Li, Wenjia Wang, Jiacheng
Sun, Zhenguo Li
- Abstract summary: In generative modeling, numerous successful approaches leverage a low-dimensional latent space, e.g., Stable Diffusion.
In this study, we aim to shed light on this under-explored topic by rethinking the latent space from the perspective of model complexity.
- Score: 65.64763873078114
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In generative modeling, numerous successful approaches leverage a
low-dimensional latent space, e.g., Stable Diffusion models the latent space
induced by an encoder and generates images through a paired decoder. Although
the selection of the latent space is empirically pivotal, determining the
optimal choice and the process of identifying it remain unclear. In this study,
we aim to shed light on this under-explored topic by rethinking the latent
space from the perspective of model complexity. Our investigation starts with
the classic generative adversarial networks (GANs). Inspired by the GAN
training objective, we propose a novel "distance" between the latent and data
distributions, whose minimization coincides with that of the generator
complexity. The minimizer of this distance is characterized as the optimal
data-dependent latent that most effectively capitalizes on the generator's
capacity. Then, we consider parameterizing such a latent distribution by an
encoder network and propose a two-stage training strategy called Decoupled
Autoencoder (DAE), where the encoder is only updated in the first stage with an
auxiliary decoder and then frozen in the second stage while the actual decoder
is being trained. DAE can improve the latent distribution and as a result,
improve the generative performance. Our theoretical analyses are corroborated
by comprehensive experiments on various models such as VQGAN and Diffusion
Transformer, where our modifications yield significant improvements in sample
quality with decreased model complexity.
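The two-stage DAE schedule described in the abstract can be sketched with linear maps and plain gradient descent. This is a minimal illustration, not the paper's implementation: the layer sizes, learning rate, iteration counts, and toy data are all assumptions made here for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 8))                 # toy data, 8-dimensional
d_latent = 4                                  # assumed latent width

E = 0.1 * rng.normal(size=(8, d_latent))      # encoder
D_aux = 0.1 * rng.normal(size=(d_latent, 8))  # auxiliary decoder (stage 1 only)
D = 0.1 * rng.normal(size=(d_latent, 8))      # actual decoder (stage 2)
lr = 0.05

# Stage 1: update the encoder jointly with the auxiliary decoder.
for _ in range(300):
    Z = X @ E
    G = 2 * (Z @ D_aux - X) / len(X)          # gradient of MSE w.r.t. reconstruction
    grad_E = X.T @ (G @ D_aux.T)
    grad_D_aux = Z.T @ G
    E -= lr * grad_E
    D_aux -= lr * grad_D_aux

# Stage 2: freeze the encoder; train only the actual decoder on the fixed latents.
Z = X @ E
for _ in range(300):
    G = 2 * (Z @ D - X) / len(X)
    D -= lr * Z.T @ G

mse = float(np.mean((Z @ D - X) ** 2))
print(f"stage-2 reconstruction MSE: {mse:.3f}")
```

The point of the decoupling is that the latent distribution (here, the map `E`) is fixed before the actual decoder ever sees it, so stage 2 solves a simpler fitting problem.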
Related papers
- Neural Network Parameter Diffusion [50.85251415173792]
Diffusion models have achieved remarkable success in image and video generation.
In this work, we demonstrate that diffusion models can also generate high-performing neural network parameters.
arXiv Detail & Related papers (2024-02-20T16:59:03Z)
- Refine, Discriminate and Align: Stealing Encoders via Sample-Wise Prototypes and Multi-Relational Extraction [57.16121098944589]
RDA is a pioneering approach designed to address two primary deficiencies prevalent in previous endeavors aiming at stealing pre-trained encoders.
It is accomplished via a sample-wise prototype, which consolidates the target encoder's representations across multiple views of a given sample.
To further improve efficacy, we develop a multi-relational extraction loss that trains the surrogate encoder to discriminate mismatched embedding-prototype pairs.
arXiv Detail & Related papers (2023-12-01T15:03:29Z)
- Variational Diffusion Auto-encoder: Latent Space Extraction from Pre-trained Diffusion Models [0.0]
Variational Auto-Encoders (VAEs) face challenges with the quality of generated images, often presenting noticeable blurriness.
This issue stems from the unrealistic assumption that the conditional data distribution, $p(\textbf{x} \mid \textbf{z})$, is an isotropic Gaussian.
We illustrate how one can extract a latent space from a pre-existing diffusion model by optimizing an encoder to maximize the marginal data log-likelihood.
arXiv Detail & Related papers (2023-04-24T14:44:47Z)
- Conditional Denoising Diffusion for Sequential Recommendation [62.127862728308045]
Two prominent generative models, Generative Adversarial Networks (GANs) and Variational AutoEncoders (VAEs), each have drawbacks: GANs suffer from unstable optimization, while VAEs are prone to posterior collapse and over-smoothed generations.
We present a conditional denoising diffusion model, which includes a sequence encoder, a cross-attentive denoising decoder, and a step-wise diffuser.
arXiv Detail & Related papers (2023-04-22T15:32:59Z)
- VTAE: Variational Transformer Autoencoder with Manifolds Learning [144.0546653941249]
Deep generative models have demonstrated successful applications in learning non-linear data distributions through a number of latent variables.
The nonlinearity of the generator means the latent space provides an unsatisfactory projection of the data space, which results in poor representation learning.
We show that geodesics and accurate computation can substantially improve the performance of deep generative models.
arXiv Detail & Related papers (2023-04-03T13:13:19Z)
- String-based Molecule Generation via Multi-decoder VAE [56.465033997245776]
We investigate the problem of string-based molecular generation via variational autoencoders (VAEs).
We propose a simple, yet effective idea to improve the performance of VAE for the task.
In our experiments, the proposed VAE model performs particularly well at generating samples from out-of-domain distributions.
arXiv Detail & Related papers (2022-08-23T03:56:30Z)
- Diffusion bridges vector quantized Variational AutoEncoders [0.0]
We show that our model is competitive with the autoregressive prior on the mini-Imagenet dataset.
Our framework also extends the standard VQ-VAE and enables end-to-end training.
arXiv Detail & Related papers (2022-02-10T08:38:12Z)
- Generation of data on discontinuous manifolds via continuous stochastic non-invertible networks [6.201770337181472]
We show how to generate discontinuous distributions using continuous networks.
We derive a link between the cost functions and the information-theoretic formulation.
We apply our approach to synthetic 2D distributions to demonstrate both reconstruction and generation of discontinuous distributions.
arXiv Detail & Related papers (2021-12-17T17:39:59Z)
- Latent reweighting, an almost free improvement for GANs [12.605607949417033]
A line of work aims at improving the sampling quality of pre-trained generators at the expense of increased computational cost.
We introduce an additional network to predict latent importance weights and two associated sampling methods to avoid the poorest samples.
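Filtering latents by predicted importance weights can be sketched with rejection sampling. Note that the learned weight network is replaced here by a hypothetical stand-in scoring function; the dimensions and acceptance scheme are illustrative assumptions, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(1)

def importance_weight(z):
    # Hypothetical stand-in for the learned weight predictor:
    # favors latents near the origin (a stub, not the paper's network).
    return np.exp(-0.5 * np.sum(z ** 2, axis=-1))

def rejection_sample(n, d=2, w_max=1.0):
    """Accept latent z with probability w(z)/w_max, discarding the poorest samples."""
    out = []
    while len(out) < n:
        z = rng.normal(size=(4 * n, d))            # candidates from the prior
        w = importance_weight(z)
        keep = rng.uniform(size=len(z)) < w / w_max
        out.extend(z[keep])
    return np.asarray(out[:n])

samples = rejection_sample(100)
print(samples.shape)
```

Because acceptance is proportional to the weight, the retained latents score higher on average than raw draws from the prior, which is the mechanism behind avoiding the poorest samples.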
arXiv Detail & Related papers (2021-10-19T08:33:57Z)
- Variance Constrained Autoencoding [0.0]
We show that for encoders, simultaneously attempting to enforce a distribution constraint and minimising an output distortion leads to a reduction in generative and reconstruction quality.
We propose the variance-constrained autoencoder (VCAE), which only enforces a variance constraint on the latent distribution.
Our experiments show that VCAE improves upon Wasserstein Autoencoder and the Variational Autoencoder in both reconstruction and generative quality on MNIST and CelebA.
arXiv Detail & Related papers (2020-05-08T00:50:50Z)
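The variance-only constraint that distinguishes VCAE from a full distribution match can be sketched as a penalty term. The loss shape, target variance, and penalty weight below are illustrative assumptions, not the paper's exact objective.

```python
import numpy as np

rng = np.random.default_rng(2)

def vcae_loss(x, z, x_hat, target_var=1.0, lam=10.0):
    """Reconstruction error plus a penalty on the latent variance only.

    Unlike a VAE's full distribution constraint (e.g. KL to N(0, I)),
    only the per-dimension variance of z is pushed toward target_var;
    target_var and lam are illustrative choices.
    """
    recon = np.mean((x - x_hat) ** 2)
    var_penalty = np.mean((np.var(z, axis=0) - target_var) ** 2)
    return recon + lam * var_penalty

x = rng.normal(size=(128, 8))
z = 2.0 * rng.normal(size=(128, 4))   # latent variance ~4, violating the constraint
x_hat = np.zeros_like(x)
loss = vcae_loss(x, z, x_hat)
print(f"VCAE loss: {loss:.3f}")
```

A latent batch whose variance already matches the target incurs almost no penalty, so the constraint leaves the rest of the latent distribution free, which is the claimed source of the quality gain over stricter objectives.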
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.