Geometry-Preserving Encoder/Decoder in Latent Generative Models
- URL: http://arxiv.org/abs/2501.09876v1
- Date: Thu, 16 Jan 2025 23:14:34 GMT
- Title: Geometry-Preserving Encoder/Decoder in Latent Generative Models
- Authors: Wonjun Lee, Riley C. W. O'Neill, Dongmian Zou, Jeff Calder, Gilad Lerman
- Abstract summary: We introduce a novel encoder/decoder framework with theoretical properties distinct from those of the VAE.
We demonstrate the significant advantages of this geometry-preserving encoder in the training process of both the encoder and decoder.
- Score: 13.703752179071333
- Abstract: Generative modeling aims to generate new data samples that resemble a given dataset, with diffusion models recently becoming the most popular class of generative model. One of the main challenges of diffusion models is that they solve the problem in the input space, which tends to be very high-dimensional. Recently, solving diffusion models in a latent space, through an encoder that maps from the data space to a lower-dimensional latent space, has been considered to make the training process more efficient and has shown state-of-the-art results. The variational autoencoder (VAE) is the most commonly used encoder/decoder framework in this domain, known for its ability to learn latent representations and generate data samples. In this paper, we introduce a novel encoder/decoder framework with theoretical properties distinct from those of the VAE, specifically designed to preserve the geometric structure of the data distribution. We demonstrate the significant advantages of this geometry-preserving encoder in the training of both the encoder and decoder. Additionally, we provide theoretical results proving convergence of the training process, including convergence guarantees for encoder training and results showing faster convergence of decoder training when the geometry-preserving encoder is used.
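As a rough illustration of the kind of objective a geometry-preserving encoder might be trained with (the abstract does not spell out the paper's exact construction), the sketch below pairs a simple MLP encoder with a pairwise-distance-preservation penalty. The class names, dimensions, and the specific penalty are illustrative assumptions, not the authors' method.

```python
# Illustrative sketch only: the paper's abstract does not specify its exact
# objective, so a generic pairwise-distance-preservation penalty stands in
# for "geometry preservation". All names and dimensions are hypothetical.
import torch
import torch.nn as nn


class Encoder(nn.Module):
    """Simple MLP encoder mapping data to a lower-dimensional latent space."""

    def __init__(self, data_dim: int, latent_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(data_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, latent_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


def geometry_preservation_loss(x: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
    """Penalize discrepancy between pairwise distances in data and latent space.

    If the encoder were an exact isometry on the batch, the two distance
    matrices would coincide and the loss would be zero.
    """
    d_x = torch.cdist(x, x)  # pairwise distances in data space
    d_z = torch.cdist(z, z)  # pairwise distances in latent space
    return ((d_x - d_z) ** 2).mean()


if __name__ == "__main__":
    encoder = Encoder(data_dim=784, latent_dim=32)
    x = torch.randn(64, 784)  # toy batch standing in for real data
    z = encoder(x)
    loss = geometry_preservation_loss(x, z)
    loss.backward()  # gradients flow back to the encoder parameters
    print(f"geometry-preservation loss: {loss.item():.4f}")
```

In a latent-diffusion pipeline, a penalty of this kind would typically be added to the encoder's training objective, and the diffusion model would then be trained on the resulting latent codes.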
Related papers
- Correcting Diffusion-Based Perceptual Image Compression with Privileged End-to-End Decoder [49.01721042973929]
This paper presents a diffusion-based image compression method that employs a privileged end-to-end decoder model as correction.
Experiments demonstrate the superiority of our method in both distortion and perception compared with previous perceptual compression methods.
arXiv Detail & Related papers (2024-04-07T10:57:54Z)
- Unified Generation, Reconstruction, and Representation: Generalized Diffusion with Adaptive Latent Encoding-Decoding [90.77521413857448]
Deep generative models are anchored in three core capabilities -- generating new instances, reconstructing inputs, and learning compact representations.
We introduce Generalized Encoding-Decoding Diffusion Probabilistic Models (EDDPMs).
EDDPMs generalize the Gaussian noising-denoising in standard diffusion by introducing parameterized encoding-decoding.
Experiments on text, proteins, and images demonstrate the flexibility to handle diverse data and tasks.
arXiv Detail & Related papers (2024-02-29T10:08:57Z)
- Learning Nonparametric High-Dimensional Generative Models: The Empirical-Beta-Copula Autoencoder [1.5714999163044752]
It is necessary to model the autoencoder's latent space with a distribution from which samples can be obtained.
This study aims to discuss, assess, and compare various techniques that can be used to capture the latent space.
arXiv Detail & Related papers (2023-09-18T16:29:36Z)
- Complexity Matters: Rethinking the Latent Space for Generative Modeling [65.64763873078114]
In generative modeling, numerous successful approaches leverage a low-dimensional latent space, e.g., Stable Diffusion.
In this study, we aim to shed light on this under-explored topic by rethinking the latent space from the perspective of model complexity.
arXiv Detail & Related papers (2023-07-17T07:12:29Z)
- Hierarchical Attention Encoder Decoder [2.4366811507669115]
Autoregressive modeling can generate complex and novel sequences that have many real-world applications.
These models must generate outputs autoregressively, which becomes time-consuming when dealing with long sequences.
We propose a model based on the Hierarchical Recurrent Decoder architecture.
arXiv Detail & Related papers (2023-06-01T18:17:23Z)
- Benign Autoencoders [0.0]
We formalize the problem of finding the optimal encoder-decoder pair and characterize its solution, which we name the "benign autoencoder" (BAE).
We prove that BAE projects data onto a manifold whose dimension is the optimal compressibility dimension of the generative problem.
As an illustration, we show how BAE can find optimal, low-dimensional latent representations that improve the performance of a discriminator under a distribution shift.
arXiv Detail & Related papers (2022-10-02T21:36:27Z)
- String-based Molecule Generation via Multi-decoder VAE [56.465033997245776]
We investigate the problem of string-based molecular generation via variational autoencoders (VAEs).
We propose a simple, yet effective idea to improve the performance of VAE for the task.
In our experiments, the proposed VAE model performs particularly well at generating samples from out-of-domain distributions.
arXiv Detail & Related papers (2022-08-23T03:56:30Z)
- Toward a Geometrical Understanding of Self-supervised Contrastive Learning [55.83778629498769]
Self-supervised learning (SSL) is one of the premier techniques to create data representations that are actionable for transfer learning in the absence of human annotations.
Mainstream SSL techniques rely on a specific deep neural network architecture with two cascaded neural networks: the encoder and the projector.
In this paper, we investigate how the strength of the data augmentation policies affects the data embedding.
arXiv Detail & Related papers (2022-05-13T23:24:48Z)
- ED2LM: Encoder-Decoder to Language Model for Faster Document Re-ranking Inference [70.36083572306839]
This paper proposes a new training and inference paradigm for re-ranking.
We finetune a pretrained encoder-decoder model on the task of document-to-query generation.
We show that this encoder-decoder architecture can be decomposed into a decoder-only language model during inference.
arXiv Detail & Related papers (2022-04-25T06:26:29Z)
- Encoded Prior Sliced Wasserstein AutoEncoder for learning latent manifold representations [0.7614628596146599]
We introduce an Encoded Prior Sliced Wasserstein AutoEncoder.
An additional prior-encoder network learns an embedding of the data manifold.
We show that, unlike conventional autoencoders, the prior encodes the geometry underlying the data.
arXiv Detail & Related papers (2020-10-02T14:58:54Z)