Rethinking VAE: From Continuous to Discrete Representations Without Probabilistic Assumptions
- URL: http://arxiv.org/abs/2507.17255v1
- Date: Wed, 23 Jul 2025 06:52:00 GMT
- Title: Rethinking VAE: From Continuous to Discrete Representations Without Probabilistic Assumptions
- Authors: Songxuan Shi,
- Abstract summary: We establish connections between Variational Autoencoders (VAEs) and Vector Quantized-Variational Autoencoders (VQ-VAEs) through a reformulated training framework.<n>We propose a new VAE-like training method that introduces clustering centers to enhance data compactness and ensure well-defined latent spaces.<n> Experimental results on MNIST, CelebA, and FashionMNIST datasets show smooth interpolative transitions, though blurriness persists.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper explores the generative capabilities of Autoencoders (AEs) and establishes connections between Variational Autoencoders (VAEs) and Vector Quantized-Variational Autoencoders (VQ-VAEs) through a reformulated training framework. We demonstrate that AEs exhibit generative potential via latent space interpolation and perturbation, albeit limited by undefined regions in the encoding space. To address this, we propose a new VAE-like training method that introduces clustering centers to enhance data compactness and ensure well-defined latent spaces without relying on traditional KL divergence or reparameterization techniques. Experimental results on MNIST, CelebA, and FashionMNIST datasets show smooth interpolative transitions, though blurriness persists. Extending this approach to multiple learnable vectors, we observe a natural progression toward a VQ-VAE-like model in continuous space. However, when the encoder outputs multiple vectors, the model degenerates into a discrete Autoencoder (VQ-AE), which combines image fragments without learning semantic representations. Our findings highlight the critical role of encoding space compactness and dispersion in generative modeling and provide insights into the intrinsic connections between VAEs and VQ-VAEs, offering a new perspective on their design and limitations.
Related papers
- Cross-Layer Discrete Concept Discovery for Interpreting Language Models [13.842670153893977]
Cross-layer VQ-VAE is a framework that uses vector quantization to map representations across layers.<n>Our approach uniquely combines top-k temperature-based sampling during quantization with EMA codebook updates.
arXiv Detail & Related papers (2025-06-24T22:43:36Z) - Interpretable Spectral Variational AutoEncoder (ISVAE) for time series
clustering [48.0650332513417]
We introduce a novel model that incorporates an interpretable bottleneck-termed the Filter Bank (FB)-at the outset of a Variational Autoencoder (VAE)
This arrangement compels the VAE to attend on the most informative segments of the input signal.
By deliberately constraining the VAE with this FB, we promote the development of an encoding that is discernible, separable, and of reduced dimensionality.
arXiv Detail & Related papers (2023-10-18T13:06:05Z) - How to train your VAE [0.0]
Variational Autoencoders (VAEs) have become a cornerstone in generative modeling and representation learning within machine learning.
This paper explores interpreting the Kullback-Leibler (KL) Divergence, a critical component within the Evidence Lower Bound (ELBO)
The proposed method redefines the ELBO with a mixture of Gaussians for the posterior probability, introduces a regularization term, and employs a PatchGAN discriminator to enhance texture realism.
arXiv Detail & Related papers (2023-09-22T19:52:28Z) - Complexity Matters: Rethinking the Latent Space for Generative Modeling [65.64763873078114]
In generative modeling, numerous successful approaches leverage a low-dimensional latent space, e.g., Stable Diffusion.
In this study, we aim to shed light on this under-explored topic by rethinking the latent space from the perspective of model complexity.
arXiv Detail & Related papers (2023-07-17T07:12:29Z) - Vector Quantized Wasserstein Auto-Encoder [57.29764749855623]
We study learning deep discrete representations from the generative viewpoint.
We endow discrete distributions over sequences of codewords and learn a deterministic decoder that transports the distribution over the sequences of codewords to the data distribution.
We develop further theories to connect it with the clustering viewpoint of WS distance, allowing us to have a better and more controllable clustering solution.
arXiv Detail & Related papers (2023-02-12T13:51:36Z) - CCVS: Context-aware Controllable Video Synthesis [95.22008742695772]
presentation introduces a self-supervised learning approach to the synthesis of new video clips from old ones.
It conditions the synthesis process on contextual information for temporal continuity and ancillary information for fine control.
arXiv Detail & Related papers (2021-07-16T17:57:44Z) - Neighbor Embedding Variational Autoencoder [14.08587678497785]
We propose a novel model, neighbor embedding VAE(NE-VAE), which explicitly constraints the encoder to encode inputs close in the input space to be close in the latent space.
In our experiments, NE-VAE can produce qualitatively different latent representations with majority of the latent dimensions remained active.
arXiv Detail & Related papers (2021-03-21T09:49:12Z) - Autoencoding Variational Autoencoder [56.05008520271406]
We study the implications of this behaviour on the learned representations and also the consequences of fixing it by introducing a notion of self consistency.
We show that encoders trained with our self-consistency approach lead to representations that are robust (insensitive) to perturbations in the input introduced by adversarial attacks.
arXiv Detail & Related papers (2020-12-07T14:16:14Z) - Improve Variational Autoencoder for Text Generationwith Discrete Latent
Bottleneck [52.08901549360262]
Variational autoencoders (VAEs) are essential tools in end-to-end representation learning.
VAEs tend to ignore latent variables with a strong auto-regressive decoder.
We propose a principled approach to enforce an implicit latent feature matching in a more compact latent space.
arXiv Detail & Related papers (2020-04-22T14:41:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.