EdVAE: Mitigating Codebook Collapse with Evidential Discrete Variational Autoencoders
- URL: http://arxiv.org/abs/2310.05718v3
- Date: Mon, 15 Jul 2024 09:57:48 GMT
- Title: EdVAE: Mitigating Codebook Collapse with Evidential Discrete Variational Autoencoders
- Authors: Gulcin Baykal, Melih Kandemir, Gozde Unal
- Abstract summary: Codebook collapse is a common problem in training deep generative models with discrete representation spaces.
We propose a novel way to incorporate evidential deep learning (EDL) instead of softmax to combat the codebook collapse problem of dVAE.
- Score: 11.086500036180222
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Codebook collapse is a common problem in training deep generative models with discrete representation spaces like Vector Quantized Variational Autoencoders (VQ-VAEs). We observe that the same problem arises for the alternatively designed discrete variational autoencoders (dVAEs), whose encoder directly learns a distribution over the codebook embeddings to represent the data. We hypothesize that using the softmax function to obtain a probability distribution causes codebook collapse by assigning overconfident probabilities to the best matching codebook elements. In this paper, we propose a novel way to incorporate evidential deep learning (EDL) instead of softmax to combat the codebook collapse problem of dVAEs. In contrast to a softmax head, our evidential head monitors the amount of evidence supporting the probability distribution over the codebook embeddings. Our experiments using various datasets show that our model, called EdVAE, mitigates codebook collapse, improves reconstruction performance, and enhances codebook usage compared to dVAE and VQ-VAE based models. Our code can be found at https://github.com/ituvisionlab/EdVAE.
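As a rough illustration of the core change (a minimal sketch, not the authors' implementation; all names and shapes are our assumptions), a standard evidential head replaces softmax probabilities with the expected probabilities of a Dirichlet distribution whose concentration parameters come from non-negative evidence:

```python
# Minimal sketch of swapping a softmax head for an evidential (Dirichlet)
# head over K codebook entries, in the spirit of evidential deep learning.
# Illustrative only: not the EdVAE code; names and shapes are assumptions.
import torch
import torch.nn.functional as F

def softmax_assignment(logits):
    # Standard dVAE head: can assign overconfident mass to a few codewords.
    return F.softmax(logits, dim=-1)

def evidential_assignment(logits):
    evidence = F.softplus(logits)                    # evidence must be >= 0
    alpha = evidence + 1.0                           # Dirichlet concentrations
    probs = alpha / alpha.sum(dim=-1, keepdim=True)  # expected categorical
    # Low total evidence signals uncertainty instead of a confident peak.
    uncertainty = logits.shape[-1] / alpha.sum(dim=-1)
    return probs, uncertainty

logits = torch.randn(8, 512)        # batch of 8, codebook size K = 512
p_soft = softmax_assignment(logits)
p_evid, u = evidential_assignment(logits)
```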
Related papers
- An Independence-promoting Loss for Music Generation with Language Models [64.95095558672996]
Music generation schemes rely on a vocabulary of audio tokens, generally provided as codes in a discrete latent space learnt by an auto-encoder.
We introduce an independence-promoting loss to regularize the auto-encoder used as the tokenizer in language models for music generation.
arXiv Detail & Related papers (2024-06-04T13:44:39Z)
- Online Clustered Codebook [100.1650001618827]
We present a simple alternative method for online codebook learning, Clustering VQ-VAE (CVQ-VAE).
Our approach selects encoded features as anchors to update the "dead" codevectors, while optimising the codebooks which are alive via the original loss.
Our CVQ-VAE can be easily integrated into existing models with just a few lines of code; a rough sketch of the revival idea follows below.
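A hedged sketch of the general revival idea (our simplification, not the paper's exact update rule): periodically reset rarely used codevectors to encoder features from the current batch:

```python
# Illustrative sketch of reviving "dead" codevectors with encoder features.
# The actual CVQ-VAE update rule differs in its details.
import torch

def revive_dead_codes(codebook, usage_counts, encoder_feats, min_usage=1.0):
    dead = usage_counts < min_usage              # rarely used codevectors
    n_dead = int(dead.sum())
    if n_dead > 0:
        # Anchor each dead code to a randomly chosen encoded feature,
        # pulling it back into a populated region of the latent space.
        idx = torch.randint(0, encoder_feats.shape[0], (n_dead,))
        codebook[dead] = encoder_feats[idx]
    return codebook

codebook = torch.randn(16, 8)                # K = 16 codes of dimension 8
usage = torch.zeros(16); usage[:12] = 5.0    # the last 4 codes are unused
feats = torch.randn(32, 8)                   # encoder outputs for one batch
codebook = revive_dead_codes(codebook, usage, feats)
```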
arXiv Detail & Related papers (2023-07-27T18:31:04Z)
- Towards Accurate Image Coding: Improved Autoregressive Image Generation with Dynamic Vector Quantization [73.52943587514386]
Existing vector quantization (VQ) based autoregressive models follow a two-stage generation paradigm.
We propose a novel two-stage framework: (1) Dynamic-Quantization VAE (DQ-VAE), which encodes image regions into variable-length codes based on their information densities for accurate representation.
arXiv Detail & Related papers (2023-05-19T14:56:05Z)
- Variational Diffusion Auto-encoder: Latent Space Extraction from Pre-trained Diffusion Models [0.0]
Variational Auto-Encoders (VAEs) face challenges with the quality of generated images, often presenting noticeable blurriness.
This issue stems from the unrealistic assumption that approximates the conditional data distribution, $p(\textbf{x} \mid \textbf{z})$, as an isotropic Gaussian.
We illustrate how one can extract a latent space from a pre-existing diffusion model by optimizing an encoder to maximize the marginal data log-likelihood.
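For reference, the isotropic Gaussian assumption in question has the standard form (notation ours, with $\mu_\theta$ the decoder mean and $\sigma^2$ a fixed variance):

$$p(\textbf{x} \mid \textbf{z}) = \mathcal{N}\!\big(\textbf{x};\; \mu_\theta(\textbf{z}),\; \sigma^2 I\big)$$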
arXiv Detail & Related papers (2023-04-24T14:44:47Z)
- Robust Vector Quantized-Variational Autoencoder [13.664682865991255]
We propose a robust generative model, RVQ-VAE, built on the Vector Quantized-Variational AutoEncoder (VQ-VAE).
To achieve robustness, RVQ-VAE uses two separate codebooks for the inliers and outliers, as sketched below.
We experimentally demonstrate that RVQ-VAE is able to generate examples from inliers even if a large portion of the training data points are corrupted.
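A hedged sketch of the two-codebook idea (our simplification: a plain distance threshold stands in for the paper's actual inlier/outlier assignment):

```python
# Illustrative two-codebook quantization: route each feature to the inlier
# codebook unless it is far from every inlier code. Not the RVQ-VAE rule.
import torch

def quantize_two_books(z, inlier_book, outlier_book, tau=1.0):
    d_in = torch.cdist(z, inlier_book)       # (N, K) pairwise distances
    min_d, idx = d_in.min(dim=1)
    q = inlier_book[idx]                     # nearest inlier codevector
    is_outlier = min_d > tau                 # far from all inlier codes
    if is_outlier.any():
        d_out = torch.cdist(z[is_outlier], outlier_book)
        q[is_outlier] = outlier_book[d_out.argmin(dim=1)]
    return q, is_outlier

z = torch.randn(10, 4)
q, mask = quantize_two_books(z, torch.randn(32, 4), torch.randn(8, 4))
```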
arXiv Detail & Related papers (2022-02-04T05:51:15Z)
- Autoencoding Variational Autoencoder [56.05008520271406]
We study the implications of a VAE's lack of self-consistency on the learned representations, as well as the consequences of fixing it by introducing a notion of self-consistency.
We show that encoders trained with our self-consistency approach lead to representations that are robust (insensitive) to perturbations in the input introduced by adversarial attacks.
arXiv Detail & Related papers (2020-12-07T14:16:14Z)
- Variational Autoencoder with Embedded Student-$t$ Mixture Model for Authorship Attribution [13.196225569878761]
Given a finite set of candidate authors and corresponding labeled texts, the objective is to determine which of the authors has written another set of anonymous or disputed texts.
We propose a probabilistic autoencoding framework to deal with this supervised classification task.
Experiments over an Amazon review dataset indicate superior performance of the proposed method.
arXiv Detail & Related papers (2020-05-28T11:52:32Z)
- Variational Hyper-Encoding Networks [62.74164588885455]
We propose a framework called HyperVAE for encoding distributions of neural network parameters $\theta$.
We predict the posterior distribution of the latent code, then use a matrix-network decoder to generate a posterior distribution $q(\theta)$.
arXiv Detail & Related papers (2020-05-18T06:46:09Z)
- Preventing Posterior Collapse with Levenshtein Variational Autoencoder [61.30283661804425]
We propose to replace the evidence lower bound (ELBO) with a new objective which is simple to optimize and prevents posterior collapse.
We show that Levenshtein VAE produces more informative latent representations than alternative approaches to preventing posterior collapse.
arXiv Detail & Related papers (2020-04-30T13:27:26Z)
- Deterministic Decoding for Discrete Data in Variational Autoencoders [5.254093731341154]
We study a VAE model with a deterministic decoder (DD-VAE) for sequential data that selects the highest-scoring tokens instead of sampling.
We demonstrate the performance of DD-VAE on multiple datasets, including molecular generation and optimization problems.
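The contrast DD-VAE studies is easy to state in code (a minimal sketch; names are ours):

```python
# Stochastic vs. deterministic token decoding (illustrative only).
import torch

def sample_token(logits):
    # Usual VAE decoding: draw from the categorical over the vocabulary.
    return torch.distributions.Categorical(logits=logits).sample()

def deterministic_token(logits):
    # DD-VAE-style decoding: always select the highest-scoring token.
    return logits.argmax(dim=-1)

logits = torch.randn(4, 100)    # 4 sequence positions, vocabulary of 100
print(sample_token(logits))
print(deterministic_token(logits))
```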
arXiv Detail & Related papers (2020-03-04T16:36:52Z)