Relaxed-Responsibility Hierarchical Discrete VAEs
- URL: http://arxiv.org/abs/2007.07307v2
- Date: Thu, 4 Feb 2021 18:59:59 GMT
- Title: Relaxed-Responsibility Hierarchical Discrete VAEs
- Authors: Matthew Willetts, Xenia Miscouridou, Stephen Roberts, Chris Holmes
- Abstract summary: We introduce Relaxed-Responsibility Vector-Quantisation, a novel way to parameterise discrete latent variables.
We achieve state-of-the-art bits-per-dim results for various standard datasets.
- Score: 3.976291254896486
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Successfully training Variational Autoencoders (VAEs) with a hierarchy of
discrete latent variables remains an area of active research.
Vector-Quantised VAEs are a powerful approach to discrete VAEs, but naive
hierarchical extensions can be unstable when training. Leveraging insights from
classical methods of inference, we introduce Relaxed-Responsibility
Vector-Quantisation, a novel way to parameterise discrete latent variables, a
refinement of relaxed Vector-Quantisation that gives better performance and
more stable training. This enables a novel approach to hierarchical discrete
variational autoencoders with numerous layers of latent variables (here up to
32) that we train end-to-end. Within hierarchical probabilistic deep generative
models with discrete latent variables trained end-to-end, we achieve
state-of-the-art bits-per-dim results for various standard datasets. Unlike
discrete VAEs with a single layer of latent variables, we can produce samples
by ancestral sampling: it is not essential to train a second autoregressive
generative model over the learnt latent representations to then sample from and
then decode. Moreover, that latter approach in these deep hierarchical models
would require thousands of forward passes to generate a single sample. Further,
we observe different layers of our model become associated with different
aspects of the data.
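The abstract hinges on parameterising discrete latents through responsibilities, in the spirit of classical mixture-model inference, rather than through hard nearest-codebook assignment. As a rough illustration only, the PyTorch-style sketch below shows a generic soft vector-quantisation layer of that flavour; the class name, temperature parameter, and responsibility-weighted readout are assumptions for illustration, not the paper's exact Relaxed-Responsibility parameterisation.

```python
# Illustrative sketch only: a generic "soft" vector-quantisation layer in which each
# encoder output is given responsibilities (a softmax over distances to codebook
# entries) instead of a hard nearest-neighbour index. Not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftVectorQuantiser(nn.Module):
    def __init__(self, num_codes: int, code_dim: int, temperature: float = 1.0):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, code_dim)  # K x D learnable codebook
        self.temperature = temperature

    def forward(self, z_e: torch.Tensor):
        # z_e: (batch, code_dim) continuous encoder outputs.
        d = torch.cdist(z_e, self.codebook.weight) ** 2    # (batch, K) squared distances
        resp = F.softmax(-d / self.temperature, dim=-1)    # responsibilities (relaxed assignment)
        z_q = resp @ self.codebook.weight                  # responsibility-weighted code, (batch, D)
        return z_q, resp

# Responsibilities stay differentiable, so layers like this can in principle be stacked
# into a deep hierarchy and trained end-to-end without straight-through estimators.
z_q, resp = SoftVectorQuantiser(num_codes=512, code_dim=64)(torch.randn(8, 64))
```

Because each layer's latent is a categorical distribution over codebook entries, samples can then be drawn ancestrally, layer by layer, which is the property the abstract contrasts with fitting a separate autoregressive prior over the learnt codes.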
Related papers
- GFlowNet-EM for learning compositional latent variable models [115.96660869630227]
A key tradeoff in modeling the posteriors over latents is between expressivity and tractable optimization.
We propose the use of GFlowNets, algorithms for sampling from an unnormalized density.
By training GFlowNets to sample from the posterior over latents, we take advantage of their strengths as amortized variational algorithms.
arXiv Detail & Related papers (2023-02-13T18:24:21Z)
- Dynamic Latent Separation for Deep Learning [67.62190501599176]
A core problem in machine learning is to learn expressive latent variables for model prediction on complex data.
Here, we develop an approach that improves expressiveness, provides partial interpretation, and is not restricted to specific applications.
arXiv Detail & Related papers (2022-10-07T17:56:53Z)
- Entropy optimized semi-supervised decomposed vector-quantized variational autoencoder model based on transfer learning for multiclass text classification and generation [3.9318191265352196]
We propose a semisupervised discrete latent variable model for multi-class text classification and text generation.
The proposed model employs the concept of transfer learning for training a quantized transformer model.
Experimental results indicate that the proposed model has surpassed the state-of-the-art models remarkably.
arXiv Detail & Related papers (2021-11-10T07:07:54Z)
- Regularizing Variational Autoencoder with Diversity and Uncertainty Awareness [61.827054365139645]
Variational Autoencoder (VAE) approximates the posterior of latent variables based on amortized variational inference.
We propose an alternative model, DU-VAE, for learning a more Diverse and less Uncertain latent space.
arXiv Detail & Related papers (2021-10-24T07:58:13Z)
- Hierarchical Few-Shot Generative Models [18.216729811514718]
We study a latent variables approach that extends the Neural Statistician to a fully hierarchical approach with an attention-based point to set-level aggregation.
Our results show that the hierarchical formulation better captures the intrinsic variability within the sets in the small data regime.
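The "attention-based point to set-level aggregation" mentioned in this entry can be pictured as attentive pooling: a learned query attends over the embeddings of a set's elements to produce one set-level vector. The module below is a minimal sketch under that assumption; names and shapes are illustrative, not taken from the paper.

```python
# Minimal sketch of attention-based point-to-set aggregation (assumed form, for illustration).
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentivePooling(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.query = nn.Parameter(torch.randn(dim))  # learned set-level query
        self.key = nn.Linear(dim, dim)
        self.value = nn.Linear(dim, dim)

    def forward(self, points: torch.Tensor) -> torch.Tensor:
        # points: (set_size, dim) embeddings of the elements of one set.
        scores = self.key(points) @ self.query / points.shape[-1] ** 0.5
        weights = F.softmax(scores, dim=0)                               # attention over set elements
        return (weights.unsqueeze(-1) * self.value(points)).sum(dim=0)   # (dim,) set embedding

# The pooled vector can then condition set-level latent variables in a hierarchy.
pooled = AttentivePooling(dim=32)(torch.randn(10, 32))
```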
arXiv Detail & Related papers (2021-10-23T19:19:39Z)
- Diverse Semantic Image Synthesis via Probability Distribution Modeling [103.88931623488088]
We propose a novel diverse semantic image synthesis framework.
Our method can achieve superior diversity and comparable quality compared to state-of-the-art methods.
arXiv Detail & Related papers (2021-03-11T18:59:25Z)
- Oops I Took A Gradient: Scalable Sampling for Discrete Distributions [53.3142984019796]
We show that this approach outperforms generic samplers in a number of difficult settings.
We also demonstrate the use of our improved sampler for training deep energy-based models on high dimensional discrete data.
arXiv Detail & Related papers (2021-02-08T20:08:50Z)
- Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z)
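The multi-stage entry above describes a pipeline: first learn disentangled factors with a penalty-based method, then train a second generative model to recover the correlated detail the disentangled code alone cannot reconstruct. Below is a rough, heavily simplified skeleton of that idea using stand-in linear models and a placeholder penalty; it only sketches the control flow, not the paper's architecture or objectives.

```python
# Rough two-stage skeleton (stand-in models and losses; not the paper's implementation).
import torch
import torch.nn as nn

enc = nn.Linear(784, 10)          # stand-in encoder to a low-dimensional "disentangled" code
dec1 = nn.Linear(10, 784)         # stage-1 decoder (typically a blurry reconstruction)
dec2 = nn.Linear(10 + 784, 784)   # stage-2 model conditioned on code + coarse reconstruction
x = torch.rand(32, 784)           # toy data batch

# Stage 1: reconstruction plus a placeholder penalty standing in for the
# aggregate-posterior independence penalty used by disentanglement methods.
opt1 = torch.optim.Adam(list(enc.parameters()) + list(dec1.parameters()), lr=1e-3)
z = enc(x)
loss1 = ((dec1(z) - x) ** 2).mean() + 1e-3 * z.pow(2).mean()
loss1.backward()
opt1.step()

# Stage 2: freeze stage 1 and train a second model to add back the detail
# (the "missing correlated latent variables") that stage 1 fails to reconstruct.
for p in list(enc.parameters()) + list(dec1.parameters()):
    p.requires_grad_(False)
opt2 = torch.optim.Adam(dec2.parameters(), lr=1e-3)
coarse = dec1(enc(x))
loss2 = ((dec2(torch.cat([enc(x), coarse], dim=-1)) - x) ** 2).mean()
loss2.backward()
opt2.step()
```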