ReCAB-VAE: Gumbel-Softmax Variational Inference Based on Analytic
Divergence
- URL: http://arxiv.org/abs/2205.04104v1
- Date: Mon, 9 May 2022 08:11:46 GMT
- Title: ReCAB-VAE: Gumbel-Softmax Variational Inference Based on Analytic
Divergence
- Authors: Sangshin Oh, Seyun Um, Hong-Goo Kang
- Abstract summary: We present a novel divergence-like metric which corresponds to the upper bound of the Kullback-Leibler divergence (KLD) of a relaxed categorical distribution.
We also propose a relaxed categorical analytic bound variational autoencoder (ReCAB-VAE) that successfully models both continuous and relaxed discrete latent representations.
- Score: 17.665255113864795
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Gumbel-softmax distribution, or Concrete distribution, is often used to
relax the discrete characteristics of a categorical distribution and enable
back-propagation through differentiable reparameterization. Although it
reliably yields low variance gradients, it still relies on a stochastic
sampling process for optimization. In this work, we present a relaxed
categorical analytic bound (ReCAB), a novel divergence-like metric which
corresponds to the upper bound of the Kullback-Leibler divergence (KLD) of a
relaxed categorical distribution. The proposed metric is easy to implement
because it has a closed-form solution, and empirical results show that it is
close to the actual KLD. Along with this new metric, we propose a relaxed
categorical analytic bound variational autoencoder (ReCAB-VAE) that
successfully models both continuous and relaxed discrete latent
representations. We implement an emotional text-to-speech synthesis system
based on the proposed framework, and show that the proposed system flexibly and
stably controls emotion expressions with better speech quality compared to
baselines that use stochastic estimation or categorical distribution
approximation.
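To make the setup concrete, the sketch below (ours, not code from the paper) draws reparameterized Gumbel-softmax samples and estimates the KLD between two relaxed categorical (Concrete) distributions by Monte Carlo; this stochastic estimate is exactly the quantity that ReCAB is said to upper-bound in closed form. The bound itself is derived in the paper and not reproduced here; function names and toy logits are illustrative.

```python
import torch

def gumbel_softmax_sample(logits, tau):
    """Differentiable sample from a Concrete (Gumbel-softmax)
    distribution with location exp(logits) and temperature tau."""
    u = torch.rand_like(logits).clamp_min(1e-20)
    g = -torch.log(-torch.log(u))  # Gumbel(0, 1) noise
    return torch.softmax((logits + g) / tau, dim=-1)

def concrete_log_prob(logits, y, tau):
    """Log-density of Concrete(logits, tau) at a point y on the simplex
    (Maddison et al., 2016); logits may be unnormalized."""
    n = y.shape[-1]
    log_y = torch.log(y.clamp_min(1e-20))
    return (torch.lgamma(torch.tensor(float(n)))
            + (n - 1) * torch.log(torch.tensor(tau))
            + (logits - (tau + 1.0) * log_y).sum(-1)
            - n * torch.logsumexp(logits - tau * log_y, dim=-1))

def mc_relaxed_kld(q_logits, p_logits, tau, n_samples=10_000):
    """Monte Carlo estimate of KL(q || p) between relaxed categorical
    distributions -- the quantity ReCAB bounds analytically."""
    y = gumbel_softmax_sample(q_logits.expand(n_samples, -1), tau)
    return (concrete_log_prob(q_logits, y, tau)
            - concrete_log_prob(p_logits, y, tau)).mean()

q_logits = torch.tensor([2.0, 0.5, -1.0])  # toy approximate posterior
p_logits = torch.zeros(3)                  # uniform prior
print(mc_relaxed_kld(q_logits, p_logits, tau=0.5))
```

Because this estimate is itself stochastic, using it inside a training objective adds variance; the abstract's argument is that a closed-form bound removes that extra sampling step.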
Related papers
- Variational Classification [51.2541371924591]
We derive a variational objective to train the model, analogous to the evidence lower bound (ELBO) used to train variational auto-encoders.
Treating the inputs to the softmax layer as samples of a latent variable, this abstracted perspective reveals a potential inconsistency.
We induce a chosen latent distribution instead of the one implicitly assumed by a standard softmax layer.
arXiv Detail & Related papers (2023-05-17T17:47:19Z)
- Score-based Continuous-time Discrete Diffusion Models [102.65769839899315]
We extend diffusion models to discrete variables by introducing a Markov jump process where the reverse process denoises via a continuous-time Markov chain.
We show that an unbiased estimator can be obtained by simply matching the conditional marginal distributions.
We demonstrate the effectiveness of the proposed method on a set of synthetic and real-world music and image benchmarks.
arXiv Detail & Related papers (2022-11-30T05:33:29Z)
- A Stochastic Newton Algorithm for Distributed Convex Optimization [62.20732134991661]
We analyze a stochastic Newton algorithm for homogeneous distributed convex optimization, where each machine can compute stochastic gradients of the same population objective.
We show that our method can reduce the number, and frequency, of required communication rounds compared to existing methods without hurting performance.
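As a rough illustration of the primitive involved (not the paper's distributed algorithm or its communication-round analysis), the sketch below takes undamped Newton steps for regularized logistic regression on one machine's minibatch; all names and data are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def newton_step(w, X, y, lam=1e-2):
    """One Newton step for L2-regularized logistic regression on a
    local minibatch (X, y) -- the per-machine computation only."""
    p = sigmoid(X @ w)
    grad = X.T @ (p - y) / len(y) + lam * w
    S = p * (1.0 - p)                        # per-example Hessian weights
    H = (X.T * S) @ X / len(y) + lam * np.eye(len(w))
    return w - np.linalg.solve(H, grad)

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 5))
w_true = rng.normal(size=5)
y = (sigmoid(X @ w_true) > rng.uniform(size=256)).astype(float)

w = np.zeros(5)
for _ in range(10):                          # Newton converges in few steps
    w = newton_step(w, X, y)
```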
arXiv Detail & Related papers (2021-10-07T17:51:10Z)
- Variational Refinement for Importance Sampling Using the Forward Kullback-Leibler Divergence [77.06203118175335]
Variational Inference (VI) is a popular alternative to exact sampling in Bayesian inference.
Importance sampling (IS) is often used to fine-tune and de-bias the estimates of approximate Bayesian inference procedures.
We propose a novel combination of optimization and sampling techniques for approximate Bayesian inference.
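A minimal sketch of that de-biasing step: samples from a variational approximation q are reweighted by self-normalized importance weights w proportional to p(x)/q(x), correcting expectations under a (possibly unnormalized) target p. The toy target and proposal below are our own assumptions, not the paper's experiments.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_target(x):
    """Unnormalized log-density of a toy non-Gaussian target p."""
    return -0.5 * x**2 + np.sin(2.0 * x)

# A (deliberately mismatched) Gaussian variational proposal q = N(mu, s^2).
mu, s = 0.3, 1.2
x = rng.normal(mu, s, size=50_000)
log_q = -0.5 * ((x - mu) / s) ** 2 - np.log(s * np.sqrt(2.0 * np.pi))

# Self-normalized importance weights; subtracting the max is for stability.
log_w = log_target(x) - log_q
w = np.exp(log_w - log_w.max())
w /= w.sum()

naive_mean = x.mean()      # biased: an expectation under q, not p
is_mean = np.sum(w * x)    # importance-sampling estimate under p
print(naive_mean, is_mean)
```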
arXiv Detail & Related papers (2021-06-30T11:00:24Z)
- Sampling-free Variational Inference for Neural Networks with Multiplicative Activation Noise [51.080620762639434]
We propose a more efficient parameterization of the posterior approximation for sampling-free variational inference.
Our approach yields competitive results for standard regression problems and scales well to large-scale image classification tasks.
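One way to read "sampling-free" here: with multiplicative Gaussian noise on the activations, the mean and variance of the next layer's pre-activations have closed forms, so no Monte Carlo samples are needed during training. The single-layer sketch below illustrates this generic moment-propagation idea; it is not the paper's exact parameterization.

```python
import numpy as np

def moment_propagate(W, a, alpha):
    """Closed-form mean/variance of z = W (a * eps) with independent
    multiplicative noise eps ~ N(1, alpha) on the activations a."""
    mean = W @ a
    var = alpha * (W**2) @ (a**2)   # Var[z_j] = alpha * sum_i W_ji^2 a_i^2
    return mean, var

rng = np.random.default_rng(0)
W, a, alpha = rng.normal(size=(4, 8)), rng.normal(size=8), 0.1
m, v = moment_propagate(W, a, alpha)

# Sanity check against plain Monte Carlo sampling.
eps = rng.normal(1.0, np.sqrt(alpha), size=(100_000, 8))
z = (a * eps) @ W.T
print(np.allclose(m, z.mean(0), atol=0.01), np.allclose(v, z.var(0), atol=0.01))
```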
arXiv Detail & Related papers (2021-03-15T16:16:18Z)
- Autoregressive Score Matching [113.4502004812927]
We propose autoregressive conditional score models (AR-CSM), where we parameterize the joint distribution in terms of the derivatives of univariate log-conditionals (scores).
For AR-CSM models, the divergence between data and model distributions can be computed and optimized efficiently, requiring no expensive sampling or adversarial training.
We show with extensive experimental results that it can be applied to density estimation on synthetic data, image generation, image denoising, and training latent variable models with implicit encoders.
arXiv Detail & Related papers (2020-10-24T07:01:24Z)
- Efficient Marginalization of Discrete and Structured Latent Variables via Sparsity [26.518803984578867]
Training neural network models with discrete (categorical or structured) latent variables can be computationally challenging.
One typically resorts to sampling-based approximations of the true marginal.
We propose a new training strategy which replaces these estimators by an exact yet efficient marginalization.
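A plausible instantiation (the sketch is ours, not the paper's code) uses a sparse posterior such as sparsemax (Martins & Astudillo, 2016): most categories receive exactly zero probability, so an expectation over the latent variable can be computed exactly by summing only over the nonzero support, with no sampling-based estimator.

```python
import numpy as np

def sparsemax(z):
    """Sparsemax projection onto the probability simplex; unlike
    softmax it can assign exactly zero probability to categories."""
    z_sorted = np.sort(z)[::-1]
    cssv = np.cumsum(z_sorted)
    k = np.arange(1, len(z) + 1)
    k_z = k[1.0 + k * z_sorted > cssv][-1]   # size of the support
    tau = (cssv[k_z - 1] - 1.0) / k_z
    return np.maximum(z - tau, 0.0)

def exact_expected_loss(scores, loss_per_choice):
    """Exact marginalization over a sparse posterior: only the
    nonzero entries are ever evaluated."""
    p = sparsemax(scores)
    return sum(p[i] * loss_per_choice(i) for i in np.nonzero(p)[0])

scores = np.array([2.0, 1.9, -1.0, -3.0])
print(sparsemax(scores))                               # [0.55, 0.45, 0., 0.]
print(exact_expected_loss(scores, lambda i: float(i)))
```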
arXiv Detail & Related papers (2020-07-03T19:36:35Z)
- Reliable Categorical Variational Inference with Mixture of Discrete Normalizing Flows [10.406659081400354]
Variational approximations are increasingly based on gradient-based optimization of expectations estimated by sampling.
Continuous relaxations, such as the Gumbel-Softmax for categorical distribution, enable gradient-based optimization, but do not define a valid probability mass for discrete observations.
In practice, selecting the amount of relaxation is difficult and one needs to optimize an objective that does not align with the desired one.
arXiv Detail & Related papers (2020-06-28T10:39:39Z)
- Generalized Gumbel-Softmax Gradient Estimator for Various Discrete Random Variables [16.643346012854156]
Estimating the gradients of stochastic nodes is one of the crucial research questions in the deep generative modeling community.
This paper proposes a general version of the Gumbel-Softmax estimator with continuous relaxation.
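For reference, here is the standard straight-through Gumbel-softmax estimator that this line of work generalizes, written with PyTorch's built-in F.gumbel_softmax: the forward pass emits a discrete one-hot sample while gradients flow through the relaxed sample. The downstream loss is a toy stand-in.

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([1.0, 0.2, -0.5], requires_grad=True)

# hard=True: one-hot sample in the forward pass, relaxed (softmax)
# gradients in the backward pass (straight-through estimator).
y = F.gumbel_softmax(logits, tau=0.5, hard=True)

loss = (y * torch.arange(3.0)).sum()   # toy downstream objective
loss.backward()
print(y, logits.grad)                  # gradients reach the logits
```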
arXiv Detail & Related papers (2020-03-04T01:13:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.