VAE-KRnet and its applications to variational Bayes
- URL: http://arxiv.org/abs/2006.16431v2
- Date: Sat, 11 Dec 2021 20:48:32 GMT
- Title: VAE-KRnet and its applications to variational Bayes
- Authors: Xiaoliang Wan, Shuangqing Wei
- Abstract summary: We have proposed a generative model, called VAE-KRnet, for density estimation or approximation.
VAE is used as a dimension reduction technique to capture the latent space, and KRnet is used to model the distribution of the latent variable.
VAE-KRnet can be used as a density model to approximate either data distribution or an arbitrary probability density function.
- Score: 4.9545850065593875
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we have proposed a generative model, called VAE-KRnet, for
density estimation or approximation, which combines the canonical variational
autoencoder (VAE) with our recently developed flow-based generative model,
called KRnet. VAE is used as a dimension reduction technique to capture the
latent space, and KRnet is used to model the distribution of the latent
variable. Using a linear model between the data and the latent variable, we
show that VAE-KRnet can be more effective and robust than the canonical VAE.
VAE-KRnet can be used as a density model to approximate either data
distribution or an arbitrary probability density function (PDF) known up to a
constant. VAE-KRnet is flexible in terms of dimensionality. When the number of
dimensions is relatively small, KRnet can effectively approximate the
distribution in terms of the original random variable. For high-dimensional
cases, we may use VAE-KRnet to incorporate dimension reduction. One important
application of VAE-KRnet is the variational Bayes for the approximation of the
posterior distribution. The variational Bayes approaches are usually based on
the minimization of the Kullback-Leibler (KL) divergence between the model and
the posterior. For high-dimensional distributions, it is very challenging to
construct an accurate density model due to the curse of dimensionality, where
extra assumptions are often introduced for efficiency. For instance, the
classical mean-field approach assumes mutual independence between dimensions,
which often yields an underestimated variance due to oversimplification. To
alleviate this issue, we add to the loss the maximization of the mutual
information between the latent random variable and the original random
variable, which helps retain more information from low-density regions so that
the variance estimation is improved.
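As a rough illustration of the variational Bayes setup described in the abstract, the sketch below minimizes a Monte Carlo estimate of the KL divergence between a trainable density model and a posterior known only up to a constant. The diagonal affine flow, the 2-D toy target, and all names here are illustrative assumptions standing in for KRnet (or VAE-KRnet in high dimensions); the mutual-information term mentioned above is omitted for brevity.

```python
# Minimal sketch (not the authors' code): variational Bayes by minimizing a
# sample-based estimate of KL(q || p), where p is known only up to a constant.
# A diagonal affine flow plays the role of the trainable density model q.
import torch

d = 2  # dimension of the variable whose posterior is approximated

def log_p_tilde(x):
    # Unnormalized log-density of a toy "banana" target standing in for a posterior.
    return -0.5 * (x[:, 0] ** 2 + (x[:, 1] - x[:, 0] ** 2) ** 2)

mu = torch.zeros(d, requires_grad=True)      # flow shift parameters
log_s = torch.zeros(d, requires_grad=True)   # flow log-scale parameters
opt = torch.optim.Adam([mu, log_s], lr=1e-2)
base = torch.distributions.Normal(torch.zeros(d), torch.ones(d))

for step in range(2000):
    z = base.sample((256,))                  # draw from the base density
    x = mu + torch.exp(log_s) * z            # push forward through the flow
    # Change of variables: log q(x) = log base(z) - log|det J|, with log|det J| = sum(log_s)
    log_q = base.log_prob(z).sum(-1) - log_s.sum()
    loss = (log_q - log_p_tilde(x)).mean()   # Monte Carlo estimate of KL(q || p) + const
    opt.zero_grad()
    loss.backward()
    opt.step()
```

In the paper's setting, the simple affine pushforward would be replaced by the KRnet transport (optionally composed with a VAE for dimension reduction), and the loss would additionally include the mutual-information maximization term discussed above.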
Related papers
- Collaborative Heterogeneous Causal Inference Beyond Meta-analysis [68.4474531911361]
We propose a collaborative inverse propensity score estimator for causal inference with heterogeneous data.
Our method shows significant improvements over the methods based on meta-analysis when heterogeneity increases.
arXiv Detail & Related papers (2024-04-24T09:04:36Z) - Diffusion models for probabilistic programming [56.47577824219207]
Diffusion Model Variational Inference (DMVI) is a novel method for automated approximate inference in probabilistic programming languages (PPLs).
DMVI is easy to implement, allows hassle-free inference in PPLs without the drawbacks of, e.g., variational inference using normalizing flows, and does not impose any constraints on the underlying neural network model.
arXiv Detail & Related papers (2023-11-01T12:17:05Z) - Variational autoencoder with weighted samples for high-dimensional
non-parametric adaptive importance sampling [0.0]
We extend the existing framework to the case of weighted samples by introducing a new objective function.
In order to add flexibility to the model and to be able to learn multimodal distributions, we consider a learnable prior distribution.
We exploit the proposed procedure in existing adaptive importance sampling algorithms to draw points from a target distribution and to estimate a rare event probability in high dimension.
arXiv Detail & Related papers (2023-10-13T15:40:55Z) - Distributed Variational Inference for Online Supervised Learning [15.038649101409804]
This paper develops a scalable distributed probabilistic inference algorithm.
It applies to continuous variables, intractable posteriors and large-scale real-time data in sensor networks.
arXiv Detail & Related papers (2023-09-05T22:33:02Z) - Robust Estimation for Nonparametric Families via Generative Adversarial
Networks [92.64483100338724]
We provide a framework for designing Generative Adversarial Networks (GANs) to solve high dimensional robust statistics problems.
Our work extends these to robust mean estimation, second-moment estimation, and robust linear regression.
In terms of techniques, our proposed GAN losses can be viewed as a smoothed and generalized Kolmogorov-Smirnov distance.
arXiv Detail & Related papers (2022-02-02T20:11:33Z) - Scaling Structured Inference with Randomization [64.18063627155128]
We propose a family of randomized dynamic programming (RDP) algorithms for scaling structured models to tens of thousands of latent states.
Our method is widely applicable to classical DP-based inference.
It is also compatible with automatic differentiation, so it can be integrated with neural networks seamlessly.
arXiv Detail & Related papers (2021-12-07T11:26:41Z) - Density Ratio Estimation via Infinitesimal Classification [85.08255198145304]
We propose DRE-infty, a divide-and-conquer approach that reduces density ratio estimation (DRE) to a series of easier subproblems.
Inspired by Monte Carlo methods, we smoothly interpolate between the two distributions via an infinite continuum of intermediate bridge distributions.
We show that our approach performs well on downstream tasks such as mutual information estimation and energy-based modeling on complex, high-dimensional datasets.
arXiv Detail & Related papers (2021-11-22T06:26:29Z) - Information Theoretic Structured Generative Modeling [13.117829542251188]
A novel generative model framework called the structured generative model (SGM) is proposed that makes straightforward optimization possible.
The implementation employs a single neural network driven by an orthonormal input and a single white noise source, adapted to learn an infinite Gaussian mixture model.
Preliminary results show that SGM significantly improves upon MINE estimation in terms of data efficiency and variance, upon conventional and variational Gaussian mixture models, and in training adversarial networks.
arXiv Detail & Related papers (2021-10-12T07:44:18Z) - Addressing Variance Shrinkage in Variational Autoencoders using Quantile
Regression [0.0]
The Variational AutoEncoder (VAE) has become a popular model for anomaly detection in applications such as lesion detection in medical images.
We describe an alternative approach that avoids the well-known problem of shrinkage or underestimation of variance.
Using estimated quantiles to compute mean and variance under the Gaussian assumption, we compute reconstruction probability as a principled approach to outlier or anomaly detection (a small sketch after this list illustrates the quantile-to-moment conversion).
arXiv Detail & Related papers (2020-10-18T17:37:39Z) - Variational Hyper-Encoding Networks [62.74164588885455]
We propose a framework called HyperVAE for encoding distributions of neural network parameters theta.
We predict the posterior distribution of the latent code, then use a matrix-network decoder to generate a posterior distribution q(theta).
arXiv Detail & Related papers (2020-05-18T06:46:09Z) - A Batch Normalized Inference Network Keeps the KL Vanishing Away [35.40781000297285]
Variational Autoencoder (VAE) is widely used to approximate a model's posterior on latent variables.
VAE often converges to a degenerate local optimum known as "posterior collapse".
arXiv Detail & Related papers (2020-04-27T05:20:01Z)
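As a small companion sketch for the quantile-regression entry above (referenced there), here is one way to recover a Gaussian mean and standard deviation from two predicted quantiles and turn them into a reconstruction log-probability for anomaly scoring. The quantile levels, the placeholder numbers, and the function name are illustrative assumptions, not taken from that paper.

```python
# Minimal sketch (illustrative assumptions, not the paper's code): recover a Gaussian
# mean and standard deviation from two predicted quantiles, then score a sample by its
# log-likelihood ("reconstruction probability") for outlier/anomaly detection.
import numpy as np
from scipy.stats import norm

def gaussian_from_quantiles(q_lo, q_hi, tau_lo=0.159, tau_hi=0.841):
    """Invert q_tau = mu + sigma * Phi^{-1}(tau) at two quantile levels."""
    z_lo, z_hi = norm.ppf(tau_lo), norm.ppf(tau_hi)
    sigma = (q_hi - q_lo) / (z_hi - z_lo)
    mu = q_lo - sigma * z_lo
    return mu, sigma

# Suppose a decoder trained with a pinball (quantile) loss predicts, per output
# dimension, the ~15.9% and ~84.1% quantiles of the reconstruction (placeholder values).
q_lo = np.array([0.20, -0.10])
q_hi = np.array([0.80, 0.50])
mu, sigma = gaussian_from_quantiles(q_lo, q_hi)

x = np.array([0.55, 0.10])                          # observed sample to score
log_prob = norm.logpdf(x, loc=mu, scale=sigma).sum()
print(mu, sigma, log_prob)                          # a low log_prob flags an anomaly
```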