SIReN-VAE: Leveraging Flows and Amortized Inference for Bayesian Networks
- URL: http://arxiv.org/abs/2204.11847v1
- Date: Sat, 23 Apr 2022 10:31:08 GMT
- Title: SIReN-VAE: Leveraging Flows and Amortized Inference for Bayesian Networks
- Authors: Jacobie Mouton and Steve Kroon
- Abstract summary: This work explores incorporating arbitrary dependency structures, as specified by Bayesian networks, into VAEs.
This is achieved by extending both the prior and inference network with graphical residual flows.
We compare our model's performance on several synthetic datasets and show its potential in data-sparse settings.
- Score: 2.8597160727750564
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Initial work on variational autoencoders assumed independent latent variables
with simple distributions. Subsequent work has explored incorporating more
complex distributions and dependency structures: including normalizing flows in
the encoder network allows latent variables to entangle non-linearly, creating
a richer class of distributions for the approximate posterior, and stacking
layers of latent variables allows more complex priors to be specified for the
generative model. This work explores incorporating arbitrary dependency
structures, as specified by Bayesian networks, into VAEs. This is achieved by
extending both the prior and inference network with graphical residual flows -
residual flows that encode conditional independence by masking the weight
matrices of the flow's residual blocks. We compare our model's performance on
several synthetic datasets and show its potential in data-sparse settings.
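As a rough illustration of the masking idea described in the abstract, the sketch below (PyTorch; all names hypothetical, not the authors' implementation) shows a single residual step z + g(z) whose weight matrix is masked so that each output dimension depends only on itself and its parents in a given Bayesian-network adjacency matrix. A real graphical residual flow would also constrain g to be contractive (e.g. via spectral normalization) so that the step is invertible; that detail is omitted here.

```python
# Minimal, hypothetical sketch of weight masking in a residual block: NOT the authors' code.
import torch
import torch.nn as nn


def dag_mask(adj: torch.Tensor) -> torch.Tensor:
    """adj[i, j] = 1 if variable j is a parent of variable i in the Bayesian network.
    The mask lets each output depend only on itself and its parents."""
    return adj + torch.eye(adj.shape[0])


class MaskedResidualBlock(nn.Module):
    """One residual step z + g(z) whose Jacobian sparsity follows the graph.

    A real residual flow would additionally keep g contractive (e.g. via
    spectral normalization) to guarantee invertibility; omitted for brevity.
    """

    def __init__(self, adj: torch.Tensor):
        super().__init__()
        d = adj.shape[0]
        self.register_buffer("mask", dag_mask(adj))
        self.weight = nn.Parameter(0.01 * torch.randn(d, d))
        self.bias = nn.Parameter(torch.zeros(d))

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # Masked linear map: output i only sees z_i and z_{parents(i)}.
        g = torch.tanh(z @ (self.weight * self.mask).t() + self.bias)
        return z + g


# Example: a three-variable chain z1 -> z2 -> z3.
adj = torch.tensor([[0., 0., 0.],
                    [1., 0., 0.],
                    [0., 1., 0.]])
block = MaskedResidualBlock(adj)
z = torch.randn(5, 3)
print(block(z).shape)  # torch.Size([5, 3])
```

In the full model, both the prior flow and the inference network would stack several such masked blocks, with masks derived from the specified Bayesian network; the exact construction of the inference-side mask is not covered by this sketch.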
Related papers
- Unsupervised Representation Learning from Sparse Transformation Analysis [79.94858534887801]
We propose to learn representations from sequence data by factorizing the transformations of the latent variables into sparse components.
Input data are first encoded as distributions of latent activations and subsequently transformed using a probability flow model.
arXiv Detail & Related papers (2024-10-07T23:53:25Z)
- Amortizing intractable inference in large language models [56.92471123778389]
We use amortized Bayesian inference to sample from intractable posterior distributions.
We empirically demonstrate that this distribution-matching paradigm of LLM fine-tuning can serve as an effective alternative to maximum-likelihood training.
As an important application, we interpret chain-of-thought reasoning as a latent variable modeling problem.
arXiv Detail & Related papers (2023-10-06T16:36:08Z)
- Bayesian Flow Networks [4.585102332532472]
This paper introduces Bayesian Flow Networks (BFNs), a new class of generative model in which the parameters of a set of independent distributions are modified with Bayesian inference.
Starting from a simple prior and iteratively updating the two distributions yields a generative procedure similar to the reverse process of diffusion models.
BFNs achieve competitive log-likelihoods for image modelling on dynamically binarized MNIST and CIFAR-10, and outperform all known discrete diffusion models on the text8 character-level language modelling task.
arXiv Detail & Related papers (2023-08-14T09:56:35Z)
- Joint Bayesian Inference of Graphical Structure and Parameters with a Single Generative Flow Network [59.79008107609297]
We propose to approximate the joint posterior over the structure and parameters of a Bayesian network.
We use a single GFlowNet whose sampling policy follows a two-phase process.
Since the parameters are included in the posterior distribution, this leaves more flexibility for the local probability models.
arXiv Detail & Related papers (2023-05-30T19:16:44Z)
- Disentanglement via Latent Quantization [60.37109712033694]
In this work, we construct an inductive bias towards encoding to and decoding from an organized latent space.
We demonstrate the broad applicability of this approach by adding it to both basic data-reconstructing (vanilla autoencoder) and latent-reconstructing (InfoGAN) generative models.
arXiv Detail & Related papers (2023-05-28T06:30:29Z)
- GFlowNet-EM for learning compositional latent variable models [115.96660869630227]
A key tradeoff in modeling the posteriors over latents is between expressivity and tractable optimization.
We propose the use of GFlowNets, algorithms for sampling from an unnormalized density.
By training GFlowNets to sample from the posterior over latents, we take advantage of their strengths as amortized variational algorithms.
arXiv Detail & Related papers (2023-02-13T18:24:21Z)
- Understanding the Covariance Structure of Convolutional Filters [86.0964031294896]
Recent ViT-inspired convolutional networks such as ConvMixer and ConvNeXt use large-kernel depthwise convolutions with notable structure.
We first observe that such learned filters have highly-structured covariance matrices, and we find that covariances calculated from small networks may be used to effectively initialize a variety of larger networks.
arXiv Detail & Related papers (2022-10-07T15:59:13Z)
- Self-Reflective Variational Autoencoder [21.054722609128525]
Variational Autoencoder (VAE) is a powerful framework for learning latent variable generative models.
We introduce a solution, which we call self-reflective inference.
We empirically demonstrate the clear advantages of matching the variational posterior to the exact posterior.
arXiv Detail & Related papers (2020-07-10T05:05:26Z)