SIReN-VAE: Leveraging Flows and Amortized Inference for Bayesian Networks
- URL: http://arxiv.org/abs/2204.11847v1
- Date: Sat, 23 Apr 2022 10:31:08 GMT
- Title: SIReN-VAE: Leveraging Flows and Amortized Inference for Bayesian Networks
- Authors: Jacobie Mouton and Steve Kroon
- Abstract summary: This work explores incorporating arbitrary dependency structures, as specified by Bayesian networks, into VAEs.
This is achieved by extending both the prior and inference network with graphical residual flows.
We compare our model's performance on several synthetic datasets and show its potential in data-sparse settings.
- Score: 2.8597160727750564
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Initial work on variational autoencoders assumed independent latent variables
with simple distributions. Subsequent work has explored incorporating more
complex distributions and dependency structures: including normalizing flows in
the encoder network allows latent variables to entangle non-linearly, creating
a richer class of distributions for the approximate posterior, and stacking
layers of latent variables allows more complex priors to be specified for the
generative model. This work explores incorporating arbitrary dependency
structures, as specified by Bayesian networks, into VAEs. This is achieved by
extending both the prior and inference network with graphical residual flows -
residual flows that encode conditional independence by masking the weight
matrices of the flow's residual blocks. We compare our model's performance on
several synthetic datasets and show its potential in data-sparse settings.
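As a rough illustration of the masking idea described in the abstract, the sketch below (PyTorch; all names hypothetical, not the authors' implementation) shows a single residual step z + g(z) whose weight matrix is masked so that each output dimension depends only on itself and its parents in a given Bayesian-network adjacency matrix. A real graphical residual flow would also constrain g to be contractive (e.g. via spectral normalization) so that the step is invertible; that detail is omitted here.

```python
# Minimal, hypothetical sketch of weight masking in a residual block: NOT the authors' code.
import torch
import torch.nn as nn


def dag_mask(adj: torch.Tensor) -> torch.Tensor:
    """adj[i, j] = 1 if variable j is a parent of variable i in the Bayesian network.
    The mask lets each output depend only on itself and its parents."""
    return adj + torch.eye(adj.shape[0])


class MaskedResidualBlock(nn.Module):
    """One residual step z + g(z) whose Jacobian sparsity follows the graph.

    A real residual flow would additionally keep g contractive (e.g. via
    spectral normalization) to guarantee invertibility; omitted for brevity.
    """

    def __init__(self, adj: torch.Tensor):
        super().__init__()
        d = adj.shape[0]
        self.register_buffer("mask", dag_mask(adj))
        self.weight = nn.Parameter(0.01 * torch.randn(d, d))
        self.bias = nn.Parameter(torch.zeros(d))

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # Masked linear map: output i only sees z_i and z_{parents(i)}.
        g = torch.tanh(z @ (self.weight * self.mask).t() + self.bias)
        return z + g


# Example: a three-variable chain z1 -> z2 -> z3.
adj = torch.tensor([[0., 0., 0.],
                    [1., 0., 0.],
                    [0., 1., 0.]])
block = MaskedResidualBlock(adj)
z = torch.randn(5, 3)
print(block(z).shape)  # torch.Size([5, 3])
```

In the full model, both the prior flow and the inference network would stack several such masked blocks, with masks derived from the specified Bayesian network; the exact construction of the inference-side mask is not covered by this sketch.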
Related papers
- Unsupervised Representation Learning from Sparse Transformation Analysis [79.94858534887801]
We propose to learn representations from sequence data by factorizing the transformations of the latent variables into sparse components.
Input data are first encoded as distributions of latent activations and subsequently transformed using a probability flow model.
arXiv Detail & Related papers (2024-10-07T23:53:25Z)
- Amortizing intractable inference in large language models [56.92471123778389]
We use amortized Bayesian inference to sample from intractable posterior distributions.
We empirically demonstrate that this distribution-matching paradigm of LLM fine-tuning can serve as an effective alternative to maximum-likelihood training.
As an important application, we interpret chain-of-thought reasoning as a latent variable modeling problem.
arXiv Detail & Related papers (2023-10-06T16:36:08Z)
- Bayesian Flow Networks [4.585102332532472]
This paper introduces Bayesian Flow Networks (BFNs), a new class of generative model in which the parameters of a set of independent distributions are modified with Bayesian inference.
Starting from a simple prior and iteratively updating the two distributions yields a generative procedure similar to the reverse process of diffusion models.
BFNs achieve competitive log-likelihoods for image modelling on dynamically binarized MNIST and CIFAR-10, and outperform all known discrete diffusion models on the text8 character-level language modelling task.
arXiv Detail & Related papers (2023-08-14T09:56:35Z)
- Joint Bayesian Inference of Graphical Structure and Parameters with a Single Generative Flow Network [59.79008107609297]
We propose to approximate the joint posterior over the structure and parameters of a Bayesian network.
We use a single GFlowNet whose sampling policy follows a two-phase process.
Since the parameters are included in the posterior distribution, this leaves more flexibility for the local probability models.
arXiv Detail & Related papers (2023-05-30T19:16:44Z)
- Disentanglement via Latent Quantization [60.37109712033694]
In this work, we construct an inductive bias towards encoding to and decoding from an organized latent space.
We demonstrate the broad applicability of this approach by adding it to both basic data-reconstructing (vanilla autoencoder) and latent-reconstructing (InfoGAN) generative models.
arXiv Detail & Related papers (2023-05-28T06:30:29Z)
- GFlowNet-EM for learning compositional latent variable models [115.96660869630227]
A key tradeoff in modeling the posteriors over latents is between expressivity and tractable optimization.
We propose the use of GFlowNets, algorithms for sampling from an unnormalized density.
By training GFlowNets to sample from the posterior over latents, we take advantage of their strengths as amortized variational algorithms.
arXiv Detail & Related papers (2023-02-13T18:24:21Z)
- Understanding the Covariance Structure of Convolutional Filters [86.0964031294896]
Recent ViT-inspired convolutional networks such as ConvMixer and ConvNeXt use large-kernel depthwise convolutions with notable structure.
We first observe that such learned filters have highly-structured covariance matrices, and we find that covariances calculated from small networks may be used to effectively initialize a variety of larger networks.
arXiv Detail & Related papers (2022-10-07T15:59:13Z)
- Self-Reflective Variational Autoencoder [21.054722609128525]
Variational Autoencoder (VAE) is a powerful framework for learning latent variable generative models.
We introduce a solution, which we call self-reflective inference.
We empirically demonstrate the clear advantages of matching the variational posterior to the exact posterior.
arXiv Detail & Related papers (2020-07-10T05:05:26Z)