GFlowNet-EM for learning compositional latent variable models
- URL: http://arxiv.org/abs/2302.06576v2
- Date: Sat, 3 Jun 2023 18:02:08 GMT
- Title: GFlowNet-EM for learning compositional latent variable models
- Authors: Edward J. Hu, Nikolay Malkin, Moksh Jain, Katie Everett, Alexandros Graikos, Yoshua Bengio
- Abstract summary: A key tradeoff in modeling the posteriors over latents is between expressivity and tractable optimization.
We propose the use of GFlowNets, algorithms for sampling from an unnormalized density.
By training GFlowNets to sample from the posterior over latents, we take advantage of their strengths as amortized variational algorithms.
- Score: 115.96660869630227
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Latent variable models (LVMs) with discrete compositional latents are an
important but challenging setting due to a combinatorially large number of
possible configurations of the latents. A key tradeoff in modeling the
posteriors over latents is between expressivity and tractable optimization. For
algorithms based on expectation-maximization (EM), the E-step is often
intractable without restrictive approximations to the posterior. We propose the
use of GFlowNets, algorithms for sampling from an unnormalized density by
learning a stochastic policy for sequential construction of samples, for this
intractable E-step. By training GFlowNets to sample from the posterior over
latents, we take advantage of their strengths as amortized variational
inference algorithms for complex distributions over discrete structures. Our
approach, GFlowNet-EM, enables the training of expressive LVMs with discrete
compositional latents, as shown by experiments on non-context-free grammar
induction and on images using discrete variational autoencoders (VAEs) without
conditional independence enforced in the encoder.
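Below is a minimal PyTorch sketch of the alternating scheme the abstract describes. The interfaces (`model.log_joint`, `gfn.sample_trajectory`, `gfn.log_Z`) are hypothetical placeholders, and trajectory balance is one standard GFlowNet objective; this is an illustration of the idea, not the authors' released code.

```python
import torch

def trajectory_balance_loss(log_Z, log_pf, log_pb, log_reward):
    # Trajectory balance: drive Z * P_F(trajectory) toward R(z) * P_B(trajectory | z).
    return (log_Z + log_pf - log_reward - log_pb) ** 2

def gflownet_em_step(model, gfn, x, opt_model, opt_gfn):
    # E-step: fit the GFlowNet sampler to the unnormalized posterior,
    # using the joint density p(x, z) as the reward for a sampled latent z.
    z, log_pf, log_pb = gfn.sample_trajectory(x)   # sequential construction of z
    log_reward = model.log_joint(x, z).detach()    # reward = log p(x, z), model frozen
    loss_e = trajectory_balance_loss(gfn.log_Z(x), log_pf, log_pb, log_reward).mean()
    opt_gfn.zero_grad(); loss_e.backward(); opt_gfn.step()

    # M-step: update the generative model on latents drawn from the
    # amortized (approximate) posterior sampler, as in standard EM.
    with torch.no_grad():
        z, _, _ = gfn.sample_trajectory(x)
    loss_m = -model.log_joint(x, z).mean()
    opt_model.zero_grad(); loss_m.backward(); opt_model.step()
```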
Related papers
- On the Trajectory Regularity of ODE-based Diffusion Sampling [79.17334230868693]
Diffusion-based generative models use differential equations to establish a smooth connection between a complex data distribution and a tractable prior distribution.
In this paper, we identify several intriguing trajectory properties in the ODE-based sampling process of diffusion models.
arXiv Detail & Related papers (2024-05-18T15:59:41Z)
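For background on the entry above: a toy Euler integrator for the probability-flow ODE of a variance-preserving diffusion, the kind of deterministic data-to-prior connection the summary refers to. The noise schedule and `score_fn` are illustrative assumptions, not this paper's sampler.

```python
import torch

def ode_sample(score_fn, shape, n_steps=50, beta=lambda t: 0.1 + 19.9 * t):
    # Integrate the probability-flow ODE of a VP diffusion from t=1 to t~0:
    #   dx/dt = -0.5 * beta(t) * (x + score(x, t))
    x = torch.randn(shape)                      # start from the tractable prior
    ts = torch.linspace(1.0, 1e-3, n_steps + 1)
    for i in range(n_steps):
        t, dt = ts[i], ts[i + 1] - ts[i]        # dt < 0: we integrate backward in t
        drift = -0.5 * beta(t) * (x + score_fn(x, t))
        x = x + drift * dt                      # Euler step along the smooth trajectory
    return x
```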
- Amortizing intractable inference in large language models [56.92471123778389]
We use amortized Bayesian inference to sample from intractable posterior distributions.
We empirically demonstrate that this distribution-matching paradigm of LLM fine-tuning can serve as an effective alternative to maximum-likelihood training.
As an important application, we interpret chain-of-thought reasoning as a latent variable modeling problem.
arXiv Detail & Related papers (2023-10-06T16:36:08Z)
- Latent Variable Representation for Reinforcement Learning [131.03944557979725]
It remains unclear theoretically and empirically how latent variable models may facilitate learning, planning, and exploration to improve the sample efficiency of model-based reinforcement learning.
We provide a representation view of the latent variable models for state-action value functions, which allows both a tractable variational learning algorithm and an effective implementation of the optimism/pessimism principle.
In particular, we propose a computationally efficient planning algorithm with UCB exploration by incorporating kernel embeddings of latent variable models.
arXiv Detail & Related papers (2022-12-17T00:26:31Z)
- SIReN-VAE: Leveraging Flows and Amortized Inference for Bayesian Networks [2.8597160727750564]
This work explores incorporating arbitrary dependency structures, as specified by Bayesian networks, into VAEs.
This is achieved by extending both the prior and inference network with graphical residual flows.
We compare our model's performance on several synthetic datasets and show its potential in data-sparse settings.
arXiv Detail & Related papers (2022-04-23T10:31:08Z)
- Direct Evolutionary Optimization of Variational Autoencoders With Binary Latents [0.0]
We show that it is possible to train Variational Autoencoders (VAEs) with discrete latents without sampling-based approximations and without reparameterization.
In contrast to large supervised networks, the VAEs investigated here can, for example, denoise a single image without prior training on clean data or on large image datasets.
arXiv Detail & Related papers (2020-11-27T12:42:12Z)
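A toy sketch of the sampling-free, direct optimization of binary latents that the entry above describes: per datapoint, a small population of latent states is evolved by bit-flip mutation and selection on the log-joint. The variation operators and the `log_joint` interface are our assumptions, not the paper's exact algorithm.

```python
import torch

def evolve_latents(log_joint, x, states, n_generations=10, n_children=4):
    # states: (P, D) 0/1 integer tensor, the current latent population for x.
    for _ in range(n_generations):
        parents = states[torch.randint(len(states), (n_children,))]
        flips = torch.zeros_like(parents)
        flips[torch.arange(n_children),
              torch.randint(parents.shape[1], (n_children,))] = 1
        children = (parents + flips) % 2          # flip one random bit per child
        pool = torch.unique(torch.cat([states, children]), dim=0)
        scores = log_joint(x, pool)               # log p(x, z) for every candidate
        keep = scores.topk(min(len(states), len(pool))).indices
        states = pool[keep]                       # keep the highest-scoring states
    return states
```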
- Autoregressive Score Matching [113.4502004812927]
We propose autoregressive conditional score models (AR-CSM), where we parameterize the joint distribution in terms of the derivatives of univariate log-conditionals (scores).
For AR-CSM models, the divergence between data and model distributions can be computed and optimized efficiently, requiring no expensive sampling or adversarial training.
We show with extensive experimental results that it can be applied to density estimation on synthetic data, image generation, image denoising, and training latent variable models with implicit encoders.
arXiv Detail & Related papers (2020-10-24T07:01:24Z)
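In symbols, the parameterisation described in the entry above amounts to the following (notation ours, not the paper's):

```latex
% Autoregressive factorization of the joint, and the univariate
% conditional scores that AR-CSM parameterizes directly:
\log p(x) \;=\; \sum_{i=1}^{D} \log p(x_i \mid x_{<i}),
\qquad
s_i(x_{\le i}) \;=\; \frac{\partial}{\partial x_i} \log p(x_i \mid x_{<i}).
```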
- Relaxed-Responsibility Hierarchical Discrete VAEs [3.976291254896486]
We introduce Relaxed-Responsibility Vector-Quantisation, a novel way to parameterise discrete latent variables.
We achieve state-of-the-art bits-per-dim results for various standard datasets.
arXiv Detail & Related papers (2020-07-14T19:10:05Z)
- Efficient Marginalization of Discrete and Structured Latent Variables via Sparsity [26.518803984578867]
Training neural network models with discrete (categorical or structured) latent variables can be computationally challenging.
One typically resorts to sampling-based approximations of the true marginal.
We propose a new training strategy which replaces these estimators by an exact yet efficient marginalization.
arXiv Detail & Related papers (2020-07-03T19:36:35Z)
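A minimal sketch of the sparsity idea in the entry above: if the posterior over a categorical latent is made sparse (here via sparsemax, one concrete choice; the paper's estimators may differ), the expectation in the training loss can be summed exactly over the few latents with nonzero probability, with no sampling.

```python
import torch

def sparsemax(logits):
    # Project logits onto the probability simplex (Martins & Astudillo, 2016);
    # unlike softmax, the result can assign exactly zero probability.
    z, _ = torch.sort(logits, descending=True)
    k = torch.arange(1, len(z) + 1, dtype=logits.dtype)
    cumsum = torch.cumsum(z, dim=0)
    support = 1 + k * z > cumsum                 # dimensions kept in the support
    k_max = support.sum()
    tau = (cumsum[k_max - 1] - 1) / k_max
    return torch.clamp(logits - tau, min=0.0)

def expected_loss(logits, loss_per_latent):
    # Exact E_{z ~ p}[loss(z)]: enumerate only the z with p(z) > 0,
    # giving exact gradients instead of a sampling-based estimate.
    p = sparsemax(logits)
    support = p.nonzero(as_tuple=True)[0]
    return sum(p[z] * loss_per_latent(z) for z in support)
```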
- Improve Variational Autoencoder for Text Generation with Discrete Latent Bottleneck [52.08901549360262]
Variational autoencoders (VAEs) are essential tools in end-to-end representation learning.
VAEs with a strong auto-regressive decoder tend to ignore the latent variables.
We propose a principled approach to enforce an implicit latent feature matching in a more compact latent space.
arXiv Detail & Related papers (2020-04-22T14:41:37Z)