A Variational Perspective on Generative Flow Networks
- URL: http://arxiv.org/abs/2210.07992v1
- Date: Fri, 14 Oct 2022 17:45:59 GMT
- Title: A Variational Perspective on Generative Flow Networks
- Authors: Heiko Zimmermann, Fredrik Lindsten, Jan-Willem van de Meent, Christian
A. Naesseth
- Abstract summary: Generative flow networks (GFNs) are models for sequential sampling of composite objects.
We define variational objectives for GFNs in terms of the Kullback-Leibler (KL) divergences between the forward and backward distributions.
- Score: 21.97829447881589
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative flow networks (GFNs) are a class of models for sequential sampling
of composite objects, which approximate a target distribution that is defined
in terms of an energy function or a reward. GFNs are typically trained using a
flow matching or trajectory balance objective, which matches forward and
backward transition models over trajectories. In this work, we define
variational objectives for GFNs in terms of the Kullback-Leibler (KL)
divergences between the forward and backward distributions. We show that
variational inference in GFNs is equivalent to minimizing the trajectory
balance objective when sampling trajectories from the forward model. We
generalize this approach by optimizing a convex combination of the reverse and
forward KL divergences. This insight suggests that variational inference methods can
serve as a means to define a more general family of objectives for training
generative flow networks, for example by incorporating control variates, which
are commonly used in variational inference, to reduce the variance of the
gradients of the trajectory balance objective. We evaluate our findings and the
performance of the proposed variational objective numerically by comparing it
to the trajectory balance objective on two synthetic tasks.
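For reference, the following is a minimal sketch, written in the standard notation of the GFlowNet literature rather than copied from the paper, of the two objects the abstract connects: the trajectory balance objective for a complete trajectory, and the reverse KL divergence between the forward and backward trajectory distributions, together with the convex combination of reverse and forward KL mentioned above.

```latex
% Trajectory balance objective for a trajectory tau = (s_0 -> s_1 -> ... -> s_n = x),
% with forward policy P_F, backward policy P_B, reward R, and learned normalizer Z_theta:
\mathcal{L}_{\mathrm{TB}}(\tau) =
  \left( \log \frac{Z_\theta \prod_{t=0}^{n-1} P_F(s_{t+1} \mid s_t)}
                   {R(x) \prod_{t=0}^{n-1} P_B(s_t \mid s_{t+1})} \right)^{2}

% Reverse KL between the forward trajectory distribution q (induced by P_F) and the
% backward trajectory distribution p (induced by R and P_B), and the convex combination
% of reverse and forward KL with weight lambda in [0, 1]:
D_{\mathrm{KL}}(q \,\|\, p) = \mathbb{E}_{\tau \sim q}\left[ \log \frac{q(\tau)}{p(\tau)} \right],
\qquad
\mathcal{L}_{\lambda} = \lambda \, D_{\mathrm{KL}}(q \,\|\, p) + (1 - \lambda) \, D_{\mathrm{KL}}(p \,\|\, q)
```

A short code sketch of a control-variate-adjusted gradient estimator for the reverse KL, illustrating the variance-reduction remark above, is given after the related-papers list below.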
Related papers
- Training Neural Samplers with Reverse Diffusive KL Divergence [36.549460449020906]
Training generative models to sample from unnormalized density functions is an important and challenging task in machine learning.
Traditional training methods often rely on the reverse Kullback-Leibler (KL) divergence due to its tractability.
We propose to minimize the reverse KL along diffusion trajectories of both model and target densities.
We demonstrate that our method enhances sampling performance across various Boltzmann distributions.
arXiv Detail & Related papers (2024-10-16T11:08:02Z) - On Divergence Measures for Training GFlowNets [3.7277730514654555]
Generative Flow Networks (GFlowNets) are amortized inference models designed to sample from unnormalized distributions over composable objects.
Traditionally, the training procedure for GFlowNets seeks to minimize the expected log-squared difference between a proposal (forward policy) and a target (backward policy) distribution.
We review four divergence measures, namely the Rényi-$\alpha$ and Tsallis-$\alpha$ divergences and the reverse and forward KL divergences, and design statistically efficient estimators for their gradients in the context of training GFlowNets.
arXiv Detail & Related papers (2024-10-12T03:46:52Z) - NETS: A Non-Equilibrium Transport Sampler [15.58993313831079]
We propose an algorithm, termed the Non-Equilibrium Transport Sampler (NETS).
NETS can be viewed as a variant of annealed importance sampling (AIS) based on Jarzynski's equality.
We show that this drift is the minimizer of a variety of objective functions, which can all be estimated in an unbiased fashion.
arXiv Detail & Related papers (2024-10-03T17:35:38Z) - Hallmarks of Optimization Trajectories in Neural Networks: Directional Exploration and Redundancy [75.15685966213832]
We analyze the rich directional structure of optimization trajectories represented by their pointwise parameters.
We show that training only the scalar batchnorm parameters partway into training matches the performance of training the entire network.
arXiv Detail & Related papers (2024-03-12T07:32:47Z) - Variational Density Propagation Continual Learning [0.0]
Deep Neural Networks (DNNs) deployed to the real world are regularly subject to out-of-distribution (OoD) data.
This paper proposes a framework for adapting to data distribution drift modeled by benchmark Continual Learning datasets.
arXiv Detail & Related papers (2023-08-22T21:51:39Z) - Adaptive Annealed Importance Sampling with Constant Rate Progress [68.8204255655161]
Annealed Importance Sampling (AIS) synthesizes weighted samples from an intractable distribution.
We propose the Constant Rate AIS algorithm and its efficient implementation for $\alpha$-divergences.
arXiv Detail & Related papers (2023-06-27T08:15:28Z) - Learning GFlowNets from partial episodes for improved convergence and
stability [56.99229746004125]
Generative flow networks (GFlowNets) are algorithms for training a sequential sampler of discrete objects under an unnormalized target density.
Existing training objectives for GFlowNets are either local to states or transitions, or propagate a reward signal over an entire sampling trajectory.
Inspired by the TD($\lambda$) algorithm in reinforcement learning, we introduce subtrajectory balance or SubTB($\lambda$), a GFlowNet training objective that can learn from partial action subsequences of varying lengths.
arXiv Detail & Related papers (2022-09-26T15:44:24Z) - Trajectory balance: Improved credit assignment in GFlowNets [63.687669765579585]
We find previously proposed learning objectives for GFlowNets, flow matching and detailed balance, to be prone to inefficient credit propagation across long action sequences.
We propose a new learning objective for GFlowNets, trajectory balance, as a more efficient alternative to previously used objectives.
In experiments on four distinct domains, we empirically demonstrate the benefits of the trajectory balance objective for GFlowNet convergence, diversity of generated samples, and robustness to long action sequences and large action spaces.
arXiv Detail & Related papers (2022-01-31T14:07:49Z) - Probabilistic Circuits for Variational Inference in Discrete Graphical
Models [101.28528515775842]
Inference in discrete graphical models with variational methods is difficult.
Many sampling-based methods have been proposed for estimating the Evidence Lower Bound (ELBO).
We propose a new approach that leverages the tractability of probabilistic circuit models, such as Sum Product Networks (SPNs).
We show that selective-SPNs are suitable as an expressive variational distribution, and prove that when the log-density of the target model is a polynomial the corresponding ELBO can be computed analytically.
arXiv Detail & Related papers (2020-10-22T05:04:38Z) - Tackling the Objective Inconsistency Problem in Heterogeneous Federated
Optimization [93.78811018928583]
This paper provides a framework to analyze the convergence of federated heterogeneous optimization algorithms.
We propose FedNova, a normalized averaging method that eliminates objective inconsistency while preserving fast error convergence.
arXiv Detail & Related papers (2020-07-15T05:01:23Z)
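To illustrate the control-variate remark in the abstract above, here is a minimal, self-contained PyTorch sketch of a REINFORCE-style gradient estimator for the reverse KL over trajectories, with a batch-mean baseline as a simple control variate. It is an illustration under stated assumptions, not the authors' implementation; the function name and tensor layout are hypothetical.

```python
import torch

# Illustrative sketch only; names and tensor layout are hypothetical, not from the paper's code.
def reverse_kl_loss_with_baseline(log_pf: torch.Tensor,
                                  log_pb_r: torch.Tensor) -> torch.Tensor:
    """Surrogate loss whose gradient estimates the reverse KL gradient.

    log_pf:   [B] log-probabilities of sampled trajectories under the forward
              policy; must be differentiable w.r.t. the policy parameters.
    log_pb_r: [B] log(R(x) * prod_t P_B(s_t | s_{t+1})) for the same trajectories,
              treated as fixed (no gradient) in this sketch.
    """
    # Per-trajectory log-ratio; its mean estimates the reverse KL up to the
    # unknown log normalizer, which is constant and does not affect gradients.
    log_ratio = log_pf - log_pb_r
    # Batch-mean baseline: subtracting a constant from the score-function
    # estimator keeps the gradient unbiased but can reduce its variance.
    baseline = log_ratio.detach().mean()
    # Score-function (REINFORCE) surrogate for the gradient of E_q[log_ratio].
    surrogate = (log_ratio.detach() - baseline) * log_pf
    return surrogate.mean()

# Example usage (shapes only; the log-probabilities would come from a GFN sampler):
# loss = reverse_kl_loss_with_baseline(log_pf, log_reward + log_pb)
# loss.backward()
```

The same baseline idea can be applied to the squared log-ratio of the trajectory balance objective, which is the setting in which the abstract suggests variance-reduced gradients.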