Controlling Exploration-Exploitation in GFlowNets via Markov Chain Perspectives
- URL: http://arxiv.org/abs/2602.01749v2
- Date: Tue, 03 Feb 2026 06:13:51 GMT
- Title: Controlling Exploration-Exploitation in GFlowNets via Markov Chain Perspectives
- Authors: Lin Chen, Samuel Drapeau, Fanghao Shao, Xuekai Zhu, Bo Xue, Yunchong Song, Mathieu Laurière, Zhouhan Lin,
- Abstract summary: Generative Flow Network (GFlowNet) objectives implicitly fix an equal mixing of forward and backward policies. We propose $\alpha$-GFNs, which generalize the mixing via a tunable parameter $\alpha$. This enables direct control over exploration-exploitation dynamics to enhance mode discovery capabilities, while ensuring convergence to unique flows.
- Score: 23.095168795582126
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generative Flow Network (GFlowNet) objectives implicitly fix an equal mixing of forward and backward policies, potentially constraining the exploration-exploitation trade-off during training. By further exploring the link between GFlowNets and Markov chains, we establish an equivalence between GFlowNet objectives and Markov chain reversibility, thereby revealing the origin of such constraints, and provide a framework for adapting Markov chain properties to GFlowNets. Building on these theoretical findings, we propose $α$-GFNs, which generalize the mixing via a tunable parameter $α$. This generalization enables direct control over exploration-exploitation dynamics to enhance mode discovery capabilities, while ensuring convergence to unique flows. Across various benchmarks, including Set, Bit Sequence, and Molecule Generation, $α$-GFN objectives consistently outperform previous GFlowNet objectives, achieving up to a $10 \times$ increase in the number of discovered modes.
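The claimed equivalence between GFlowNet objectives and Markov chain reversibility can be illustrated in standard notation (state flow $F$, forward policy $P_F$, backward policy $P_B$). The $\alpha$-weighted mixture shown last is only one plausible reading of "generalize the mixing" sketched here for intuition; the exact $\alpha$-GFN objective is defined in the paper itself.

```latex
% A Markov chain with stationary distribution \pi is reversible iff
\pi(s)\,P(s' \mid s) = \pi(s')\,P(s \mid s')
% The GFlowNet detailed balance condition has the same form, with the
% state flow F playing the role of \pi:
F(s)\,P_F(s' \mid s) = F(s')\,P_B(s \mid s')
% Hypothetical tunable mixing, replacing the implicit equal mixture
% \tfrac{1}{2}(P_F + P_B) of forward and backward kernels:
P_\alpha = \alpha\,P_F + (1 - \alpha)\,P_B, \qquad \alpha \in (0, 1)
```

Under this reading, $\alpha$ biases the chain toward the forward (exploratory) or backward (exploitative) kernel, which is consistent with the abstract's claim of direct control over the exploration-exploitation trade-off.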
Related papers
- A Theory of Multi-Agent Generative Flow Networks [65.53605277612444]
We propose a theoretical framework for multi-agent generative flow networks (MA-GFlowNets). MA-GFlowNets can be applied to multiple agents to generate objects collaboratively through a series of joint actions. Joint Flow training is based on a local-global principle that allows a collection of (local) GFNs to be trained as a unique (global) GFN.
arXiv Detail & Related papers (2025-09-24T04:01:21Z)
- Pessimistic Backward Policy for GFlowNets [40.00805723326561]
We study Generative Flow Networks (GFlowNets), which learn to sample objects proportionally to a given reward function.
In this work, we observe that GFlowNets tend to under-exploit high-reward objects due to training on an insufficient number of trajectories.
We propose a pessimistic backward policy for GFlowNets, which maximizes the observed flow to align closely with the true reward for the object.
arXiv Detail & Related papers (2024-05-25T02:30:46Z)
- Generative Flow Networks: a Markov Chain Perspective [93.9910025411313]
We propose a new perspective for GFlowNets using Markov chains, showing a unifying view for GFlowNets regardless of the nature of the state space.
Positioning GFlowNets under the same theoretical framework as MCMC methods also allows us to identify the similarities between both frameworks.
arXiv Detail & Related papers (2023-07-04T01:28:02Z)
- Stochastic Generative Flow Networks [89.34644133901647]
Generative Flow Networks (or GFlowNets) learn to sample complex structures through the lens of "inference as control"
Existing GFlowNets can be applied only to deterministic environments, and fail in more general tasks with stochastic dynamics.
This paper introduces Stochastic GFlowNets, a new algorithm that extends GFlowNets to stochastic environments.
arXiv Detail & Related papers (2023-02-19T03:19:40Z)
- A theory of continuous generative flow networks [104.93913776866195]
Generative flow networks (GFlowNets) are amortized variational inference algorithms that are trained to sample from unnormalized target distributions.
We present a theory for generalized GFlowNets, which encompasses both existing discrete GFlowNets and ones with continuous or hybrid state spaces.
arXiv Detail & Related papers (2023-01-30T00:37:56Z)
- Learning GFlowNets from partial episodes for improved convergence and stability [56.99229746004125]
Generative flow networks (GFlowNets) are algorithms for training a sequential sampler of discrete objects under an unnormalized target density.
Existing training objectives for GFlowNets are either local to states or transitions, or propagate a reward signal over an entire sampling trajectory.
Inspired by the TD($\lambda$) algorithm in reinforcement learning, we introduce subtrajectory balance, or SubTB($\lambda$), a GFlowNet training objective that can learn from partial action subsequences of varying lengths.
arXiv Detail & Related papers (2022-09-26T15:44:24Z)
- Trajectory balance: Improved credit assignment in GFlowNets [63.687669765579585]
We find previously proposed learning objectives for GFlowNets, flow matching and detailed balance, to be prone to inefficient credit propagation across long action sequences.
We propose a new learning objective for GFlowNets, trajectory balance, as a more efficient alternative to previously used objectives.
In experiments on four distinct domains, we empirically demonstrate the benefits of the trajectory balance objective for GFlowNet convergence, diversity of generated samples, and robustness to long action sequences and large action spaces.
arXiv Detail & Related papers (2022-01-31T14:07:49Z)
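For context, the trajectory balance objective described above can be sketched as follows. This is a minimal illustrative implementation of the standard TB loss for a single trajectory; the function name and list-based inputs are assumptions for clarity, not code from any of the listed papers.

```python
import math

def trajectory_balance_loss(log_Z, log_pf, log_pb, log_reward):
    """Squared trajectory balance residual for one complete trajectory.

    log_Z      : scalar, log of the learned partition-function estimate
    log_pf     : list of log P_F(s_{t+1} | s_t) along the trajectory
    log_pb     : list of log P_B(s_t | s_{t+1}) along the trajectory
    log_reward : scalar, log R(x) for the terminal object x
    """
    residual = log_Z + sum(log_pf) - log_reward - sum(log_pb)
    return residual ** 2

# Toy usage: a perfectly balanced trajectory yields (near-)zero loss.
log_pf = [math.log(0.5), math.log(0.5)]  # forward log-probs
log_pb = [0.0, 0.0]                      # backward log-probs (tree DAG)
log_Z = math.log(4.0)                    # Z = 4
log_reward = 0.0                         # R(x) = 1
loss = trajectory_balance_loss(log_Z, log_pf, log_pb, log_reward)
```

Because the residual is computed over the whole trajectory rather than per transition, the reward signal reaches every action at once, which is the credit-assignment advantage the abstract describes.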
This list is automatically generated from the titles and abstracts of the papers in this site.