Related papers: Avoid What You Know: Divergent Trajectory Balance for GFlowNets

URL: http://arxiv.org/abs/2602.17827v1
Date: Thu, 19 Feb 2026 20:47:28 GMT
Title: Avoid What You Know: Divergent Trajectory Balance for GFlowNets
Authors: Pedro Dall'Antonia, Tiago da Silva, Daniel Csillag, Salem Lahlou, Diego Mesquita
Abstract summary: We propose an exploration GFlowNet explicitly trained to search for high-reward states in regions underexplored by the canonical GFlowNet. We show that ACE significantly improves upon prior work in terms of approximation accuracy to the target distribution and discovery rate of diverse high-reward states.
Score: 14.524997986396713
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Generative Flow Networks (GFlowNets) are a flexible family of amortized samplers trained to generate discrete and compositional objects with probability proportional to a reward function. However, learning efficiency is constrained by the model's ability to rapidly explore diverse high-probability regions during training. To mitigate this issue, recent works have focused on incentivizing the exploration of unvisited and valuable states via curiosity-driven search and self-supervised random network distillation, which tend to waste samples on already well-approximated regions of the state space. In this context, we propose Adaptive Complementary Exploration (ACE), a principled algorithm for the effective exploration of novel and high-probability regions when learning GFlowNets. To achieve this, ACE introduces an exploration GFlowNet explicitly trained to search for high-reward states in regions underexplored by the canonical GFlowNet, which learns to sample from the target distribution. Through extensive experiments, we show that ACE significantly improves upon prior work in terms of approximation accuracy to the target distribution and discovery rate of diverse high-reward states.
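
As a rough illustration of what "probability proportional to a reward function" means in training terms, the sketch below computes the standard trajectory balance residual on a toy problem. It is not the divergent variant or the ACE algorithm proposed in the paper, whose details are not given in this listing. The two-object environment, the reward values, and the names `trajectory_balance_loss` and `toy_loss` are assumptions made up for the example.

```python
import math

def trajectory_balance_loss(log_z, log_pf_steps, log_pb_steps, log_reward):
    """Squared trajectory balance residual for one complete trajectory.

    residual = log Z + sum(log P_F) - log R(x) - sum(log P_B)
    """
    residual = log_z + sum(log_pf_steps) - log_reward - sum(log_pb_steps)
    return residual ** 2

# Hypothetical toy environment: a single step from the initial state
# leads to one of two terminal objects.
rewards = {"a": 1.0, "b": 3.0}
z = sum(rewards.values())  # true partition function of the toy problem

def toy_loss(x, forward_probs, log_z):
    """Trajectory balance loss for the one-step trajectory ending in x."""
    return trajectory_balance_loss(
        log_z=log_z,
        log_pf_steps=[math.log(forward_probs[x])],
        log_pb_steps=[0.0],  # each object has a single parent, so P_B = 1
        log_reward=math.log(rewards[x]),
    )

# A policy that samples proportionally to reward, and one that does not.
perfect = {x: r / z for x, r in rewards.items()}
uniform = {"a": 0.5, "b": 0.5}

loss_perfect = toy_loss("a", perfect, math.log(z))
loss_uniform = toy_loss("b", uniform, math.log(z))
```

Under these assumptions the residual vanishes for the reward-proportional policy and is strictly positive for the uniform one, which is the signal a sampler trained with this kind of objective would follow.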

Related papers

MG2FlowNet: Accelerating High-Reward Sample Generation via Enhanced MCTS and Greediness Control [19.49552596070782]
Generative Flow Networks (GFlowNets) have emerged as a powerful tool for generating diverse and high-reward structured objects by learning to sample from a distribution proportional to a given reward function. In this work, we integrate an enhanced Monte Carlo Tree Search (MCTS) into the GFlowNets sampling process to balance exploration and exploitation adaptively. Our method can not only accelerate the speed of discovering high-reward regions but also continuously generate high-reward samples, while preserving the diversity of the generative distribution.
arXiv Detail & Related papers (2025-10-01T12:09:04Z)
Improved Exploration in GFlownets via Enhanced Epistemic Neural Networks [3.754610894453276]
Efficiently identifying the right trajectories for training remains an open problem in GFlowNets. Our proposed algorithm, ENN-GFN-Enhanced, is compared to the baseline method in GFlowNets and evaluated in grid environments.
arXiv Detail & Related papers (2025-06-19T13:39:30Z)
Loss-Guided Auxiliary Agents for Overcoming Mode Collapse in GFlowNets [22.653875450786444]
Loss-Guided GFlowNets (LGGFN) is a novel approach where an auxiliary GFlowNet's exploration is directly driven by the main GFlowNet's training loss. This targeted exploration significantly accelerates the discovery of diverse, high-reward samples.
arXiv Detail & Related papers (2025-05-21T08:27:10Z)
Adaptive teachers for amortized samplers [76.88721198565861]
We propose an adaptive training distribution (the teacher) to guide the training of the primary amortized sampler (the student). We validate the effectiveness of this approach in a synthetic environment designed to present an exploration challenge.
arXiv Detail & Related papers (2024-10-02T11:33:13Z)
On Generalization for Generative Flow Networks [54.20924253330039]
Generative Flow Networks (GFlowNets) have emerged as an innovative learning paradigm designed to address the challenge of sampling from an unnormalized probability distribution. This paper attempts to formalize generalization in the context of GFlowNets, to link generalization with stability, and also to design experiments that assess the capacity of these models to uncover unseen parts of the reward function.
arXiv Detail & Related papers (2024-07-03T13:42:21Z)
Pre-Training and Fine-Tuning Generative Flow Networks [61.90529626590415]
We introduce a novel approach for reward-free pre-training of GFlowNets. By framing the training as a self-supervised problem, we propose an outcome-conditioned GFlowNet that learns to explore the candidate space. We show that the pre-trained OC-GFN model can allow for a direct extraction of a policy capable of sampling from any new reward functions in downstream tasks.
arXiv Detail & Related papers (2023-10-05T09:53:22Z)
Local Search GFlowNets [85.0053493167887]
Generative Flow Networks (GFlowNets) are amortized sampling methods that learn a distribution over discrete objects proportional to their rewards. GFlowNets exhibit a remarkable ability to generate diverse samples, yet occasionally struggle to consistently produce samples with high rewards due to over-exploration of a wide sample space. This paper proposes to train GFlowNets with local search, which focuses on exploiting high-reward regions of the sample space to resolve this issue.
arXiv Detail & Related papers (2023-10-04T10:27:17Z)
Generative Augmented Flow Networks [88.50647244459009]
We propose Generative Augmented Flow Networks (GAFlowNets) to incorporate intermediate rewards into GFlowNets. GAFlowNets can leverage edge-based and state-based intrinsic rewards in a joint way to improve exploration.
arXiv Detail & Related papers (2022-10-07T03:33:56Z)
Learning GFlowNets from partial episodes for improved convergence and stability [56.99229746004125]
Generative flow networks (GFlowNets) are algorithms for training a sequential sampler of discrete objects under an unnormalized target density. Existing training objectives for GFlowNets are either local to states or transitions, or propagate a reward signal over an entire sampling trajectory. Inspired by the TD($\lambda$) algorithm in reinforcement learning, we introduce subtrajectory balance or SubTB($\lambda$), a GFlowNet training objective that can learn from partial action subsequences of varying lengths.
arXiv Detail & Related papers (2022-09-26T15:44:24Z)

This list is automatically generated from the titles and abstracts of the papers on this site.