MetaGFN: Exploring Distant Modes with Adapted Metadynamics for Continuous GFlowNets
- URL: http://arxiv.org/abs/2408.15905v2
- Date: Sun, 02 Mar 2025 20:30:28 GMT
- Title: MetaGFN: Exploring Distant Modes with Adapted Metadynamics for Continuous GFlowNets
- Authors: Dominic Phillips, Flaviu Cipcigan
- Abstract summary: We introduce Adapted Metadynamics, a variant of metadynamics that can be applied to arbitrary black-box reward functions on continuous domains. We show several continuous domains where the resulting algorithm, MetaGFN, accelerates convergence to the target distribution and discovers more distant reward modes than previous off-policy exploration strategies used for GFlowNets.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generative Flow Networks (GFlowNets) are a class of generative models that sample objects in proportion to a specified reward function through a learned policy. They can be trained either on-policy or off-policy, needing a balance between exploration and exploitation for fast convergence to a target distribution. While exploration strategies for discrete GFlowNets have been studied, exploration in the continuous case remains to be investigated, despite the potential for novel exploration algorithms due to the local connectedness of continuous domains. Here, we introduce Adapted Metadynamics, a variant of metadynamics that can be applied to arbitrary black-box reward functions on continuous domains. We use Adapted Metadynamics as an exploration strategy for continuous GFlowNets. We show several continuous domains where the resulting algorithm, MetaGFN, accelerates convergence to the target distribution and discovers more distant reward modes than previous off-policy exploration strategies used for GFlowNets.
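The core metadynamics idea underlying this approach can be illustrated in one dimension: deposit a Gaussian bias at each visited point so that previously explored regions become unfavourable, pushing the sampler over barriers toward distant reward modes. The sketch below is a minimal, hypothetical Metropolis-style implementation of that idea; it is not the paper's Adapted Metadynamics algorithm, and all function names and parameters are illustrative.

```python
import math
import random

def metadynamics_explore(reward, x0, n_steps=2000, step=0.25,
                         width=0.5, height=0.5, seed=0):
    """Hypothetical 1-D metadynamics-style exploration of a black-box reward.

    A Gaussian bias is deposited at every visited point; the walker then
    takes Metropolis steps on the biased energy -log(reward) + bias, so
    regions it has already covered grow less favourable over time and it
    is driven toward unexplored (possibly distant) modes.
    """
    rng = random.Random(seed)
    deposits = []  # centres of the deposited Gaussian bias terms

    def bias(x):
        return sum(height * math.exp(-(x - c) ** 2 / (2 * width ** 2))
                   for c in deposits)

    def energy(x):
        # small constant avoids log(0) for near-zero rewards
        return -math.log(reward(x) + 1e-12) + bias(x)

    x, visited = x0, [x0]
    for _ in range(n_steps):
        y = x + rng.gauss(0.0, step)
        # Metropolis acceptance on the biased energy surface
        if rng.random() < math.exp(min(0.0, energy(x) - energy(y))):
            x = y
        deposits.append(x)
        visited.append(x)
    return visited

# Toy reward with two modes, at x = 0 and x = 6; an unbiased local sampler
# started at x = 0 tends to stay in the first basin.
reward = lambda x: math.exp(-x ** 2) + math.exp(-(x - 6.0) ** 2)
path = metadynamics_explore(reward, x0=0.0)
found_distant = any(abs(x - 6.0) < 1.0 for x in path)
```

The accumulated bias plays the role of a "memory" of visited states, which is the property that makes metadynamics attractive as an off-policy exploration strategy on locally connected continuous domains.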
Related papers
- Loss-Guided Auxiliary Agents for Overcoming Mode Collapse in GFlowNets [0.7461398179934781]
Loss-Guided GFlowNets (LGGFN) is a novel approach where an auxiliary GFlowNet's exploration is driven by the main GFlowNet's training loss. LGGFN consistently enhances exploration efficiency and sample diversity compared to baselines.
arXiv Detail & Related papers (2025-05-21T08:27:10Z)
- On Generalization for Generative Flow Networks [54.20924253330039]
Generative Flow Networks (GFlowNets) have emerged as an innovative learning paradigm designed to address the challenge of sampling from an unnormalized probability distribution.
This paper attempts to formalize generalization in the context of GFlowNets, to link generalization with stability, and also to design experiments that assess the capacity of these models to uncover unseen parts of the reward function.
arXiv Detail & Related papers (2024-07-03T13:42:21Z)
- Pessimistic Backward Policy for GFlowNets [40.00805723326561]
We study Generative Flow Networks (GFlowNets), which learn to sample objects proportionally to a given reward function.
In this work, we observe that GFlowNets tend to under-exploit the high-reward objects due to training on an insufficient number of trajectories.
We propose a pessimistic backward policy for GFlowNets, which maximizes the observed flow to align closely with the true reward for the object.
arXiv Detail & Related papers (2024-05-25T02:30:46Z)
- Thompson sampling for improved exploration in GFlowNets [75.89693358516944]
Generative flow networks (GFlowNets) are amortized variational inference algorithms that treat sampling from a distribution over compositional objects as a sequential decision-making problem with a learnable action policy.
We show in two domains that TS-GFN yields improved exploration and thus faster convergence to the target distribution than the off-policy exploration strategies used in past work.
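The essence of Thompson-sampling exploration is to sample one policy from an ensemble (approximating a posterior) and commit to it for a whole trajectory, so exploration reflects uncertainty across policies rather than per-step noise. The following is a minimal, hypothetical sketch of that pattern; the interface (`policy_heads`, `env_step`) is illustrative and not the TS-GFN implementation.

```python
import random

def thompson_rollout(policy_heads, env_step, s0, horizon, rng):
    """Hypothetical Thompson-sampling rollout: draw one policy head from
    the ensemble (a posterior sample) and follow it for the full
    trajectory, rather than adding noise at every step."""
    head = rng.choice(policy_heads)  # one sample from the "posterior"
    s, traj = s0, [s0]
    for _ in range(horizon):
        a = head(s, rng)             # action from the sampled head
        s = env_step(s, a)
        traj.append(s)
    return traj

# Toy usage: two deterministic heads on an integer random walk.
rng = random.Random(0)
heads = [lambda s, r: +1, lambda s, r: -1]
traj = thompson_rollout(heads, lambda s, a: s + a, 0, 5, rng)
```

Because each rollout is internally consistent with a single sampled policy, the resulting trajectories are more "deeply" exploratory than those produced by independent per-step randomization.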
arXiv Detail & Related papers (2023-06-30T14:19:44Z)
- Meta Generative Flow Networks with Personalization for Task-Specific Adaptation [8.830531142309733]
Multi-task reinforcement learning and meta-reinforcement learning tend to focus on tasks with higher rewards and more frequent occurrences.
GFlowNets can be integrated into meta-learning algorithms (GFlowMeta) by leveraging the advantages of GFlowNets on tasks with sparse rewards.
This paper proposes a personalized approach named pGFlowMeta, which combines task-specific personalized policies with a meta policy.
arXiv Detail & Related papers (2023-06-16T10:18:38Z)
- Stochastic Generative Flow Networks [89.34644133901647]
Generative Flow Networks (or GFlowNets) learn to sample complex structures through the lens of "inference as control".
Existing GFlowNets can be applied only to deterministic environments, and fail in more general tasks with stochastic dynamics.
This paper introduces Stochastic GFlowNets, a new algorithm that extends GFlowNets to stochastic environments.
arXiv Detail & Related papers (2023-02-19T03:19:40Z)
- Generative Augmented Flow Networks [88.50647244459009]
We propose Generative Augmented Flow Networks (GAFlowNets) to incorporate intermediate rewards into GFlowNets.
GAFlowNets can leverage edge-based and state-based intrinsic rewards in a joint way to improve exploration.
arXiv Detail & Related papers (2022-10-07T03:33:56Z)
- Learning GFlowNets from partial episodes for improved convergence and stability [56.99229746004125]
Generative flow networks (GFlowNets) are algorithms for training a sequential sampler of discrete objects under an unnormalized target density.
Existing training objectives for GFlowNets are either local to states or transitions, or propagate a reward signal over an entire sampling trajectory.
Inspired by the TD($\lambda$) algorithm in reinforcement learning, we introduce subtrajectory balance, or SubTB($\lambda$), a GFlowNet training objective that can learn from partial action subsequences of varying lengths.
arXiv Detail & Related papers (2022-09-26T15:44:24Z)
- Trajectory balance: Improved credit assignment in GFlowNets [63.687669765579585]
We find previously proposed learning objectives for GFlowNets, flow matching and detailed balance, to be prone to inefficient credit propagation across long action sequences.
We propose a new learning objective for GFlowNets, trajectory balance, as a more efficient alternative to previously used objectives.
In experiments on four distinct domains, we empirically demonstrate the benefits of the trajectory balance objective for GFlowNet convergence, diversity of generated samples, and robustness to long action sequences and large action spaces.
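For reference, the trajectory balance objective is commonly stated as a single squared log-ratio over a complete trajectory $\tau = (s_0 \to s_1 \to \cdots \to s_n = x)$, with learned forward policy $P_F$, backward policy $P_B$, and partition-function estimate $Z_\theta$:

```latex
\mathcal{L}_{\mathrm{TB}}(\tau) =
\left(
  \log \frac{Z_\theta \prod_{t=0}^{n-1} P_F(s_{t+1} \mid s_t; \theta)}
            {R(x) \prod_{t=0}^{n-1} P_B(s_t \mid s_{t+1}; \theta)}
\right)^2
```

Driving this loss to zero for all trajectories makes the marginal probability of sampling a terminal object $x$ proportional to its reward $R(x)$, which is why trajectory balance propagates credit across an entire sequence at once rather than transition by transition.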
arXiv Detail & Related papers (2022-01-31T14:07:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.