CFlowNets: Continuous Control with Generative Flow Networks
- URL: http://arxiv.org/abs/2303.02430v1
- Date: Sat, 4 Mar 2023 14:37:47 GMT
- Title: CFlowNets: Continuous Control with Generative Flow Networks
- Authors: Yinchuan Li, Shuang Luo, Haozhi Wang and Jianye Hao
- Abstract summary: Generative flow networks (GFlowNets) can be used as an alternative to reinforcement learning for exploratory control tasks.
We propose generative continuous flow networks (CFlowNets) that can be applied to continuous control tasks.
- Score: 23.093316128475564
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative flow networks (GFlowNets), as an emerging technique, can be used
as an alternative to reinforcement learning for exploratory control tasks.
GFlowNets aim to generate a distribution proportional to the rewards over
terminating states, and to sample different candidates in an active learning
fashion. GFlowNets need to form a DAG and compute the flow matching loss by
traversing the inflows and outflows of each node in the trajectory. No
experiments have yet concluded that GFlowNets can be used to handle continuous
tasks. In this paper, we propose generative continuous flow networks
(CFlowNets) that can be applied to continuous control tasks. First, we present
the theoretical formulation of CFlowNets. Then, a training framework for
CFlowNets is proposed, including the action selection process, the flow
approximation algorithm, and the continuous flow matching loss function.
Afterward, we theoretically prove the error bound of the flow approximation.
The error decreases rapidly as the number of flow samples increases. Finally,
experimental results on continuous control tasks demonstrate the performance
advantages of CFlowNets compared to many reinforcement learning methods,
especially regarding exploration ability.
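The flow matching condition mentioned in the abstract can be illustrated on a tiny discrete DAG. The sketch below is not the CFlowNets continuous loss and is not taken from the paper. It only checks the standard discrete condition (inflow equals reward plus outflow at every non-initial state) and confirms that following a policy proportional to the edge flows reaches terminating states with probability proportional to their rewards. The graph, state names, flow values, and function names are all hypothetical and chosen for illustration.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical toy DAG: edge flows and terminal rewards are made-up numbers.
EDGE_FLOW = {
    ("s0", "a"): Fraction(2),
    ("s0", "b"): Fraction(4),
    ("a", "x"): Fraction(1),
    ("a", "y"): Fraction(1),
    ("b", "y"): Fraction(2),
    ("b", "z"): Fraction(2),
}
REWARD = {"x": Fraction(1), "y": Fraction(3), "z": Fraction(2)}
INITIAL = "s0"


def node_flows(edge_flow):
    """Sum the incoming and outgoing edge flows of every state."""
    inflow = defaultdict(Fraction)
    outflow = defaultdict(Fraction)
    for (src, dst), flow in edge_flow.items():
        outflow[src] += flow
        inflow[dst] += flow
    return inflow, outflow


def flow_matching_residuals(edge_flow, reward, initial):
    """Return inflow - reward - outflow for each non-initial state.

    All residuals are zero exactly when the flow matching condition holds.
    """
    inflow, outflow = node_flows(edge_flow)
    states = (set(inflow) | set(outflow)) - {initial}
    zero = Fraction(0)
    return {s: inflow[s] - reward.get(s, zero) - outflow[s] for s in states}


def terminal_distribution(edge_flow, reward, initial):
    """Probability of stopping at each rewarded state under the flow policy.

    The policy moves from s to s' with probability F(s -> s') divided by
    the total flow leaving s (outgoing edges plus the reward at s).
    """
    _, outflow = node_flows(edge_flow)
    zero = Fraction(0)
    parents = defaultdict(list)
    for (src, dst), flow in edge_flow.items():
        parents[dst].append((src, flow))

    reach = {initial: Fraction(1)}  # memoized probability of visiting a state

    def reach_prob(state):
        if state not in reach:
            reach[state] = sum(
                (
                    reach_prob(p) * flow / (outflow[p] + reward.get(p, zero))
                    for p, flow in parents[state]
                ),
                zero,
            )
        return reach[state]

    return {
        s: reach_prob(s) * r / (outflow[s] + r) for s, r in reward.items()
    }


residuals = flow_matching_residuals(EDGE_FLOW, REWARD, INITIAL)
distribution = terminal_distribution(EDGE_FLOW, REWARD, INITIAL)
```

In this toy graph every residual is zero, and the resulting terminal distribution matches each reward divided by the total reward of 6. Exact `Fraction` arithmetic is used so the check is not affected by floating-point rounding.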
Related papers
- Improving GFlowNets with Monte Carlo Tree Search [6.497027864860203]
Recent studies have revealed strong connections between GFlowNets and entropy-regularized reinforcement learning.
We propose to enhance the planning capabilities of GFlowNets by applying Monte Carlo Tree Search (MCTS).
Our experiments demonstrate that this approach improves the sample efficiency of GFlowNet training and the generation fidelity of pre-trained GFlowNet models.
arXiv Detail & Related papers (2024-06-19T15:58:35Z)
- Evolution Guided Generative Flow Networks [11.609895436955242]
Generative Flow Networks (GFlowNets) learn to sample compositional objects proportional to their rewards.
One big challenge of GFlowNets is training them effectively when dealing with long time horizons and sparse rewards.
We propose Evolution Guided Generative Flow Networks (EGFN), a simple but powerful augmentation to GFlowNet training using evolutionary algorithms (EA).
arXiv Detail & Related papers (2024-02-03T15:28:53Z)
- Pre-Training and Fine-Tuning Generative Flow Networks [61.90529626590415]
We introduce a novel approach for reward-free pre-training of GFlowNets.
By framing the training as a self-supervised problem, we propose an outcome-conditioned GFlowNet that learns to explore the candidate space.
We show that the pre-trained OC-GFN model allows direct extraction of a policy capable of sampling from any new reward function in downstream tasks.
arXiv Detail & Related papers (2023-10-05T09:53:22Z)
- Expected flow networks in stochastic environments and two-player zero-sum games [63.98522423072093]
Generative flow networks (GFlowNets) are sequential sampling models trained to match a given distribution.
We propose expected flow networks (EFlowNets), which extend GFlowNets to stochastic environments.
We show that EFlowNets outperform other GFlowNet formulations in tasks such as protein design.
We then extend the concept of EFlowNets to adversarial environments, proposing adversarial flow networks (AFlowNets) for two-player zero-sum games.
arXiv Detail & Related papers (2023-10-04T12:50:29Z)
- Stochastic Generative Flow Networks [89.34644133901647]
Generative Flow Networks (or GFlowNets) learn to sample complex structures through the lens of "inference as control".
Existing GFlowNets can be applied only to deterministic environments, and fail in more general tasks with stochastic dynamics.
This paper introduces Stochastic GFlowNets, a new algorithm that extends GFlowNets to stochastic environments.
arXiv Detail & Related papers (2023-02-19T03:19:40Z)
- Distributional GFlowNets with Quantile Flows [73.73721901056662]
Generative Flow Networks (GFlowNets) are a new family of probabilistic samplers where an agent learns a policy for generating complex structures through a series of decision-making steps.
In this work, we adopt a distributional paradigm for GFlowNets, turning each flow function into a distribution, thus providing more informative learning signals during training.
Our proposed quantile matching GFlowNet learning algorithm is able to learn a risk-sensitive policy, an essential component for handling scenarios with risk uncertainty.
arXiv Detail & Related papers (2023-02-11T22:06:17Z)
- A theory of continuous generative flow networks [104.93913776866195]
Generative flow networks (GFlowNets) are amortized variational inference algorithms that are trained to sample from unnormalized target distributions.
We present a theory for generalized GFlowNets, which encompasses both existing discrete GFlowNets and ones with continuous or hybrid state spaces.
arXiv Detail & Related papers (2023-01-30T00:37:56Z)
- Learning GFlowNets from partial episodes for improved convergence and stability [56.99229746004125]
Generative flow networks (GFlowNets) are algorithms for training a sequential sampler of discrete objects under an unnormalized target density.
Existing training objectives for GFlowNets are either local to states or transitions, or propagate a reward signal over an entire sampling trajectory.
Inspired by the TD($\lambda$) algorithm in reinforcement learning, we introduce subtrajectory balance or SubTB($\lambda$), a GFlowNet training objective that can learn from partial action subsequences of varying lengths.
arXiv Detail & Related papers (2022-09-26T15:44:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.