AdaFlow: Imitation Learning with Variance-Adaptive Flow-Based Policies
- URL: http://arxiv.org/abs/2402.04292v1
- Date: Tue, 6 Feb 2024 10:15:38 GMT
- Title: AdaFlow: Imitation Learning with Variance-Adaptive Flow-Based Policies
- Authors: Xixi Hu, Bo Liu, Xingchao Liu and Qiang Liu
- Abstract summary: We propose AdaFlow, an imitation learning framework based on flow-based generative modeling.
AdaFlow represents the policy with state-conditioned ordinary differential equations (ODEs)
We show that AdaFlow achieves high performance across all dimensions, including success rate, behavioral diversity, and inference speed.
- Score: 22.967735080818006
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Diffusion-based imitation learning improves Behavioral Cloning (BC) on
multi-modal decision-making, but comes at the cost of significantly slower
inference due to the recursion in the diffusion process. This motivates the
design of efficient policy generators that retain the ability to generate diverse
actions. To address this challenge, we propose AdaFlow, an imitation learning
framework based on flow-based generative modeling. AdaFlow represents the
policy with state-conditioned ordinary differential equations (ODEs), which are
known as probability flows. We reveal an intriguing connection between the
conditional variance of their training loss and the discretization error of the
ODEs. With this insight, we propose a variance-adaptive ODE solver that can
adjust its step size in the inference stage, making AdaFlow an adaptive
decision-maker, offering rapid inference without sacrificing diversity.
Interestingly, it automatically reduces to a one-step generator when the action
distribution is uni-modal. Our comprehensive empirical evaluation shows that
AdaFlow achieves high performance across all dimensions, including success
rate, behavioral diversity, and inference speed. The code is available at
https://github.com/hxixixh/AdaFlow
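The core idea above — shrink the ODE step size where the conditional variance is large (multi-modal regions) and take a single Euler step where it vanishes — can be sketched as follows. This is a minimal illustration, not the authors' implementation; `velocity_fn`, `variance_fn`, and the step-size rule `dt = eps / max(var, eps)` are hypothetical stand-ins for the learned networks and the solver described in the paper.

```python
import numpy as np

def adaptive_euler_sample(velocity_fn, variance_fn, x0, eps=0.05, max_steps=20):
    """Integrate dx/dt = v(x, t) from t=0 to t=1, shrinking the step size
    where the estimated conditional variance is large.

    velocity_fn(x, t) -> drift estimate; variance_fn(x, t) -> scalar
    variance estimate. Both are hypothetical stand-ins for learned models.
    """
    x, t, steps = np.asarray(x0, dtype=float), 0.0, 0
    while t < 1.0 and steps < max_steps:
        var = float(variance_fn(x, t))
        # Large variance -> multi-modal region -> small step; near-zero
        # variance collapses the loop to one Euler step (one-step generation).
        dt = min(1.0 - t, eps / max(var, eps))
        x = x + dt * np.asarray(velocity_fn(x, t))
        t += dt
        steps += 1
    return x

# Uni-modal toy case: zero variance and constant drift -> a single step
# carries x from the origin to the drift direction.
result = adaptive_euler_sample(lambda x, t: np.ones_like(x),
                               lambda x, t: 0.0,
                               np.zeros(2))
```

With zero variance the step-size rule returns `dt = 1`, which reproduces the one-step-generator behavior the abstract mentions for uni-modal action distributions.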
Related papers
- PeRFlow: Piecewise Rectified Flow as Universal Plug-and-Play Accelerator [73.80050807279461]
Piecewise Rectified Flow (PeRFlow) is a flow-based method for accelerating diffusion models.
PeRFlow achieves superior performance in a few-step generation.
arXiv Detail & Related papers (2024-05-13T07:10:53Z) - Language Rectified Flow: Advancing Diffusion Language Generation with Probabilistic Flows [53.31856123113228]
This paper proposes Language Rectified Flow.
Our method is based on the reformulation of the standard probabilistic flow models.
Experiments and ablation studies demonstrate that our method can be general, effective, and beneficial for many NLP tasks.
arXiv Detail & Related papers (2024-03-25T17:58:22Z) - D-Flow: Differentiating through Flows for Controlled Generation [37.80603174399585]
We introduce D-Flow, a framework for controlling the generation process by differentiating through the flow.
We motivate this framework by our key observation that for Diffusion/FM models trained with Gaussian probability paths, differentiating through the generation process projects the gradient onto the data manifold.
We validate our framework on linear and non-linear controlled generation problems, including image and audio inverse problems and conditional molecule generation, reaching state-of-the-art performance across all of them.
arXiv Detail & Related papers (2024-02-21T18:56:03Z) - FlowPG: Action-constrained Policy Gradient with Normalizing Flows [14.98383953401637]
Action-constrained reinforcement learning (ACRL) is a popular approach for solving safety-critical, resource-allocation-related decision-making problems.
A major challenge in ACRL is to ensure that the agent takes a valid action satisfying the constraints at each step.
arXiv Detail & Related papers (2024-02-07T11:11:46Z) - Improved Techniques for Maximum Likelihood Estimation for Diffusion ODEs [21.08236758778604]
We propose several improved techniques for maximum likelihood estimation for diffusion ODEs.
For training, we propose velocity parameterization and explore variance reduction techniques for faster convergence.
For evaluation, we propose a novel training-free truncated-normal dequantization to fill the training-evaluation gap commonly existing in diffusion ODEs.
arXiv Detail & Related papers (2023-05-06T05:21:24Z) - Distributional GFlowNets with Quantile Flows [73.73721901056662]
Generative Flow Networks (GFlowNets) are a new family of probabilistic samplers where an agent learns a policy for generating complex structures through a series of decision-making steps.
In this work, we adopt a distributional paradigm for GFlowNets, turning each flow function into a distribution, thus providing more informative learning signals during training.
Our proposed quantile matching GFlowNet learning algorithm is able to learn a risk-sensitive policy, an essential component for handling scenarios with risk uncertainty.
arXiv Detail & Related papers (2023-02-11T22:06:17Z) - Unite and Conquer: Plug & Play Multi-Modal Synthesis using Diffusion Models [54.1843419649895]
We propose a solution based on denoising diffusion probabilistic models (DDPMs)
Our motivation for choosing diffusion models over other generative models comes from the flexible internal structure of diffusion models.
Our method can unite multiple diffusion models trained on multiple sub-tasks and conquer the combined task.
arXiv Detail & Related papers (2022-12-01T18:59:55Z) - Flow Network based Generative Models for Non-Iterative Diverse Candidate Generation [110.09855163856326]
This paper is about the problem of learning a policy for generating an object from a sequence of actions.
We propose GFlowNet, based on a view of the generative process as a flow network.
We prove that any global minimum of the proposed objectives yields a policy which samples from the desired distribution.
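The flow-network view above requires that, at every interior state, total inflow equals total outflow, with terminal flow proportional to the reward. A toy illustration of this flow-matching constraint, assuming a hypothetical four-state DAG with hand-set edge flows (not taken from the paper):

```python
# Tiny DAG: s0 -> {s1, s2} -> s3 (terminal). Flow matching requires
# inflow(s) == outflow(s) at every interior state; the policy samples
# each outgoing edge proportionally to its flow.
flows = {
    ("s0", "s1"): 2.0, ("s0", "s2"): 1.0,
    ("s1", "s3"): 2.0, ("s2", "s3"): 1.0,
}

def inflow(s):
    return sum(f for (src, dst), f in flows.items() if dst == s)

def outflow(s):
    return sum(f for (src, dst), f in flows.items() if src == s)

# Interior states s1 and s2 satisfy the flow-matching constraint.
consistent = all(abs(inflow(s) - outflow(s)) < 1e-9 for s in ("s1", "s2"))

# At s0 the policy picks s1 with probability 2/3 and s2 with 1/3.
p_s1 = flows[("s0", "s1")] / outflow("s0")
```

A learned GFlowNet parameterizes these flows with a neural network and trains them so the constraint holds approximately everywhere; here the numbers are fixed by hand only to make the constraint concrete.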
arXiv Detail & Related papers (2021-06-08T14:21:10Z) - Self Normalizing Flows [65.73510214694987]
We propose a flexible framework for training normalizing flows by replacing expensive terms in the gradient by learned approximate inverses at each layer.
This reduces the computational complexity of each layer's exact update from $\mathcal{O}(D^3)$ to $\mathcal{O}(D^2)$.
We show experimentally that such models are remarkably stable and optimize to similar data likelihood values as their exact gradient counterparts.
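The complexity claim can be made concrete with a sketch: the exact gradient of a linear flow layer's log-determinant term is $(W^{-1})^\top$, which costs $\mathcal{O}(D^3)$ to form, whereas maintaining a learned approximate inverse lets each update read off an $\mathcal{O}(D^2)$ matrix instead. The "learned" inverse below is simulated by perturbing the true inverse; this is an assumption for illustration, not the paper's training procedure.

```python
import numpy as np

D = 4
rng = np.random.default_rng(0)
# A well-conditioned weight matrix standing in for one flow layer.
W = rng.standard_normal((D, D)) + 3.0 * np.eye(D)

# Exact update: the gradient of log|det W| w.r.t. W is inv(W).T,
# requiring an O(D^3) inversion at every training step.
exact_grad = np.linalg.inv(W).T

# Self-normalizing idea (sketch): keep an approximate inverse R ~ inv(W)
# (here: true inverse plus small noise, simulating a learned R) and use
# R.T in place of inv(W).T -- an O(D^2) read with no inversion.
R = np.linalg.inv(W) + 1e-3 * rng.standard_normal((D, D))
approx_grad = R.T

err = np.abs(approx_grad - exact_grad).max()
```

The quality of the surrogate gradient then depends only on how closely `R` tracks `inv(W)`, which the paper enforces with an auxiliary reconstruction objective.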
arXiv Detail & Related papers (2020-11-14T09:51:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.