Energy-Weighted Flow Matching for Offline Reinforcement Learning
- URL: http://arxiv.org/abs/2503.04975v1
- Date: Thu, 06 Mar 2025 21:10:12 GMT
- Title: Energy-Weighted Flow Matching for Offline Reinforcement Learning
- Authors: Shiyuan Zhang, Weitong Zhang, Quanquan Gu
- Abstract summary: This paper investigates energy guidance in generative modeling, where the target distribution is defined as $q(\mathbf x) \propto p(\mathbf x)\exp(-\beta \mathcal E(\mathbf x))$, with $p(\mathbf x)$ being the data distribution and $\mathcal E(\mathbf x)$ the energy function. We introduce energy-weighted flow matching (EFM), a method that directly learns the energy-guided flow without the need for auxiliary models. We extend this methodology to energy-weighted diffusion models.
- Score: 53.64306385597818
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper investigates energy guidance in generative modeling, where the target distribution is defined as $q(\mathbf x) \propto p(\mathbf x)\exp(-\beta \mathcal E(\mathbf x))$, with $p(\mathbf x)$ being the data distribution and $\mathcal E(\mathbf x)$ as the energy function. To comply with energy guidance, existing methods often require auxiliary procedures to learn intermediate guidance during the diffusion process. To overcome this limitation, we explore energy-guided flow matching, a generalized form of the diffusion process. We introduce energy-weighted flow matching (EFM), a method that directly learns the energy-guided flow without the need for auxiliary models. Theoretical analysis shows that energy-weighted flow matching accurately captures the guided flow. Additionally, we extend this methodology to energy-weighted diffusion models and apply it to offline reinforcement learning (RL) by proposing the Q-weighted Iterative Policy Optimization (QIPO). Empirically, we demonstrate that the proposed QIPO algorithm improves performance in offline RL tasks. Notably, our algorithm is the first energy-guided diffusion model that operates independently of auxiliary models and the first exact energy-guided flow matching model in the literature.
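To make the abstract's core idea concrete, the sketch below builds one training batch for an energy-weighted flow matching objective: samples drawn from $p(\mathbf x)$ are re-weighted by $\exp(-\beta \mathcal E(\mathbf x))$ (self-normalized over the batch), and a velocity model would be regressed onto the conditional flow target with a weighted loss. This is an illustrative reconstruction, not the paper's exact EFM objective; the linear interpolation path, Gaussian noise endpoint, and the helper name `energy_weighted_fm_batch` are assumptions for the example.

```python
import numpy as np

def energy_weighted_fm_batch(x1, energy_fn, beta=1.0, rng=None):
    """Build one illustrative training batch for energy-weighted flow matching.

    Returns ((xt, t), v_target, w): a velocity model v_theta(x_t, t) would be
    fit to v_target with a w-weighted squared loss. The weights
    exp(-beta * E(x1)), normalized over the batch, tilt the learned flow
    toward q(x) proportional to p(x) * exp(-beta * E(x)).
    """
    rng = np.random.default_rng() if rng is None else rng
    n = x1.shape[0]
    x0 = rng.standard_normal(x1.shape)        # Gaussian noise endpoints
    t = rng.uniform(size=(n, 1))              # random times in [0, 1]
    xt = (1 - t) * x0 + t * x1                # linear interpolation path
    v_target = x1 - x0                        # conditional velocity target
    w = np.exp(-beta * energy_fn(x1))         # per-sample energy weights
    w = w / w.sum()                           # self-normalize over the batch
    return (xt, t), v_target, w
```

Because the weights are computed directly from the energy of the data samples, no auxiliary model of intermediate guidance is needed, which is the point of the EFM construction.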
Related papers
- Adjoint Sampling: Highly Scalable Diffusion Samplers via Adjoint Matching [33.9461078261722]
We introduce Adjoint Sampling, a highly scalable and efficient algorithm for learning diffusion processes that sample from unnormalized densities.
We show how to incorporate key symmetries, as well as periodic boundary conditions, for modeling molecules in both cartesian and torsional coordinates.
We demonstrate the effectiveness of our approach through extensive experiments on classical energy functions, and further scale up to neural network-based energy models.
arXiv Detail & Related papers (2025-04-16T02:20:06Z) - Energy Matching: Unifying Flow Matching and Energy-Based Models for Generative Modeling [4.584647857042494]
Generative models often map noise to data by matching flows or scores, but these approaches become cumbersome for incorporating partial observations or additional priors.
Inspired by recent advances in Wasserstein gradient flows, we propose Energy Matching, a framework that unifies flow-based approaches with the flexibility of energy-based models (EBMs).
We parameterize this dynamic with a single time-independent scalar field, which serves as both a powerful generator and a flexible prior for effective regularization of inverse problems.
arXiv Detail & Related papers (2025-04-14T18:10:58Z) - Bellman Diffusion: Generative Modeling as Learning a Linear Operator in the Distribution Space [72.52365911990935]
We introduce Bellman Diffusion, a novel DGM framework that maintains linearity in MDPs through gradient and scalar field modeling.
Our results show that Bellman Diffusion achieves accurate field estimations and is a capable image generator, converging 1.5x faster than the traditional histogram-based baseline in distributional RL tasks.
arXiv Detail & Related papers (2024-10-02T17:53:23Z) - Iterated Energy-based Flow Matching for Sampling from Boltzmann Densities [11.850515912491657]
We propose iterated energy-based flow matching (iEFM) to train continuous normalizing flow (CNF) models from unnormalized densities.
Our results demonstrate that iEFM outperforms existing methods, showcasing its potential for efficient and scalable probabilistic modeling.
arXiv Detail & Related papers (2024-08-29T04:06:34Z) - Understanding Reinforcement Learning-Based Fine-Tuning of Diffusion Models: A Tutorial and Review [63.31328039424469]
This tutorial provides a comprehensive survey of methods for fine-tuning diffusion models to optimize downstream reward functions.
We explain the application of various RL algorithms, including PPO, differentiable optimization, reward-weighted MLE, value-weighted sampling, and path consistency learning.
arXiv Detail & Related papers (2024-07-18T17:35:32Z) - Explicit Flow Matching: On The Theory of Flow Matching Algorithms with Applications [3.5409403011214295]
This paper proposes a novel method, Explicit Flow Matching (ExFM), for training and analyzing flow-based generative models.
ExFM leverages a theoretically grounded loss function, ExFM loss, to demonstrably reduce variance during training, leading to faster convergence and more stable learning.
arXiv Detail & Related papers (2024-02-05T17:45:12Z) - Guided Flows for Generative Modeling and Decision Making [55.42634941614435]
We show that Guided Flows significantly improve sample quality in conditional image generation and zero-shot text-to-speech synthesis.
Notably, we are the first to apply flow models to plan generation in the offline reinforcement learning setting, achieving a sampling speedup compared to diffusion models.
arXiv Detail & Related papers (2023-11-22T15:07:59Z) - Contrastive Energy Prediction for Exact Energy-Guided Diffusion Sampling in Offline Reinforcement Learning [44.880922634512096]
This paper considers a general setting where the guidance is defined by an (unnormalized) energy function.
The main challenge for this setting is that the intermediate guidance during the diffusion sampling procedure is unknown and is hard to estimate.
We propose an exact formulation of the intermediate guidance as well as a novel training objective named contrastive energy prediction (CEP) to learn the exact guidance.
arXiv Detail & Related papers (2023-04-25T13:50:41Z) - How Much is Enough? A Study on Diffusion Times in Score-based Generative Models [76.76860707897413]
Current best practice advocates for a large diffusion time $T$ to ensure that the forward dynamics brings the diffusion sufficiently close to a known and simple noise distribution.
We show how an auxiliary model can be used to bridge the gap between the ideal and the simulated forward dynamics, followed by a standard reverse diffusion process.
arXiv Detail & Related papers (2022-06-10T15:09:46Z) - Particle Dynamics for Learning EBMs [83.59335980576637]
Energy-based modeling is a promising approach to unsupervised learning, which yields many downstream applications from a single model.
The main difficulty in learning energy-based models with the "contrastive approaches" is the generation of samples from the current energy function at each iteration.
This paper proposes an alternative approach to getting these samples and avoiding crude MCMC sampling from the current model.
arXiv Detail & Related papers (2021-11-26T23:41:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.