Masked Diffusion Models as Energy Minimization
- URL: http://arxiv.org/abs/2509.13866v1
- Date: Wed, 17 Sep 2025 09:57:31 GMT
- Title: Masked Diffusion Models as Energy Minimization
- Authors: Sitong Chen, Shen Nie, Jiacheng Sun, Zijin Feng, Zhenguo Li, Ji-Rong Wen, Chongxuan Li
- Abstract summary: Masked diffusion models (MDMs) are solutions to energy problems in discrete optimal transport. We prove that three distinct energy formulations--kinetic, conditional kinetic, and geodesic energy--are mathematically equivalent under the structure of MDMs. This unification not only clarifies the theoretical foundations of MDMs, but also motivates practical improvements in sampling.
- Score: 102.84400389614262
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a systematic theoretical framework that interprets masked diffusion models (MDMs) as solutions to energy minimization problems in discrete optimal transport. Specifically, we prove that three distinct energy formulations--kinetic, conditional kinetic, and geodesic energy--are mathematically equivalent under the structure of MDMs, and that MDMs minimize all three when the mask schedule satisfies a closed-form optimality condition. This unification not only clarifies the theoretical foundations of MDMs, but also motivates practical improvements in sampling. By parameterizing interpolation schedules via Beta distributions, we reduce the schedule design space to a tractable 2D search, enabling efficient post-training tuning without model modification. Experiments on synthetic and real-world benchmarks demonstrate that our energy-inspired schedules outperform hand-crafted baselines, particularly in low-step sampling settings.
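The abstract says interpolation schedules are parameterized via Beta distributions, which reduces schedule design to a two-parameter search. The sketch below is one way such a schedule could look, not the paper's implementation: it assumes the fraction of tokens unmasked by time t is the Beta(a, b) CDF evaluated at t. The function names `beta_cdf_schedule` and `tokens_per_step`, the midpoint-rule integration, and the rounding rule are all illustrative assumptions.

```python
def beta_cdf_schedule(num_steps, a, b, grid=10_000):
    """Return num_steps + 1 values rising from 0.0 to 1.0 along a Beta(a, b) CDF.

    Assumption (not from the paper): the schedule is the Beta CDF sampled at
    evenly spaced times t_k = k / num_steps.
    """
    if num_steps < 1 or a <= 0 or b <= 0:
        raise ValueError("need num_steps >= 1 and a, b > 0")
    # Midpoint rule on the unnormalized density t^(a-1) * (1-t)^(b-1).
    # Midpoints never touch 0 or 1, so a < 1 or b < 1 does not divide by zero.
    cum = [0.0]
    for i in range(grid):
        m = (i + 0.5) / grid
        cum.append(cum[-1] + m ** (a - 1) * (1 - m) ** (b - 1))
    total = cum[-1]  # normalizing by the last entry makes the final value exactly 1.0
    return [cum[round(k * grid / num_steps)] / total for k in range(num_steps + 1)]


def tokens_per_step(seq_len, schedule):
    """Convert cumulative unmasked fractions into per-step token counts."""
    targets = [round(seq_len * s) for s in schedule]
    return [hi - lo for lo, hi in zip(targets, targets[1:])]


if __name__ == "__main__":
    schedule = beta_cdf_schedule(num_steps=8, a=2.0, b=5.0)
    print(tokens_per_step(128, schedule))
```

With a = b = 1 the CDF is linear, so every step unmasks the same number of tokens. Other (a, b) pairs shift unmasking earlier or later, and a post-training search would loop over a small grid of (a, b) values and score each with whatever sample-quality metric is available, leaving the trained model untouched.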
Related papers
- Low-Bit, High-Fidelity: Optimal Transport Quantization for Flow Matching [0.0]
Flow Matching (FM) generative models offer efficient simulation-free training and deterministic sampling, but their practical deployment is challenged by high-precision parameter requirements. We adapt optimal transport (OT)-based post-training quantization to FM models, minimizing the 2-Wasserstein distance between quantized and original weights, and systematically compare its effectiveness against uniform, piecewise, and logarithmic quantization schemes.
arXiv Detail & Related papers (2025-11-14T15:49:36Z)
- VEDA: 3D Molecular Generation via Variance-Exploding Diffusion with Annealing [4.288647933894182]
VEDA is a framework that combines variance-exploding diffusion with annealing to generate 3D structures. On the QM9 and GEOM-DRUGS datasets, VEDA matches the sampling efficiency of flow-based models. VEDA's generated structures are remarkably stable, as measured by their relaxation energy.
arXiv Detail & Related papers (2025-11-11T05:45:37Z)
- Equilibrium Matching: Generative Modeling with Implicit Energy-Based Models [52.74448905289362]
EqM is a generative modeling framework built from an equilibrium dynamics perspective. By replacing time-conditional velocities with a unified equilibrium landscape, EqM offers a tighter bridge between flow and energy-based models.
arXiv Detail & Related papers (2025-10-02T17:59:06Z)
- PHASE-Net: Physics-Grounded Harmonic Attention System for Efficient Remote Photoplethysmography Measurement [63.007237197267834]
Existing deep learning methods for physiological monitoring mostly lack theoretical robustness. We propose a physics-informed rPPG paradigm derived from the Navier-Stokes equations of hemodynamics, showing that the pulse signal follows a second-order system. This provides a theoretical justification for using a Temporal Convolutional Network (TCN). PHASE-Net achieves state-of-the-art performance with strong efficiency, offering a theoretically grounded and deployment-ready rPPG solution.
arXiv Detail & Related papers (2025-09-29T14:36:45Z)
- Energy-Weighted Flow Matching: Unlocking Continuous Normalizing Flows for Efficient and Scalable Boltzmann Sampling [42.79674268979455]
Energy-Weighted Flow Matching is a novel training objective enabling continuous normalizing flows to model Boltzmann distributions. Our algorithms demonstrate sample quality competitive with state-of-the-art energy-only methods.
arXiv Detail & Related papers (2025-09-03T21:16:03Z)
- Flow Matching Meets PDEs: A Unified Framework for Physics-Constrained Generation [21.321570407292263]
We propose Physics-Based Flow Matching, a generative framework that embeds physical constraints, both PDE residuals and algebraic relations, into the flow matching objective. We show that our approach yields up to $8\times$ more accurate physical residuals compared to FM, while clearly outperforming existing algorithms in terms of distributional accuracy.
arXiv Detail & Related papers (2025-06-10T09:13:37Z)
- Masked Diffusion Models are Secretly Time-Agnostic Masked Models and Exploit Inaccurate Categorical Sampling [47.82616476928464]
Masked diffusion models (MDMs) have emerged as a popular research topic for generative modeling of discrete data. We show that both training and sampling of MDMs are theoretically free from the time variable. We identify, for the first time, an underlying numerical issue, even with the commonly used 32-bit floating-point precision.
arXiv Detail & Related papers (2024-09-04T17:48:19Z)
- QuEST: Low-bit Diffusion Model Quantization via Efficient Selective Finetuning [52.157939524815866]
In this paper, we identify imbalanced activation distributions as a primary source of quantization difficulty. We propose to adjust these distributions through weight finetuning to be more quantization-friendly. Our method demonstrates its efficacy across three high-resolution image generation tasks.
arXiv Detail & Related papers (2024-02-06T03:39:44Z)
- Learning Energy-Based Prior Model with Diffusion-Amortized MCMC [89.95629196907082]
The common practice of learning latent space EBMs with non-convergent short-run MCMC for prior and posterior sampling hinders further progress of the model.
We introduce a simple but effective diffusion-based amortization method for long-run MCMC sampling and develop a novel learning algorithm for the latent space EBM based on it.
arXiv Detail & Related papers (2023-10-05T00:23:34Z)
- MCMC-Correction of Score-Based Diffusion Models for Model Composition [2.682859657520006]
Diffusion models can be parameterized in terms of a score or an energy function. We introduce a novel MH-like acceptance rule based on line integration of the score function.
arXiv Detail & Related papers (2023-07-26T07:50:41Z)
- 3D wind field profiles from hyperspectral sounders: revisiting optic-flow from a meteorological perspective [0.0]
We present an efficient optic flow algorithm for the extraction of vertically resolved 3D atmospheric motion vector (AMV) data measured by Forecast sounders.
We show that the proposed recursion is superior to state-of-the-art optical flow algorithms in the real atmospheric sounding inter-II observations.
arXiv Detail & Related papers (2023-03-09T10:14:25Z)
- Modiff: Action-Conditioned 3D Motion Generation with Denoising Diffusion Probabilistic Models [58.357180353368896]
We propose a conditional paradigm that benefits from the denoising diffusion probabilistic model (DDPM) to tackle the problem of realistic and diverse action-conditioned 3D skeleton-based motion generation.
This is a pioneering attempt to use DDPM to synthesize a variable number of motion sequences conditioned on a categorical action.
arXiv Detail & Related papers (2023-01-10T13:15:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.