Consistent Sampling and Simulation: Molecular Dynamics with Energy-Based Diffusion Models
- URL: http://arxiv.org/abs/2506.17139v2
- Date: Tue, 04 Nov 2025 12:36:30 GMT
- Title: Consistent Sampling and Simulation: Molecular Dynamics with Energy-Based Diffusion Models
- Authors: Michael Plainer, Hao Wu, Leon Klein, Stephan Günnemann, Frank Noé,
- Abstract summary: We propose an energy-based diffusion model with a Fokker--Planck-derived regularization term to enforce consistency.<n>We demonstrate our approach by sampling and simulating multiple biomolecular systems, including fast-folding proteins.
- Score: 50.77646970127369
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, diffusion models trained on equilibrium molecular distributions have proven effective for sampling biomolecules. Beyond direct sampling, the score of such a model can also be used to derive the forces that act on molecular systems. However, while classical diffusion sampling usually recovers the training distribution, the corresponding energy-based interpretation of the learned score is often inconsistent with this distribution, even for low-dimensional toy systems. We trace this inconsistency to inaccuracies of the learned score at very small diffusion timesteps, where the model must capture the correct evolution of the data distribution. In this regime, diffusion models fail to satisfy the Fokker--Planck equation, which governs the evolution of the score. We interpret this deviation as one source of the observed inconsistencies and propose an energy-based diffusion model with a Fokker--Planck-derived regularization term to enforce consistency. We demonstrate our approach by sampling and simulating multiple biomolecular systems, including fast-folding proteins, and by introducing a state-of-the-art transferable Boltzmann emulator for dipeptides that supports simulation and achieves improved consistency and efficient sampling. Our code, model weights, and self-contained JAX and PyTorch notebooks are available at https://github.com/noegroup/ScoreMD.
Related papers
- Enhanced Diffusion Sampling: Efficient Rare Event Sampling and Free Energy Calculation with Diffusion Models [12.081055615006305]
We introduce enhanced diffusion sampling, enabling efficient exploration of rare-event regions.<n>Our methods achieve fast, accurate, and scalable estimation of equilibrium properties within GPU-minutes to hours per system.
arXiv Detail & Related papers (2026-02-18T17:26:15Z) - Enhancing Diffusion-Based Sampling with Molecular Collective Variables [23.394068689634086]
Diffusion-based samplers learn to sample complex, high-dimensional distributions using energies or log densities alone.<n>We introduce a sequential bias along bespoke, information-rich, low-dimensional projections of atomic coordinates known as collective variables (CVs)<n>We are the first to demonstrate reactive sampling using a diffusion-based sampler, capturing bond breaking and formation with universal interatomic potentials at near-first-principles accuracy.
arXiv Detail & Related papers (2025-10-13T20:44:54Z) - Iterative Distillation for Reward-Guided Fine-Tuning of Diffusion Models in Biomolecular Design [53.93023688824764]
We address the problem of fine-tuning diffusion models for reward-guided generation in biomolecular design.<n>We propose an iterative distillation-based fine-tuning framework that enables diffusion models to optimize for arbitrary reward functions.<n>Our off-policy formulation, combined with KL divergence minimization, enhances training stability and sample efficiency compared to existing RL-based methods.
arXiv Detail & Related papers (2025-07-01T05:55:28Z) - Progressive Inference-Time Annealing of Diffusion Models for Sampling from Boltzmann Densities [85.83359661628575]
We propose Progressive Inference-Time Annealing (PITA) to learn diffusion-based samplers.<n>PITA combines two complementary techniques: Annealing of the Boltzmann distribution and Diffusion smoothing.<n>It enables equilibrium sampling of N-body particle systems, Alanine Dipeptide, and tripeptides in Cartesian coordinates.
arXiv Detail & Related papers (2025-06-19T17:14:22Z) - Resolving Memorization in Empirical Diffusion Model for Manifold Data in High-Dimensional Spaces [5.716752583983991]
When the data distribution consists of n points, empirical diffusion models tend to reproduce existing data points.<n>This work shows that the memorization issue can be solved simply by applying an inertia update at the end of the empirical diffusion simulation.<n>We demonstrate that the distribution of samples from this model approximates the true data distribution on a $C2$ manifold of dimension $d$, within a Wasserstein-1 distance of order $O(n-frac2d+4)$.
arXiv Detail & Related papers (2025-05-05T09:40:41Z) - Potential Score Matching: Debiasing Molecular Structure Sampling with Potential Energy Guidance [11.562962976129292]
We propose Potential Score Matching (PSM), an approach that utilizes the potential energy gradient to guide generative models.<n>PSM does not require exact energy functions and can debias sample distributions even when trained on limited and biased data.<n>The results demonstrate that molecular distributions generated by PSM more closely approximate the Boltzmann distribution compared to traditional diffusion models.
arXiv Detail & Related papers (2025-03-18T11:27:28Z) - Conditional Synthesis of 3D Molecules with Time Correction Sampler [58.0834973489875]
Time-Aware Conditional Synthesis (TACS) is a novel approach to conditional generation on diffusion models.
It integrates adaptively controlled plug-and-play "online" guidance into a diffusion model, driving samples toward the desired properties.
arXiv Detail & Related papers (2024-11-01T12:59:25Z) - Hessian-Informed Flow Matching [4.542719108171107]
Hessian-Informed Flow Matching is a novel approach that integrates the Hessian of an energy function into conditional flows.
This integration allows HI-FM to account for local curvature and anisotropic covariance structures.
Empirical evaluations on the MNIST and Lennard-Jones particles datasets demonstrate that HI-FM improves the likelihood of test samples.
arXiv Detail & Related papers (2024-10-15T09:34:52Z) - Provable Statistical Rates for Consistency Diffusion Models [87.28777947976573]
Despite the state-of-the-art performance, diffusion models are known for their slow sample generation due to the extensive number of steps involved.
This paper contributes towards the first statistical theory for consistency models, formulating their training as a distribution discrepancy minimization problem.
arXiv Detail & Related papers (2024-06-23T20:34:18Z) - Theoretical Insights for Diffusion Guidance: A Case Study for Gaussian
Mixture Models [59.331993845831946]
Diffusion models benefit from instillation of task-specific information into the score function to steer the sample generation towards desired properties.
This paper provides the first theoretical study towards understanding the influence of guidance on diffusion models in the context of Gaussian mixture models.
arXiv Detail & Related papers (2024-03-03T23:15:48Z) - Graph Diffusion Transformers for Multi-Conditional Molecular Generation [16.58392955245203]
We present the Graph Diffusion Transformer (Graph DiT) for multi-conditional molecular generation.
Graph DiT integrates an encoder to learn numerical and categorical property representations with the Transformer-based denoiser.
Results demonstrate the superiority of Graph DiT across nine metrics from distribution learning to condition control for molecular properties.
arXiv Detail & Related papers (2024-01-24T23:45:31Z) - Uncertainty-aware Surrogate Models for Airfoil Flow Simulations with Denoising Diffusion Probabilistic Models [26.178192913986344]
We make a first attempt to use denoising diffusion probabilistic models (DDPMs) to train an uncertainty-aware surrogate model for turbulence simulations.
Our results show DDPMs can successfully capture the whole distribution of solutions and, as a consequence, accurately estimate the uncertainty of the simulations.
We also evaluate an emerging generative modeling variant, flow matching, in comparison to regular diffusion models.
arXiv Detail & Related papers (2023-12-08T19:04:17Z) - Renormalizing Diffusion Models [0.7252027234425334]
We use diffusion models to learn inverse renormalization group flows of statistical and quantum field theories.
Our work provides an interpretation of multiscale diffusion models, and gives physically-inspired suggestions for diffusion models which should have novel properties.
arXiv Detail & Related papers (2023-08-23T18:02:31Z) - Unifying Diffusion Models' Latent Space, with Applications to
CycleDiffusion and Guidance [95.12230117950232]
We show that a common latent space emerges from two diffusion models trained independently on related domains.
Applying CycleDiffusion to text-to-image diffusion models, we show that large-scale text-to-image diffusion models can be used as zero-shot image-to-image editors.
arXiv Detail & Related papers (2022-10-11T15:53:52Z) - How Much is Enough? A Study on Diffusion Times in Score-based Generative
Models [76.76860707897413]
Current best practice advocates for a large T to ensure that the forward dynamics brings the diffusion sufficiently close to a known and simple noise distribution.
We show how an auxiliary model can be used to bridge the gap between the ideal and the simulated forward dynamics, followed by a standard reverse diffusion process.
arXiv Detail & Related papers (2022-06-10T15:09:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.