Elign: Equivariant Diffusion Model Alignment from Foundational Machine Learning Force Fields
- URL: http://arxiv.org/abs/2601.21985v1
- Date: Thu, 29 Jan 2026 17:00:09 GMT
- Title: Elign: Equivariant Diffusion Model Alignment from Foundational Machine Learning Force Fields
- Authors: Yunyang Li, Lin Huang, Luojia Xia, Wenhe Zhang, Mark Gerstein
- Abstract summary: We present Elign, a post-training framework that amortizes both costs. We replace expensive DFT evaluations with a faster, pretrained foundational machine-learning force field. Experiments show that Elign generates conformations with lower gold-standard DFT energies and forces, while improving stability.
- Score: 7.740456623132954
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generative models for 3D molecular conformations must respect Euclidean symmetries and concentrate probability mass on thermodynamically favorable, mechanically stable structures. However, E(3)-equivariant diffusion models often reproduce biases from semi-empirical training data rather than capturing the equilibrium distribution of a high-fidelity Hamiltonian. While physics-based guidance can correct this, it faces two computational bottlenecks: expensive quantum-chemical evaluations (e.g., DFT) and the need to repeat such queries at every sampling step. We present Elign, a post-training framework that amortizes both costs. First, we replace expensive DFT evaluations with a faster, pretrained foundational machine-learning force field (MLFF) to provide physical signals. Second, we eliminate repeated run-time queries by shifting physical steering to the training phase. To achieve the second amortization, we formulate reverse diffusion as a reinforcement learning problem and introduce Force-Energy Disentangled Group Relative Policy Optimization (FED-GRPO) to fine-tune the denoising policy. FED-GRPO includes a potential-based energy reward and a force-based stability reward, which are optimized and group-normalized independently. Experiments show that Elign generates conformations with lower gold-standard DFT energies and forces, while improving stability. Crucially, inference remains as fast as unguided sampling, since no energy evaluations are required during generation.
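The abstract describes FED-GRPO only at a high level. As a rough, non-authoritative illustration of "disentangled, group-normalized" rewards, the sketch below scores a group of sampled conformations with a pretrained MLFF and normalizes the energy and force reward channels independently; the function name `mlff_energy_forces`, the exact reward definitions, and the weighted combination of the two advantage channels are assumptions for illustration, not the paper's implementation.

```python
# Minimal sketch of disentangled, group-normalized physical rewards (assumptions noted above).
import numpy as np

def fed_grpo_advantages(conformations, mlff_energy_forces, w_energy=1.0, w_force=1.0, eps=1e-8):
    """Score one *group* of generated conformations for a single molecule.

    conformations: list of (atom_types, positions) tuples.
    mlff_energy_forces: callable returning (energy, forces) from a pretrained MLFF.
    """
    energies, force_norms = [], []
    for atoms, positions in conformations:
        energy, forces = mlff_energy_forces(atoms, positions)  # physical signal from the MLFF
        energies.append(energy)                                # lower energy -> more favorable
        force_norms.append(np.linalg.norm(forces, axis=-1).mean())  # near-zero forces -> more stable

    reward_energy = -np.asarray(energies)     # potential-based energy reward
    reward_force = -np.asarray(force_norms)   # force-based stability reward

    # Group-normalize each reward channel independently (the "disentangled" part).
    adv_energy = (reward_energy - reward_energy.mean()) / (reward_energy.std() + eps)
    adv_force = (reward_force - reward_force.mean()) / (reward_force.std() + eps)

    # A weighted sum is one plausible way to combine the two advantage channels.
    return w_energy * adv_energy + w_force * adv_force
```

In a GRPO-style fine-tuning loop, such per-sample advantages would weight the likelihood of the corresponding reverse-diffusion trajectories during training, which is consistent with the abstract's claim that no energy evaluations are needed at generation time.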
Related papers
- Out-of-distribution transfer of PDE foundation models to material dynamics under extreme loading [86.6550968435969]
Most PDE foundation models are pretrained and fine-tuned on fluid-centric benchmarks. We benchmark out-of-distribution transfer on two discontinuity-dominated regimes in which shocks, evolving interfaces, and fracture produce highly non-smooth fields. We evaluate two open-source PDE foundation models, POSEIDON and MORPH, and compare fine-tuning from pretrained weights against training from scratch across training-set sizes to quantify sample efficiency under distribution shift.
arXiv Detail & Related papers (2026-03-04T18:19:35Z)
- Equivariant Evidential Deep Learning for Interatomic Potentials [55.6997213490859]
Uncertainty quantification is critical for assessing the reliability of machine learning interatomic potentials in molecular dynamics simulations. Existing UQ approaches for MLIPs are often limited by high computational cost or suboptimal performance. We propose Equivariant Evidential Deep Learning for Interatomic Potentials (e²IP), a backbone-agnostic framework that models atomic forces and their uncertainty jointly.
arXiv Detail & Related papers (2026-02-11T02:00:25Z)
- Stabilizing Physics-Informed Consistency Models via Structure-Preserving Training [7.031010831953522]
We propose a physics-informed consistency modeling framework for solving partial differential equations (PDEs). We identify a key stability challenge in physics-constrained consistency training, where PDE residuals can drive the model toward trivial or degenerate solutions. We introduce a structure-preserving two-stage training strategy that decouples distribution learning from physics enforcement.
arXiv Detail & Related papers (2026-02-10T00:40:19Z)
- PHDME: Physics-Informed Diffusion Models without Explicit Governing Equations [0.496981595868944]
Diffusion models provide expressive priors for forecasting trajectories of dynamical systems, but are typically unreliable in the sparse data regime. We introduce PHDME, a port-Hamiltonian diffusion framework designed for sparse observations and incomplete physics. Experiments on PDE benchmarks and a real-world spring system show improved accuracy and physical consistency under data scarcity.
arXiv Detail & Related papers (2026-01-29T03:53:48Z)
- Enhanced Sampling for Efficient Learning of Coarse-Grained Machine Learning Potentials [2.8355616606687506]
We introduce enhanced sampling to bias along CG degrees of freedom for data generation, and then re-compute the forces with respect to the unbiased potential. This strategy simultaneously shortens the simulation time required to produce equilibrated data and enriches sampling in transition regions, while preserving the correct PMF. Our findings support the use of enhanced sampling for force matching as a promising direction to improve the accuracy and reliability of CG potentials.
arXiv Detail & Related papers (2025-10-13T08:40:13Z)
- Guiding Diffusion Models with Reinforcement Learning for Stable Molecule Generation [16.01877423456416]
Reinforcement Learning with Physical Feedback (RLPF) is a novel framework that extends Denoising Diffusion Policy Optimization to 3D molecular generation. RLPF introduces reward functions derived from force-field evaluations to guide the generation toward energetically stable and physically meaningful structures. Experiments on the QM9 and GEOM-drug datasets demonstrate that RLPF significantly improves molecular stability compared to existing methods.
arXiv Detail & Related papers (2025-08-22T16:44:55Z)
- Consistent Sampling and Simulation: Molecular Dynamics with Energy-Based Diffusion Models [50.77646970127369]
We propose an energy-based diffusion model with a Fokker-Planck-derived regularization term to enforce consistency. We demonstrate our approach by sampling and simulating multiple biomolecular systems, including fast-folding proteins.
arXiv Detail & Related papers (2025-06-20T16:38:29Z)
- Flow Matching Meets PDEs: A Unified Framework for Physics-Constrained Generation [21.321570407292263]
We propose Physics-Based Flow Matching, a generative framework that embeds physical constraints, both PDE residuals and algebraic relations, into the flow matching objective. We show that our approach yields up to 8x more accurate physical residuals compared to FM, while clearly outperforming existing algorithms in terms of distributional accuracy.
arXiv Detail & Related papers (2025-06-10T09:13:37Z)
- Rao-Blackwell Gradient Estimators for Equivariant Denoising Diffusion [55.95767828747407]
In domains such as molecular and protein generation, physical systems exhibit inherent symmetries that are critical to model. We present a framework that reduces training variance and provides a provably lower-variance gradient estimator. We also present a practical implementation of this estimator incorporating the loss and sampling procedure through a method we call Orbit Diffusion.
arXiv Detail & Related papers (2025-02-14T03:26:57Z)
- May the Force be with You: Unified Force-Centric Pre-Training for 3D Molecular Conformations [19.273404278711794]
We propose a force-centric pretraining model for 3D molecular conformations covering both equilibrium and off-equilibrium data.
For equilibrium data, we introduce zero-force regularization and force-based denoising techniques to approximate near-equilibrium forces.
Experiments show that, with our pre-training objective, we improve force accuracy by around 3 times compared to the un-pre-trained Equivariant Transformer model.
arXiv Detail & Related papers (2023-08-24T01:54:02Z)
- Improving and generalizing flow-based generative models with minibatch optimal transport [90.01613198337833]
We introduce the generalized conditional flow matching (CFM) technique for continuous normalizing flows (CNFs).
CFM features a stable regression objective like that used to train the flow in diffusion models but enjoys the efficient inference of deterministic flow models.
A variant of our objective is optimal transport CFM (OT-CFM), which creates simpler flows that are more stable to train and lead to faster inference (a minimal sketch of this objective appears after this list).
arXiv Detail & Related papers (2023-02-01T14:47:17Z)
- Training Generative Adversarial Networks by Solving Ordinary Differential Equations [54.23691425062034]
We study the continuous-time dynamics induced by GAN training.
From this perspective, we hypothesise that instabilities in training GANs arise from the integration error.
We experimentally verify that well-known ODE solvers (such as Runge-Kutta) can stabilise training.
arXiv Detail & Related papers (2020-10-28T15:23:49Z)
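For the minibatch optimal transport CFM entry above, the following is a minimal sketch of the OT-CFM regression objective, assuming a velocity-field model `v_theta(t, x)`, Gaussian noise samples `x0`, and data samples `x1`; the exact coupling and probability path used in that paper may differ, and the exact-assignment OT step via SciPy is one common instantiation rather than the authors' implementation.

```python
# Minimal OT-CFM objective sketch (assumptions noted above); not the authors' implementation.
import torch
from scipy.optimize import linear_sum_assignment

def ot_cfm_loss(v_theta, x0, x1):
    """x0: noise samples, x1: data samples, both of shape (batch, dim)."""
    # Minibatch OT coupling: re-pair noise and data to minimize squared Euclidean cost.
    cost = torch.cdist(x0, x1, p=2).pow(2)
    rows, cols = linear_sum_assignment(cost.detach().cpu().numpy())
    x0 = x0[torch.as_tensor(rows, device=x0.device)]
    x1 = x1[torch.as_tensor(cols, device=x1.device)]

    # Straight-line conditional path x_t and its target velocity x1 - x0.
    t = torch.rand(x0.shape[0], 1, device=x0.device)
    x_t = (1.0 - t) * x0 + t * x1
    target = x1 - x0

    # Stable regression objective on the conditional velocity field.
    return ((v_theta(t, x_t) - target) ** 2).mean()
```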
This list is automatically generated from the titles and abstracts of the papers on this site.