Scalable Machine Learning Force Fields for Macromolecular Systems Through Long-Range Aware Message Passing
- URL: http://arxiv.org/abs/2601.03774v1
- Date: Wed, 07 Jan 2026 10:12:34 GMT
- Title: Scalable Machine Learning Force Fields for Macromolecular Systems Through Long-Range Aware Message Passing
- Authors: Chu Wang, Lin Huang, Xinran Wei, Tao Qin, Arthur Jiang, Lixue Cheng, Jia Zhang,
- Abstract summary: Machine learning force fields (MLFFs) have revolutionized molecular simulations by providing quantum mechanical accuracy at the speed of molecular mechanical computations. However, a fundamental reliance of these models on fixed-cutoff architectures limits their applicability to macromolecular systems where long-range interactions dominate. We demonstrate that this locality constraint causes force prediction errors to scale monotonically with system size, revealing a critical architectural bottleneck. We introduce E2Former-LSR, an equivariant transformer that explicitly integrates long-range attention blocks. E2Former-LSR achieves stable error scaling, superior fidelity in capturing non-covalent decay, and
- Score: 18.50744268453995
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning force fields (MLFFs) have revolutionized molecular simulations by providing quantum mechanical accuracy at the speed of molecular mechanical computations. However, a fundamental reliance of these models on fixed-cutoff architectures limits their applicability to macromolecular systems where long-range interactions dominate. We demonstrate that this locality constraint causes force prediction errors to scale monotonically with system size, revealing a critical architectural bottleneck. To overcome this, we establish the systematically designed MolLR25 (Molecules with Long-Range effect) benchmark up to 1200 atoms, generated using high-fidelity DFT, and introduce E2Former-LSR, an equivariant transformer that explicitly integrates long-range attention blocks. E2Former-LSR exhibits stable error scaling, achieves superior fidelity in capturing non-covalent decay, and maintains precision on complex protein conformations. Crucially, its efficient design provides up to 30% speedup compared to purely local models. This work validates the necessity of non-local architectures for generalizable MLFFs, enabling high-fidelity molecular dynamics for large-scale chemical and biological systems.
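The locality constraint described in the abstract can be illustrated with a toy calculation. The sketch below is not taken from the paper: the cubic lattice, the 1.5 spacing, the 5.0 cutoff, and the bare 1/r pair weight are all hypothetical stand-ins chosen for illustration. It only shows that, for a fixed cutoff, the share of slowly decaying pairwise weight lying outside the cutoff grows with system size, which is consistent with the abstract's claim that purely local models degrade on larger systems.

```python
import itertools
import math


def lattice(n, spacing=1.5):
    """Coordinates of an n x n x n cubic lattice (hypothetical toy system)."""
    return [
        (i * spacing, j * spacing, k * spacing)
        for i, j, k in itertools.product(range(n), repeat=3)
    ]


def fraction_beyond_cutoff(coords, cutoff):
    """Share of the summed 1/r pair weight that lies beyond the cutoff.

    The 1/r weight is an assumed proxy for a long-range interaction;
    it is not the paper's error metric.
    """
    inside = 0.0
    outside = 0.0
    for a, b in itertools.combinations(coords, 2):
        r = math.dist(a, b)
        if r <= cutoff:
            inside += 1.0 / r
        else:
            outside += 1.0 / r
    return outside / (inside + outside)


if __name__ == "__main__":
    # Same cutoff, growing system: the invisible share increases.
    for n in (2, 4, 6):
        coords = lattice(n)
        frac = fraction_beyond_cutoff(coords, cutoff=5.0)
        print(f"{len(coords):4d} atoms: {frac:.1%} of 1/r weight beyond cutoff")
```

In this toy setting the smallest lattice fits entirely inside the cutoff, so nothing is missed, while the larger lattices leave a growing share of the pair weight outside any single atom's neighborhood.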
Related papers
- From Complex Dynamics to DynFormer: Rethinking Transformers for PDEs [6.873342825786888]
Transformer-based neural operators have emerged as powerful data-driven alternatives. We propose DynFormer, a novel dynamics-informed neural operator. We show that DynFormer achieves up to a 95% reduction in relative error compared to state-of-the-art baselines.
arXiv Detail & Related papers (2026-03-03T15:45:09Z)
- Scalable Spatio-Temporal SE(3) Diffusion for Long-Horizon Protein Dynamics [51.85385061275941]
Molecular dynamics (MD) simulations remain the gold standard for studying protein dynamics. Recent generative models have shown promise in accelerating simulations, yet they struggle with long-horizon generation. We present STAR-MD, a scalable diffusion model that generates physically plausible protein trajectories over micro-scale timescales.
arXiv Detail & Related papers (2026-02-02T14:13:28Z)
- OXtal: An All-Atom Diffusion Model for Organic Crystal Structure Prediction [63.318434943975255]
We introduce OXtal, a large-scale 100M parameter all-atom diffusion model that learns the conditional joint distribution over intramolecular conformations and periodic packing. By leveraging a large dataset of 600K experimentally validated crystal structures, OXtal achieves orders-of-magnitude improvement over prior ab initio machine learning CSP methods. OXtal attains over 80% packing similarity rate, demonstrating its ability to model both thermodynamic and kinetic regularities of molecular crystallization.
arXiv Detail & Related papers (2025-12-07T20:46:30Z)
- A Universal Deep Learning Force Field for Molecular Dynamic Simulation and Vibrational Spectra Prediction [24.66572813199126]
We integrate our deep equivariant tensor attention network (DetaNet) with a velocity-Verlet integrator to enable machine learning molecular dynamics (MLMD) simulations for spectral prediction. We show that the DetaNet-based MD approach achieves near-experimental spectral accuracy with speedups up to three orders of magnitude over AIMD. Overall, this work establishes a universal machine learning force field and framework that enable fast, accurate, and broadly applicable dynamic simulations and IR/Raman spectral predictions.
arXiv Detail & Related papers (2025-10-05T14:36:33Z)
- HemePLM-Diffuse: A Scalable Generative Framework for Protein-Ligand Dynamics in Large Biomolecular System [0.0]
We introduce HemePLM-Diffuse, an innovative generative transformer model that is designed for accurate simulation of protein-ligand trajectories. We demonstrate its capabilities on the 3CQV HEME system, showing enhanced accuracy and scalability compared to leading models such as TorchMD-Net, MDGEN, and Uni-Mol.
arXiv Detail & Related papers (2025-08-07T17:29:52Z)
- Aligned Manifold Property and Topology Point Clouds for Learning Molecular Properties [55.2480439325792]
This work introduces AMPTCR, a molecular surface representation that combines local quantum-derived scalar fields and custom topological descriptors within an aligned point cloud format. For molecular weight, results confirm that AMPTCR encodes physically meaningful data, with a validation R² of 0.87. In the bacterial inhibition task, AMPTCR enables both classification and direct regression of E. coli inhibition values.
arXiv Detail & Related papers (2025-07-22T04:35:50Z)
- A Scalable and Quantum-Accurate Foundation Model for Biomolecular Force Field via Linearly Tensorized Quadrangle Attention [6.749581549330875]
We present LiTEN, a novel AI-based force field framework for atomistic biomolecular simulations. Building on LiTEN, LiTEN-FF is a robust AIFF foundation model, pre-trained on the nablaDFT dataset for broad chemical generalization. LiTEN achieves state-of-the-art (SOTA) performance across most evaluation subsets of rMD17, MD22, and Chignolin, outperforming leading models such as MACE, NequIP, and EquiFormer.
arXiv Detail & Related papers (2025-07-01T15:52:39Z)
- DiffMS: Diffusion Generation of Molecules Conditioned on Mass Spectra [60.39311767532607]
We present DiffMS, a formula-restricted encoder-decoder generative network that achieves state-of-the-art performance on this task. To develop a robust decoder that bridges latent embeddings and molecular structures, we pretrain the diffusion decoder with fingerprint-structure pairs. Experiments on established benchmarks show that DiffMS outperforms existing models on de novo molecule generation.
arXiv Detail & Related papers (2025-02-13T18:29:48Z)
- Accurate Machine Learned Quantum-Mechanical Force Fields for Biomolecular Simulations [51.68332623405432]
Molecular dynamics (MD) simulations allow atomistic insights into chemical and biological processes.
Recently, machine learned force fields (MLFFs) emerged as an alternative means to execute MD simulations.
This work proposes a general approach to constructing accurate MLFFs for large-scale molecular simulations.
arXiv Detail & Related papers (2022-05-17T13:08:28Z)
- BIGDML: Towards Exact Machine Learning Force Fields for Materials [55.944221055171276]
Machine-learning force fields (MLFF) should be accurate, computationally and data efficient, and applicable to molecules, materials, and interfaces thereof.
Here, we introduce the Bravais-Inspired Gradient-Domain Machine Learning approach and demonstrate its ability to construct reliable force fields using a training set with just 10-200 geometries.
arXiv Detail & Related papers (2021-06-08T10:14:57Z)
- Predicting molecular dipole moments by combining atomic partial charges and atomic dipoles [3.0980025155565376]
"MuML" models are fitted together to reproduce molecular $\boldsymbol{\mu}$ computed using high-level coupled-cluster theory.
We demonstrate that the uncertainty in the predictions can be estimated reliably using a calibrated committee model.
arXiv Detail & Related papers (2020-03-27T14:35:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.