Performance of universal machine-learned potentials with explicit long-range interactions in biomolecular simulations
- URL: http://arxiv.org/abs/2508.10841v1
- Date: Thu, 14 Aug 2025 17:08:34 GMT
- Title: Performance of universal machine-learned potentials with explicit long-range interactions in biomolecular simulations
- Authors: Viktor Zaverkin, Matheus Ferraz, Francesco Alesiani, Mathias Niepert
- Abstract summary: Universal machine-learned potentials promise transferable accuracy across compositional and vibrational degrees of freedom. This work systematically evaluates equivariant message-passing architectures trained on the SPICE-v2 dataset with and without explicit long-range dispersion and electrostatics.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Universal machine-learned potentials promise transferable accuracy across compositional and vibrational degrees of freedom, yet their application to biomolecular simulations remains underexplored. This work systematically evaluates equivariant message-passing architectures trained on the SPICE-v2 dataset with and without explicit long-range dispersion and electrostatics. We assess the impact of model size, training data composition, and electrostatic treatment across in- and out-of-distribution benchmark datasets, as well as molecular simulations of bulk liquid water, aqueous NaCl solutions, and biomolecules, including alanine tripeptide, the mini-protein Trp-cage, and Crambin. While larger models improve accuracy on benchmark datasets, this trend does not consistently extend to properties obtained from simulations. Predicted properties also depend on the composition of the training dataset. Long-range electrostatics show no systematic impact across systems. However, for Trp-cage, their inclusion yields increased conformational variability. Our results suggest that imbalanced datasets and immature evaluation practices currently challenge the applicability of universal machine-learned potentials to biomolecular simulations.
Related papers
- Protein Language Model Embeddings Improve Generalization of Implicit Transfer Operators [10.025462072265706]
We show that incorporating auxiliary sources of information can improve the data efficiency and generalization of implicit transfer operators for molecular dynamics. Our approach, PLaTITO, achieves state-of-the-art performance on equilibrium sampling benchmarks for out-of-distribution protein systems.
arXiv Detail & Related papers (2026-02-11T09:26:12Z)
- Equivariant Evidential Deep Learning for Interatomic Potentials [55.6997213490859]
Uncertainty quantification is critical for assessing the reliability of machine learning interatomic potentials in molecular dynamics simulations. Existing UQ approaches for MLIPs are often limited by high computational cost or suboptimal performance. We propose Equivariant Evidential Deep Learning for Interatomic Potentials ($\text{e}^2$IP), a backbone-agnostic framework that models atomic forces and their uncertainty jointly.
arXiv Detail & Related papers (2026-02-11T02:00:25Z)
- Independent Component Discovery in Temporal Count Data [46.526610368455096]
We introduce a generative framework for independent component analysis of temporal count data, combining regime-adaptive dynamics with Poisson log-normal emissions. The model identifies disentangled components with regime-dependent contributions, enabling representation learning and perturbations analysis.
arXiv Detail & Related papers (2026-01-29T13:30:10Z)
- Surface Stability Modeling with Universal Machine Learning Interatomic Potentials: A Comprehensive Cleavage Energy Benchmarking Study [0.0]
Machine learning interatomic potentials (MLIPs) have revolutionized computational materials science. No systematic evaluation has assessed how well these universal MLIPs can predict cleavage energies. We present a benchmark of 19 state-of-the-art uMLIPs for cleavage energy prediction.
arXiv Detail & Related papers (2025-08-29T14:24:47Z)
- Efficient Federated Learning with Heterogeneous Data and Adaptive Dropout [62.73150122809138]
Federated Learning (FL) is a promising distributed machine learning approach that enables collaborative training of a global model using multiple edge devices. We propose the FedDHAD FL framework, which comes with two novel methods: Dynamic Heterogeneous model aggregation (FedDH) and Adaptive Dropout (FedAD). The combination of these two methods makes FedDHAD significantly outperform state-of-the-art solutions in terms of accuracy (up to 6.7% higher), efficiency (up to 2.02 times faster), and cost (up to 15.0% smaller).
arXiv Detail & Related papers (2025-07-14T16:19:00Z)
- A predictive machine learning force field framework for liquid electrolyte development [11.463808946378743]
We introduce BAMBOO, a predictive framework for molecular dynamics (MD) simulations, with a demonstration of its capability in the context of liquid electrolyte for lithium batteries. We design a physics-inspired graph equivariant transformer architecture as the backbone of BAMBOO to learn from quantum mechanical simulations. We also introduce an ensemble knowledge distillation approach and apply it to MLFFs to reduce the fluctuation of observations from MD simulations.
arXiv Detail & Related papers (2024-04-10T17:31:49Z)
- Enhanced sampling of robust molecular datasets with uncertainty-based collective variables [0.0]
We propose a method that leverages uncertainty as the collective variable (CV) to guide the acquisition of chemically-relevant data points.
This approach employs a Gaussian Mixture Model-based uncertainty metric from a single model as the CV for biased molecular dynamics simulations.
arXiv Detail & Related papers (2024-02-06T06:42:51Z)
- Evaluating the Transferability of Machine-Learned Force Fields for Material Property Modeling [2.494740426749958]
We present a more comprehensive set of benchmarking tests for evaluating the transferability of machine-learned force fields.
We use a graph neural network (GNN)-based force field coupled with the OpenMM package to carry out MD simulations for Argon.
Our results show that the model can accurately capture the behavior of the solid phase only when the configurations from the solid phase are included in the training dataset.
arXiv Detail & Related papers (2023-01-10T00:25:48Z)
- Conditional Generative Models for Simulation of EMG During Naturalistic Movements [45.698312905115955]
We present a conditional generative neural network trained adversarially to generate motor unit activation potential waveforms.
We demonstrate the ability of such a model to predictively interpolate between a much smaller number of numerical model's outputs with a high accuracy.
arXiv Detail & Related papers (2022-11-03T14:49:02Z)
- Differentiable Agent-based Epidemiology [71.81552021144589]
We introduce GradABM: a scalable, differentiable design for agent-based modeling that is amenable to gradient-based learning with automatic differentiation.
GradABM can quickly simulate million-size populations in a few seconds on commodity hardware, integrate with deep neural networks, and ingest heterogeneous data sources.
arXiv Detail & Related papers (2022-07-20T07:32:02Z)
- Accurate Machine Learned Quantum-Mechanical Force Fields for Biomolecular Simulations [51.68332623405432]
Molecular dynamics (MD) simulations allow atomistic insights into chemical and biological processes.
Recently, machine learned force fields (MLFFs) emerged as an alternative means to execute MD simulations.
This work proposes a general approach to constructing accurate MLFFs for large-scale molecular simulations.
arXiv Detail & Related papers (2022-05-17T13:08:28Z)
- Designing Machine Learning Surrogates using Outputs of Molecular Dynamics Simulations as Soft Labels [0.0]
We show that statistical uncertainties associated with the outputs of molecular dynamics simulations can be utilized to train artificial neural networks.
We design soft labels for the simulation outputs by incorporating the uncertainties in the estimated average output quantities.
The approach is illustrated with the design of a surrogate for molecular dynamics simulations of confined electrolytes.
arXiv Detail & Related papers (2021-10-27T19:00:40Z)
- Towards an Automatic Analysis of CHO-K1 Suspension Growth in Microfluidic Single-cell Cultivation [63.94623495501023]
We propose a novel Machine Learning architecture, which allows us to infuse a neural deep network with human-powered abstraction on the level of data.
Specifically, we train a generative model simultaneously on natural and synthetic data, so that it learns a shared representation, from which a target variable, such as the cell count, can be reliably estimated.
arXiv Detail & Related papers (2020-10-20T08:36:51Z)