Uncertainty-biased molecular dynamics for learning uniformly accurate interatomic potentials
- URL: http://arxiv.org/abs/2312.01416v2
- Date: Sat, 02 Nov 2024 11:26:08 GMT
- Title: Uncertainty-biased molecular dynamics for learning uniformly accurate interatomic potentials
- Authors: Viktor Zaverkin, David Holzmüller, Henrik Christiansen, Federico Errica, Francesco Alesiani, Makoto Takamoto, Mathias Niepert, Johannes Kästner
- Abstract summary: Active learning uses biased or unbiased molecular dynamics to generate candidate pools.
Existing biased and unbiased MD-simulation methods are prone to miss either rare events or extrapolative regions.
This work demonstrates that MD, when biased by the MLIP's energy uncertainty, simultaneously captures extrapolative regions and rare events.
- Abstract: Efficiently creating a concise but comprehensive data set for training machine-learned interatomic potentials (MLIPs) is an under-explored problem. Active learning, which uses biased or unbiased molecular dynamics (MD) to generate candidate pools, aims to address this objective. Existing biased and unbiased MD-simulation methods, however, are prone to miss either rare events or extrapolative regions -- areas of the configurational space where unreliable predictions are made. This work demonstrates that MD, when biased by the MLIP's energy uncertainty, simultaneously captures extrapolative regions and rare events, which is crucial for developing uniformly accurate MLIPs. Furthermore, exploiting automatic differentiation, we enhance bias-forces-driven MD with the concept of bias stress. We employ calibrated gradient-based uncertainties to yield MLIPs with similar or, sometimes, better accuracy than ensemble-based methods at a lower computational cost. Finally, we apply uncertainty-biased MD to alanine dipeptide and MIL-53(Al), generating MLIPs that represent both configurational spaces more accurately than models trained with conventional MD.
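As a rough sketch of the biasing idea (not the authors' exact implementation): one simple form subtracts a scaled uncertainty from the MLIP energy, so that the bias forces obtained by automatic differentiation pull the trajectory towards extrapolative, high-uncertainty regions. The toy energy/uncertainty functions and the scaling factor tau below are hypothetical stand-ins; the paper's calibrated gradient-based uncertainties and bias stress are not reproduced here.

```python
import torch

def bias_forces(positions, mlip_energy, uncertainty, tau=0.5):
    """Forces of an uncertainty-biased potential E_b(x) = E_MLIP(x) - tau * sigma(x),
    computed by automatic differentiation. Subtracting the uncertainty makes
    high-uncertainty (extrapolative) regions attractive during MD."""
    pos = positions.clone().requires_grad_(True)
    e_biased = mlip_energy(pos) - tau * uncertainty(pos)
    (grad,) = torch.autograd.grad(e_biased, pos)
    return -grad  # F = -dE_b/dx

# Hypothetical stand-ins for an MLIP energy and its predictive uncertainty.
mlip_energy = lambda x: (x ** 2).sum()
uncertainty = lambda x: (x ** 4).sum().sqrt()
forces = bias_forces(torch.randn(8, 3), mlip_energy, uncertainty)  # 8 atoms in 3D
```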
Related papers
- Multi-Fidelity Prediction and Uncertainty Quantification with Laplace Neural Operators for Parametric Partial Differential Equations [6.03891813540831]
Laplace Neural Operators (LNOs) have emerged as a promising approach in scientific machine learning.
We propose multi-fidelity Laplace Neural Operators (MF-LNOs), which combine a low-fidelity (LF) base model with parallel linear/nonlinear high-fidelity (HF) correctors and dynamic inter-fidelity weighting.
This allows us to exploit correlations between LF and HF datasets and achieve accurate inference of quantities of interest.
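A minimal sketch of the low-/high-fidelity combination described above, assuming a generic architecture; the gating, corrector forms, and layer sizes are illustrative guesses, not the MF-LNO design:

```python
import torch
import torch.nn as nn

class MultiFidelityCorrector(nn.Module):
    """Illustrative multi-fidelity surrogate: a frozen low-fidelity base model is
    corrected by parallel linear and nonlinear branches, mixed by a learned,
    input-dependent weight."""
    def __init__(self, lf_model, dim):
        super().__init__()
        self.lf_model = lf_model                                    # pre-trained LF base
        self.linear_corr = nn.Linear(dim, dim)                      # linear HF corrector
        self.nonlinear_corr = nn.Sequential(                        # nonlinear HF corrector
            nn.Linear(dim, 64), nn.Tanh(), nn.Linear(64, dim))
        self.gate = nn.Sequential(nn.Linear(dim, 1), nn.Sigmoid())  # dynamic inter-fidelity weight

    def forward(self, x):
        y_lf = self.lf_model(x).detach()                            # keep the LF base fixed
        w = self.gate(x)                                            # weight in [0, 1]
        return y_lf + w * self.linear_corr(y_lf) + (1 - w) * self.nonlinear_corr(y_lf)
```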
arXiv Detail & Related papers (2025-02-01T20:38:50Z) - Quantifying the Prediction Uncertainty of Machine Learning Models for Individual Data [2.1248439796866228]
This study investigates the learnability of the predictive normalized maximum likelihood (pNML) for linear regression and neural networks.
It demonstrates that pNML can improve the performance and robustness of these models on various tasks.
arXiv Detail & Related papers (2024-12-10T13:58:19Z) - Bellman Diffusion: Generative Modeling as Learning a Linear Operator in the Distribution Space [72.52365911990935]
We introduce Bellman Diffusion, a novel deep generative model (DGM) framework that maintains linearity in Markov decision processes (MDPs) through gradient and scalar field modeling.
Our results show that Bellman Diffusion achieves accurate field estimations and is a capable image generator, converging 1.5x faster than the traditional histogram-based baseline in distributional RL tasks.
arXiv Detail & Related papers (2024-10-02T17:53:23Z) - Mitigating Exposure Bias in Score-Based Generation of Molecular Conformations [6.442534896075223]
We propose a method for measuring exposure bias in Score-Based Generative Models used for molecular conformation generation.
We design a new compensation algorithm, Input Perturbation (IP), adapted from a method originally designed for diffusion probabilistic models (DPMs) only.
We achieve new state-of-the-art performance on the GEOM-Drugs dataset and perform on par with the state of the art on GEOM-QM9.
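The general input-perturbation idea can be sketched under the assumption of a standard DDPM-style noising step; gamma and the schedule handling below are illustrative choices, not the authors' settings:

```python
import torch

def perturbed_noising_step(x0, alphas_cumprod, t, gamma=0.1):
    """Input Perturbation (sketch): the noise used to corrupt x0 during training is
    itself perturbed, so the network sees inputs resembling its own imperfect
    predictions at sampling time, reducing the train/inference mismatch."""
    eps = torch.randn_like(x0)                              # standard training noise
    eps_perturbed = eps + gamma * torch.randn_like(x0)      # simulated prediction error
    a_t = alphas_cumprod[t].view(-1, *([1] * (x0.dim() - 1)))
    x_t = a_t.sqrt() * x0 + (1.0 - a_t).sqrt() * eps_perturbed
    return x_t, eps                                         # the target stays the clean noise
```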
arXiv Detail & Related papers (2024-09-21T04:54:37Z) - Extracting Training Data from Unconditional Diffusion Models [76.85077961718875]
Diffusion probabilistic models (DPMs) are being employed as mainstream models for generative artificial intelligence (AI).
We aim to establish a theoretical understanding of memorization in DPMs with 1) a memorization metric for theoretical analysis, 2) an analysis of conditional memorization with informative and random labels, and 3) two better evaluation metrics for measuring memorization.
Based on the theoretical analysis, we propose a novel data extraction method called Surrogate condItional Data Extraction (SIDE) that leverages a classifier trained on generated data as a surrogate condition to extract training data directly from unconditional diffusion models.
arXiv Detail & Related papers (2024-06-18T16:20:12Z) - A Multi-Grained Symmetric Differential Equation Model for Learning Protein-Ligand Binding Dynamics [73.35846234413611]
In drug discovery, molecular dynamics (MD) simulation provides a powerful tool for predicting binding affinities, estimating transport properties, and exploring pocket sites.
We propose NeuralMD, the first machine learning (ML) surrogate that can facilitate numerical MD and provide accurate simulations in protein-ligand binding dynamics.
We demonstrate the efficiency and effectiveness of NeuralMD, achieving over a 1000× speedup compared to standard numerical MD simulations.
arXiv Detail & Related papers (2024-01-26T09:35:17Z) - Accurate machine learning force fields via experimental and simulation data fusion [0.0]
Machine Learning (ML)-based force fields are attracting ever-increasing interest due to their capacity to reach the scales of classical interatomic potentials while retaining quantum-level accuracy.
Here we leverage both Density Functional Theory (DFT) calculations and experimentally measured mechanical properties and lattice parameters to train an ML potential of titanium.
We demonstrate that the fused data learning strategy can concurrently satisfy all target objectives, resulting in a molecular model of higher accuracy than models trained on single-source data.
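A minimal sketch of what a fused objective could look like, assuming a simple weighted sum of a DFT energy term and experimental-property terms; the weights and the property set are assumptions for illustration:

```python
import numpy as np

def fused_loss(pred_energy, dft_energy,
               pred_lattice, exp_lattice,
               pred_elastic, exp_elastic, w_exp=0.1):
    """Weighted data-fusion objective: match DFT energies while also matching
    experimentally measured lattice parameters and elastic constants."""
    loss_dft = np.mean((pred_energy - dft_energy) ** 2)
    loss_exp = np.mean((pred_lattice - exp_lattice) ** 2) + \
               np.mean((pred_elastic - exp_elastic) ** 2)
    return loss_dft + w_exp * loss_exp
```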
arXiv Detail & Related papers (2023-08-17T18:22:19Z) - Physics-informed machine learning with differentiable programming for heterogeneous underground reservoir pressure management [64.17887333976593]
Avoiding over-pressurization in subsurface reservoirs is critical for applications like CO2 sequestration and wastewater injection.
Managing pressure by controlling injection/extraction rates is challenging because of the complex heterogeneity of the subsurface.
We use differentiable programming with a full-physics model and machine learning to determine the fluid extraction rates that prevent over-pressurization.
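A toy sketch of the general recipe: differentiate a (drastically simplified) pressure response with respect to the extraction schedule and penalize over-pressurization. The pressure model, penalty form, and rates below are placeholders, not the paper's full-physics simulator:

```python
import torch

def pressure_response(injection, extraction, k=0.8):
    """Placeholder differentiable 'reservoir': pressure grows with net injected volume."""
    return k * torch.cumsum(injection - extraction, dim=0)

def optimize_extraction(injection, p_max=1.0, steps=500, lr=0.05):
    """Gradient-based search for extraction rates that keep pressure below p_max."""
    extraction = torch.zeros_like(injection, requires_grad=True)
    opt = torch.optim.Adam([extraction], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        p = pressure_response(injection, extraction.clamp(min=0.0))
        cost = (torch.relu(p - p_max) ** 2).sum() + 0.01 * (extraction ** 2).sum()
        cost.backward()
        opt.step()
    return extraction.detach().clamp(min=0.0)

rates = optimize_extraction(torch.full((50,), 0.05))  # 50 time steps, constant injection
```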
arXiv Detail & Related papers (2022-06-21T20:38:13Z) - Learning to Estimate Without Bias [57.82628598276623]
The Gauss-Markov theorem states that the weighted least squares estimator is the linear minimum variance unbiased estimator (MVUE) in linear models.
In this paper, we take a first step towards extending this result to nonlinear settings via deep learning with bias constraints.
A second motivation for the bias-constrained estimator (BCE) arises in applications where multiple estimates of the same unknown are averaged for improved performance.
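A minimal sketch of a bias-constrained objective, assuming the bias is estimated per batch and added to the MSE as a penalty; the penalty form and weight are assumptions, not necessarily the paper's formulation:

```python
import torch

def bias_constrained_loss(pred, target, lam=1.0):
    """MSE plus a penalty on the empirical bias, pushing the mean error towards zero
    so that averaging multiple estimates of the same unknown remains consistent."""
    mse = ((pred - target) ** 2).mean()
    bias = (pred - target).mean(dim=0)      # empirical bias per output dimension
    return mse + lam * (bias ** 2).sum()
```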
arXiv Detail & Related papers (2021-10-24T10:23:51Z) - A data-driven peridynamic continuum model for upscaling molecular dynamics [3.1196544696082613]
We propose a learning framework to extract, from molecular dynamics data, an optimal Linear Peridynamic Solid (LPS) model.
We provide sufficient well-posedness conditions for discretized LPS models with sign-changing influence functions.
This framework guarantees that the resulting model is mathematically well-posed, physically consistent, and that it generalizes well to settings that are different from the ones used during training.
arXiv Detail & Related papers (2021-08-04T07:07:47Z) - Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning [78.83598532168256]
Marginal-likelihood-based model selection is rarely used in deep learning due to estimation difficulties.
Our work shows that marginal likelihoods can improve generalization and be useful when validation data is unavailable.
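For reference, the textbook Laplace approximation to the log marginal likelihood that such model-selection schemes build on; the paper's scalable estimator itself is not reproduced here:

```python
import math
import torch

def laplace_log_marginal_likelihood(log_joint, theta_map, hessian):
    """log p(D) ~= log p(D, theta_MAP) + (d/2) * log(2*pi) - 0.5 * log det H,
    where H is the Hessian of the negative log joint at the MAP estimate."""
    d = theta_map.numel()
    _, logdet = torch.linalg.slogdet(hessian)
    return log_joint(theta_map) + 0.5 * d * math.log(2.0 * math.pi) - 0.5 * logdet
```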
arXiv Detail & Related papers (2021-04-11T09:50:24Z)