Accelerating the Training and Improving the Reliability of Machine-Learned Interatomic Potentials for Strongly Anharmonic Materials through Active Learning
- URL: http://arxiv.org/abs/2409.11808v1
- Date: Wed, 18 Sep 2024 08:52:30 GMT
- Title: Accelerating the Training and Improving the Reliability of Machine-Learned Interatomic Potentials for Strongly Anharmonic Materials through Active Learning
- Authors: Kisung Kang, Thomas A. R. Purcell, Christian Carbogno, Matthias Scheffler,
- Abstract summary: We show that an active learning scheme that combines MD with MLIPs (MLIP-MD) and uncertainty estimates can avoid such problematic predictions.
In this work, we show that an active learning scheme that combines MD with MLIPs (MLIP-MD) and uncertainty estimates can avoid such problematic predictions.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Molecular dynamics (MD) employing machine-learned interatomic potentials (MLIPs) serve as an efficient, urgently needed complement to ab initio molecular dynamics (aiMD). By training these potentials on data generated from ab initio methods, their averaged predictions can exhibit comparable performance to ab initio methods at a fraction of the cost. However, insufficient training sets might lead to an improper description of the dynamics in strongly anharmonic materials, because critical effects might be overlooked in relevant cases, or only incorrectly captured, or hallucinated by the MLIP when they are not actually present. In this work, we show that an active learning scheme that combines MD with MLIPs (MLIP-MD) and uncertainty estimates can avoid such problematic predictions. In short, efficient MLIP-MD is used to explore configuration space quickly, whereby an acquisition function based on uncertainty estimates and on energetic viability is employed to maximize the value of the newly generated data and to focus on the most unfamiliar but reasonably accessible regions of phase space. To verify our methodology, we screen over 112 materials and identify 10 examples experiencing the aforementioned problems. Using CuI and AgGaSe$_2$ as archetypes for these problematic materials, we discuss the physical implications for strongly anharmonic effects and demonstrate how the developed active learning scheme can address these issues.
Related papers
- Physical formula enhanced multi-task learning for pharmacokinetics prediction [54.13787789006417]
A major challenge for AI-driven drug discovery is the scarcity of high-quality data.
We develop a formula enhanced mul-ti-task learning (PEMAL) method that predicts four key parameters of pharmacokinetics simultaneously.
Our experiments reveal that PEMAL significantly lowers the data demand, compared to typical Graph Neural Networks.
arXiv Detail & Related papers (2024-04-16T07:42:55Z) - Uncertainty-biased molecular dynamics for learning uniformly accurate interatomic potentials [25.091146216183144]
Active learning uses biased or unbiased molecular dynamics to generate candidate pools.
Existing biased and unbiased MD-simulation methods are prone to miss either rare events or extrapolative regions.
This work demonstrates that MD, when biased by the MLIP's energy uncertainty, simultaneously captures extrapolative regions and rare events.
arXiv Detail & Related papers (2023-12-03T14:39:14Z) - Transfer learning for chemically accurate interatomic neural network
potentials [0.0]
We show that pre-training the network parameters on data obtained from density functional calculations improves the sample efficiency of models trained on more accurate ab-initio data.
We provide GM-NN potentials pre-trained and fine-tuned on the ANI-1x and ANI-1ccx data sets, which can easily be fine-tuned on and applied to organic molecules.
arXiv Detail & Related papers (2022-12-07T19:21:01Z) - Accurate, reliable and interpretable solubility prediction of druglike
molecules with attention pooling and Bayesian learning [1.8275108630751844]
In silico prediction of solubility has been studied for its utility in virtual screening and lead optimization.
Recently, machine learning (ML) methods using experimental data has been popular because physics-based methods are not suitable for high- throughput tasks.
In this paper, we develop graph neural networks (GNNs) with the self-attention readout layer to improve prediction performance.
arXiv Detail & Related papers (2022-09-29T07:48:10Z) - Differentiable Agent-based Epidemiology [71.81552021144589]
We introduce GradABM: a scalable, differentiable design for agent-based modeling that is amenable to gradient-based learning with automatic differentiation.
GradABM can quickly simulate million-size populations in few seconds on commodity hardware, integrate with deep neural networks and ingest heterogeneous data sources.
arXiv Detail & Related papers (2022-07-20T07:32:02Z) - Using Machine Learning to Anticipate Tipping Points and Extrapolate to
Post-Tipping Dynamics of Non-Stationary Dynamical Systems [0.0]
We consider the machine learning task of predicting tipping point transitions and long-term post-tipping-point behavior.
We investigate the extent to which ML methods are capable of accomplishing useful results for this task, as well as conditions under which they fail.
The main conclusion of this paper is that ML-based approaches are promising tools for predicting the behavior of non-stationary dynamical systems.
arXiv Detail & Related papers (2022-07-01T16:06:12Z) - Benchmarking Heterogeneous Treatment Effect Models through the Lens of
Interpretability [82.29775890542967]
Estimating personalized effects of treatments is a complex, yet pervasive problem.
Recent developments in the machine learning literature on heterogeneous treatment effect estimation gave rise to many sophisticated, but opaque, tools.
We use post-hoc feature importance methods to identify features that influence the model's predictions.
arXiv Detail & Related papers (2022-06-16T17:59:05Z) - Coupled and Uncoupled Dynamic Mode Decomposition in Multi-Compartmental
Systems with Applications to Epidemiological and Additive Manufacturing
Problems [58.720142291102135]
We show that Dynamic Decomposition (DMD) may be a powerful tool when applied to nonlinear problems.
In particular, we show interesting numerical applications to a continuous delayed-SIRD model for Covid-19.
arXiv Detail & Related papers (2021-10-12T21:42:14Z) - BIGDML: Towards Exact Machine Learning Force Fields for Materials [55.944221055171276]
Machine-learning force fields (MLFF) should be accurate, computationally and data efficient, and applicable to molecules, materials, and interfaces thereof.
Here, we introduce the Bravais-Inspired Gradient-Domain Machine Learning approach and demonstrate its ability to construct reliable force fields using a training set with just 10-200 atoms.
arXiv Detail & Related papers (2021-06-08T10:14:57Z) - Deep Learning for Virtual Screening: Five Reasons to Use ROC Cost
Functions [80.12620331438052]
deep learning has become an important tool for rapid screening of billions of molecules in silico for potential hits containing desired chemical features.
Despite its importance, substantial challenges persist in training these models, such as severe class imbalance, high decision thresholds, and lack of ground truth labels in some datasets.
We argue in favor of directly optimizing the receiver operating characteristic (ROC) in such cases, due to its robustness to class imbalance.
arXiv Detail & Related papers (2020-06-25T08:46:37Z) - Automated discovery of a robust interatomic potential for aluminum [4.6028828826414925]
Machine learning (ML) based potentials aim for faithful emulation of quantum mechanics (QM) calculations at drastically reduced computational cost.
We present a highly automated approach to dataset construction using the principles of active learning (AL)
We demonstrate this approach by building an ML potential for aluminum (ANI-Al)
To demonstrate transferability, we perform a 1.3M atom shock simulation, and show that ANI-Al predictions agree very well with DFT calculations on local atomic environments sampled from the nonequilibrium dynamics.
arXiv Detail & Related papers (2020-03-10T19:06:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.