Considerations in the use of ML interaction potentials for free energy calculations
- URL: http://arxiv.org/abs/2403.13952v1
- Date: Wed, 20 Mar 2024 19:49:21 GMT
- Title: Considerations in the use of ML interaction potentials for free energy calculations
- Authors: Orlando A. Mendible, Jonathan K. Whitmer, Yamil J. Colón
- Abstract summary: Machine learning potentials (MLPs) offer the potential to accurately model the energy and free energy landscapes of molecules.
We examined how the distribution of collective variables (CVs) in the training data affects accuracy in determining the free energy surface (FES) of systems.
Findings for butane revealed that training data coverage of key FES regions ensures model accuracy regardless of CV distribution.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning potentials (MLPs) offer the potential to accurately model the energy and free energy landscapes of molecules with the precision of quantum mechanics and an efficiency similar to classical simulations. This research focuses on using equivariant graph neural networks MLPs due to their proven effectiveness in modeling equilibrium molecular trajectories. A key issue addressed is the capability of MLPs to accurately predict free energies and transition states by considering both the energy and the diversity of molecular configurations. We examined how the distribution of collective variables (CVs) in the training data affects MLP accuracy in determining the free energy surface (FES) of systems, using Metadynamics simulations for butane and alanine dipeptide (ADP). The study involved training forty-three MLPs, half based on classical molecular dynamics data and the rest on ab initio computed energies. The MLPs were trained using different distributions that aim to replicate hypothetical scenarios of sampled CVs obtained if the underlying FES of the system was unknown. Findings for butane revealed that training data coverage of key FES regions ensures model accuracy regardless of CV distribution. However, missing significant FES regions led to correct potential energy predictions but failed free energy reconstruction. For ADP, models trained on classical dynamics data were notably less accurate, while ab initio-based MLPs predicted potential energy well but faltered on free energy predictions. These results emphasize the challenge of assembling an all-encompassing training set for accurate FES prediction and highlight the importance of understanding the FES in preparing training data. The study points out the limitations of MLPs in free energy calculations, stressing the need for comprehensive data that encompasses the system's full FES for effective model training.
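The abstract's central point is that free energy reconstruction hinges on how well the sampled collective-variable (CV) distribution covers the FES. As a minimal illustration (our own toy, not code from the paper), the sketch below estimates a 1D free energy profile from CV samples via Boltzmann inversion, F(s) = -kB T ln P(s); bins with no samples leave gaps that no model can fill, which is the coverage problem the authors describe.

```python
import numpy as np

# Hypothetical illustration (not from the paper): reconstructing a 1D free
# energy surface (FES) from sampled collective-variable (CV) values via
# F(s) = -kB*T*ln P(s), where P(s) is the sampled CV distribution.

KB = 0.0019872041  # Boltzmann constant in kcal/(mol*K)

def fes_from_cv_samples(cv_samples, bins=50, temperature=300.0):
    """Estimate a free energy profile by histogramming CV samples."""
    counts, edges = np.histogram(cv_samples, bins=bins, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    # Bins with zero samples leave gaps in the FES -- the coverage problem
    # the abstract highlights: unsampled regions cannot be reconstructed.
    prob = np.where(counts > 0, counts, np.nan)
    fes = -KB * temperature * np.log(prob)
    fes -= np.nanmin(fes)  # shift so the global minimum is zero
    return centers, fes

# Example: CV samples from two unequally populated "wells" of a double-well FES
rng = np.random.default_rng(0)
samples = np.concatenate([rng.normal(-1.0, 0.2, 8000),
                          rng.normal(1.0, 0.2, 2000)])
centers, fes = fes_from_cv_samples(samples)
```

The more heavily sampled well comes out as the global free energy minimum, and the less sampled well sits higher by roughly kB T ln(8000/2000).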
Related papers
- Ensemble Knowledge Distillation for Machine Learning Interatomic Potentials [34.82692226532414]
Machine learning interatomic potentials (MLIPs) are a promising tool to accelerate atomistic simulations and molecular property prediction.
The quality of MLIPs depends on the quantity of available training data as well as the quantum chemistry (QC) level of theory used to generate that data.
We present an ensemble knowledge distillation (EKD) method to improve MLIP accuracy when trained to energy-only datasets.
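One way to picture distillation for an energy-only dataset (a sketch under our own assumptions, not the authors' implementation): fit an ensemble of teachers to the energies and use the gradient of their mean prediction as surrogate force labels that a student can then be trained against.

```python
import numpy as np

# Toy sketch (assumptions, not the paper's code): ensemble knowledge
# distillation for an energy-only dataset. Teachers are independent fits to
# the energies; the ensemble-mean gradient supplies surrogate force labels.

rng = np.random.default_rng(1)
x = np.linspace(-1, 1, 40)
energy = x**4 - x**2  # "quantum chemistry" energies with no force labels

# Ensemble of teachers: polynomial fits on bootstrap resamples of the data.
teachers = []
for _ in range(5):
    idx = rng.integers(0, len(x), len(x))
    teachers.append(np.polyfit(x[idx], energy[idx], deg=6))

# Distilled force labels: negative derivative of the ensemble-mean energy.
ensemble_force = -np.mean(
    [np.polyval(np.polyder(c), x) for c in teachers], axis=0)

# A student would now be trained on (energy, ensemble_force) pairs; here we
# fit one more polynomial to the energies as a stand-in for the student.
student = np.polyfit(x, energy, deg=6)
student_force = -np.polyval(np.polyder(student), x)
```

Because the distilled forces come from the teachers rather than from extra quantum-chemistry calculations, the student gets force supervision "for free" from an energy-only dataset.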
arXiv Detail & Related papers (2025-03-18T14:32:51Z)
- Potential Score Matching: Debiasing Molecular Structure Sampling with Potential Energy Guidance [11.562962976129292]
We propose Potential Score Matching (PSM), an approach that utilizes the potential energy gradient to guide generative models.
PSM does not require exact energy functions and can debias sample distributions even when trained on limited and biased data.
The results demonstrate that molecular distributions generated by PSM more closely approximate the Boltzmann distribution compared to traditional diffusion models.
arXiv Detail & Related papers (2025-03-18T11:27:28Z)
- Enhancing Machine Learning Potentials through Transfer Learning across Chemical Elements [0.0]
Machine Learning Potentials (MLPs) can enable simulations of ab initio accuracy at orders of magnitude lower computational cost.
Here, we introduce transfer learning of potential energy surfaces between chemically similar elements.
We demonstrate that transfer learning surpasses traditional training from scratch in force prediction, leading to more stable simulations and improved temperature transferability.
arXiv Detail & Related papers (2025-02-19T08:20:54Z)
- Scaling Laws for Predicting Downstream Performance in LLMs [75.28559015477137]
This work focuses on the pre-training loss as a more computation-efficient metric for performance estimation.
We present FLP-M, a fundamental approach for performance prediction that addresses the practical need to integrate datasets from multiple sources during pre-training.
arXiv Detail & Related papers (2024-10-11T04:57:48Z)
- Towards a Theoretical Understanding of Memorization in Diffusion Models [76.85077961718875]
Diffusion probabilistic models (DPMs) are being employed as mainstream models for Generative Artificial Intelligence (GenAI).
We provide a theoretical understanding of memorization in both conditional and unconditional DPMs under the assumption of model convergence.
We propose a novel data extraction method named Surrogate condItional Data Extraction (SIDE) that leverages a time-dependent classifier trained on the generated data as a surrogate condition to extract training data from unconditional DPMs.
arXiv Detail & Related papers (2024-10-03T13:17:06Z)
- Hydrogen under Pressure as a Benchmark for Machine-Learning Interatomic Potentials [0.0]
Machine-learning interatomic potentials (MLPs) are fast, data-driven surrogate models of atomistic systems' potential energy surfaces.
We present a benchmark that automatically quantifies MLP performance on the liquid-liquid phase transition in hydrogen under pressure.
arXiv Detail & Related papers (2024-09-20T10:44:40Z)
- Generalizability of Graph Neural Network Force Fields for Predicting Solid-State Properties [8.405078403907241]
Machine-learned force fields (MLFFs) promise to offer a computationally efficient alternative to ab initio simulations for complex molecular systems.
This work investigates the ability of a graph neural network (GNN)-based MLFF to describe solid-state phenomena not explicitly included during training.
arXiv Detail & Related papers (2024-09-16T02:14:26Z)
- Physics-Informed Weakly Supervised Learning for Interatomic Potentials [17.165117198519248]
We introduce a physics-informed, weakly supervised approach for training machine-learned interatomic potentials.
We demonstrate reduced energy and force errors -- often lower by a factor of two -- for various baseline models and benchmark data sets.
arXiv Detail & Related papers (2024-07-23T12:49:04Z)
- Extracting Training Data from Unconditional Diffusion Models [76.85077961718875]
Diffusion probabilistic models (DPMs) are being employed as mainstream models for generative artificial intelligence (AI).
We aim to establish a theoretical understanding of memorization in DPMs with 1) a memorization metric for theoretical analysis, 2) an analysis of conditional memorization with informative and random labels, and 3) two better evaluation metrics for measuring memorization.
Based on the theoretical analysis, we propose a novel data extraction method called Surrogate condItional Data Extraction (SIDE) that leverages a time-dependent classifier trained on the generated data as a surrogate condition to extract training data directly from unconditional diffusion models.
arXiv Detail & Related papers (2024-06-18T16:20:12Z)
- Enhancing Fast Feed Forward Networks with Load Balancing and a Master Leaf Node [49.08777822540483]
Fast feedforward networks (FFFs) exploit the observation that different regions of the input space activate distinct subsets of neurons in wide networks.
We propose the incorporation of load balancing and Master Leaf techniques into the FFF architecture to improve performance and simplify the training process.
arXiv Detail & Related papers (2024-05-27T05:06:24Z)
- Analysis of Atom-level pretraining with Quantum Mechanics (QM) data for Graph Neural Networks Molecular property models [0.0]
We show how atom-level pretraining with quantum mechanics (QM) data can mitigate violations of assumptions regarding the distributional similarity between training and test data.
This is the first time that hidden state molecular representations are analyzed to compare the effects of molecule-level and atom-level pretraining on QM data.
arXiv Detail & Related papers (2024-05-23T17:51:05Z)
- Overcoming systematic softening in universal machine learning interatomic potentials by fine-tuning [3.321322648845526]
Machine learning interatomic potentials (MLIPs) have introduced a new paradigm for atomic simulations.
Recent advancements have seen the emergence of universal MLIPs (uMLIPs) that are pre-trained on diverse materials datasets.
However, their performance in extrapolating to out-of-distribution complex atomic environments remains unclear.
arXiv Detail & Related papers (2024-05-11T22:30:47Z)
- PiRD: Physics-informed Residual Diffusion for Flow Field Reconstruction [5.06136344261226]
CNN-based methods for data fidelity enhancement rely on low-fidelity data patterns and distributions during the training phase.
Our proposed model - Physics-informed Residual Diffusion - demonstrates the capability to elevate the quality of data from standard low-fidelity inputs.
Experimental results have shown that our approach can effectively reconstruct high-quality outcomes for two-dimensional turbulent flows without requiring retraining.
arXiv Detail & Related papers (2024-04-12T11:45:51Z)
- Self-Consistency Training for Density-Functional-Theory Hamiltonian Prediction [74.84850523400873]
We show that Hamiltonian prediction possesses a self-consistency principle, based on which we propose self-consistency training.
It enables the model to be trained on a large amount of unlabeled data, and hence addresses the data scarcity challenge.
It is more efficient than running DFT to generate labels for supervised training, since it amortizes DFT calculation over a set of queries.
arXiv Detail & Related papers (2024-03-14T16:52:57Z)
- Active learning of Boltzmann samplers and potential energies with quantum mechanical accuracy [1.7633275579210346]
We develop an approach combining enhanced sampling with deep generative models and active learning of a machine learning potential.
We apply this method to study the isomerization of an ultrasmall silver nanocluster, belonging to a set of systems with diverse applications in the fields of medicine and biology.
arXiv Detail & Related papers (2024-01-29T19:01:31Z)
- Uncertainty-biased molecular dynamics for learning uniformly accurate interatomic potentials [25.091146216183144]
Active learning uses biased or unbiased molecular dynamics to generate candidate pools.
Existing biased and unbiased MD-simulation methods are prone to miss either rare events or extrapolative regions.
This work demonstrates that MD, when biased by the MLIP's energy uncertainty, simultaneously captures extrapolative regions and rare events.
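A toy 1D picture of uncertainty-biased dynamics (our own construction, not the paper's algorithm): bias overdamped Langevin dynamics with the gradient of the ensemble's energy disagreement, so the walker is pushed toward extrapolative regions where surrogate models differ, while thermal noise still drives barrier crossings.

```python
import numpy as np

# Toy 1D sketch (hypothetical): overdamped Langevin dynamics biased by the
# gradient of an ensemble's energy disagreement. The bias pushes the walker
# toward regions where the surrogate models disagree (extrapolative regions).

rng = np.random.default_rng(3)

# "Ensemble" of surrogates that agree near x = 0 and diverge for |x| > 1.
models = [lambda x, a=a: 0.5 * x**2 + a * max(abs(x) - 1.0, 0.0) ** 2
          for a in (0.5, -0.5, 0.2)]

def uncertainty(x):
    # Ensemble disagreement: standard deviation of the predicted energies.
    return float(np.std([m(x) for m in models]))

def biased_force(x, kappa, h=1e-4):
    mean_force = -np.mean([(m(x + h) - m(x - h)) / (2 * h) for m in models])
    bias = kappa * (uncertainty(x + h) - uncertainty(x - h)) / (2 * h)
    return mean_force + bias  # bias pushes uphill in model disagreement

def run(kappa, steps=20000, dt=1e-3, kT=0.3):
    x, traj = 0.0, []
    for _ in range(steps):
        x += dt * biased_force(x, kappa) + np.sqrt(2 * kT * dt) * rng.normal()
        traj.append(x)
    return np.array(traj)

traj = run(kappa=1.0)  # a mild bias keeps the dynamics confined
```

At x = 1.5 the unbiased mean force points back toward the origin, while a sufficiently strong uncertainty bias flips the net force outward, toward the extrapolative region the ensemble disagrees about.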
arXiv Detail & Related papers (2023-12-03T14:39:14Z)
- Physics Inspired Hybrid Attention for SAR Target Recognition [61.01086031364307]
We propose a physics inspired hybrid attention (PIHA) mechanism and the once-for-all (OFA) evaluation protocol to address the issues.
PIHA leverages the high-level semantics of physical information to activate and guide the feature groups that are aware of the local semantics of the target.
Our method outperforms other state-of-the-art approaches in 12 test scenarios with the same ASC parameters.
arXiv Detail & Related papers (2023-09-27T14:39:41Z)
- SPIDE: A Purely Spike-based Method for Training Feedback Spiking Neural Networks [56.35403810762512]
Spiking neural networks (SNNs) with event-based computation are promising brain-inspired models for energy-efficient applications on neuromorphic hardware.
We study spike-based implicit differentiation on the equilibrium state (SPIDE) that extends the recently proposed training method.
arXiv Detail & Related papers (2023-02-01T04:22:59Z)
- Evaluating the Transferability of Machine-Learned Force Fields for Material Property Modeling [2.494740426749958]
We present a more comprehensive set of benchmarking tests for evaluating the transferability of machine-learned force fields.
We use a graph neural network (GNN)-based force field coupled with the OpenMM package to carry out MD simulations for Argon.
Our results show that the model can accurately capture the behavior of the solid phase only when the configurations from the solid phase are included in the training dataset.
arXiv Detail & Related papers (2023-01-10T00:25:48Z)
- Pre-training via Denoising for Molecular Property Prediction [53.409242538744444]
We describe a pre-training technique that utilizes large datasets of 3D molecular structures at equilibrium.
Inspired by recent advances in noise regularization, our pre-training objective is based on denoising.
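A minimal sketch of the denoising idea (our own toy, not the paper's setup): perturb equilibrium coordinates with Gaussian noise and regress the injected noise from the noisy coordinates. For Gaussian data the optimal predictor is proportional to the displacement from the mean, i.e. to the score of the noise-broadened distribution, which is what makes denoising a useful pre-training signal.

```python
import numpy as np

# Hypothetical toy: pre-training via denoising. Noise added to equilibrium
# coordinates is regressed from the noisy coordinates; for Gaussian data the
# optimal slope is sigma_noise^2 / (sigma_data^2 + sigma_noise^2).

rng = np.random.default_rng(4)
sigma_data, sigma_noise = 0.5, 0.2
mu = np.array([1.0, -2.0, 0.5])                 # toy "equilibrium" coordinates
coords = mu + sigma_data * rng.normal(size=(10000, 3))
noise = sigma_noise * rng.normal(size=(10000, 3))
noisy = coords + noise

# Linear "denoising model": predict the injected noise from noisy coordinates.
X = np.hstack([noisy, np.ones((len(noisy), 1))])  # affine features
W, *_ = np.linalg.lstsq(X, noise, rcond=None)

slope = W[:3].diagonal().mean()
expected = sigma_noise**2 / (sigma_data**2 + sigma_noise**2)
```

The fitted slope recovers the closed-form value, confirming that the denoising target is (up to a known scale) the score of the noisy data distribution.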
arXiv Detail & Related papers (2022-05-31T22:28:34Z)
- Pseudo-Spherical Contrastive Divergence [119.28384561517292]
We propose pseudo-spherical contrastive divergence (PS-CD) to generalize maximum likelihood learning of energy-based models.
PS-CD avoids the intractable partition function and provides a generalized family of learning objectives.
arXiv Detail & Related papers (2021-11-01T09:17:15Z)
- Prediction of liquid fuel properties using machine learning models with Gaussian processes and probabilistic conditional generative learning [56.67751936864119]
The present work aims to construct cheap-to-compute machine learning (ML) models to act as closure equations for predicting the physical properties of alternative fuels.
Those models can be trained using the database from MD simulations and/or experimental measurements in a data-fusion-fidelity approach.
The results show that ML models can accurately predict fuel properties over a wide range of pressure and temperature conditions.
arXiv Detail & Related papers (2021-10-18T14:43:50Z)
- Training Deep Energy-Based Models with f-Divergence Minimization [113.97274898282343]
Deep energy-based models (EBMs) are very flexible in distribution parametrization but computationally challenging.
We propose a general variational framework termed f-EBM to train EBMs using any desired f-divergence.
Experimental results demonstrate the superiority of f-EBM over contrastive divergence, as well as the benefits of training EBMs using f-divergences other than KL.
arXiv Detail & Related papers (2020-03-06T23:11:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.