Data Distillation for Neural Network Potentials toward Foundational Dataset
- URL: http://arxiv.org/abs/2311.05407v1
- Date: Thu, 9 Nov 2023 14:41:45 GMT
- Title: Data Distillation for Neural Network Potentials toward Foundational Dataset
- Authors: Gang Seob Jung, Sangkeun Lee, Jong Youl Choi
- Abstract summary: Generative models can swiftly propose promising materials for targeted applications.
However, the properties of materials predicted by generative models often do not match those calculated through ab initio calculations.
This study utilized extended ensemble molecular dynamics (MD) to secure a broad range of liquid- and solid-phase configurations of a representative metallic system, nickel.
We found that the NNP trained on the distilled data could predict different energy-minimized close-packed crystal structures even though those structures were not explicitly part of the initial data.
- Score: 6.373914211316965
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Machine learning (ML) techniques and atomistic modeling have rapidly
transformed materials design and discovery. Specifically, generative models can
swiftly propose promising materials for targeted applications. However, the
properties of materials predicted by generative models often do not match those
calculated through ab initio calculations. This discrepancy can arise because
the generated coordinates are not fully relaxed, whereas many properties are
derived from relaxed structures. Neural network-based potentials (NNPs) can
expedite the process by providing relaxed structures from the initially
generated ones. Nevertheless, acquiring data to train NNPs for this purpose can
be extremely challenging, as the data must encompass previously unknown
structures. This study utilized extended ensemble molecular dynamics (MD) to
secure a broad range of liquid- and solid-phase configurations of a
representative metallic system, nickel. We then significantly reduced this
dataset through active learning without losing much accuracy. We found that the
NNP trained on the distilled data could predict different energy-minimized
close-packed crystal structures even though those structures were not
explicitly part of the initial data. Furthermore, the data can be transferred
to other metallic systems (aluminum and niobium) without repeating the sampling
and distillation processes. Our approach to data acquisition and distillation
has demonstrated the potential to expedite NNP development and enhance
materials design and discovery by integrating generative models.
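The distillation step the abstract describes — shrinking a large pool of MD configurations via active learning — is commonly implemented as committee-based uncertainty selection. The sketch below is a hypothetical toy illustration of that idea, not the authors' code: linear surrogates stand in for NNPs, random vectors stand in for atomic-environment descriptors, and all names (`fit_committee`, `distill`, etc.) are invented for illustration.

```python
# Hypothetical sketch of ensemble-disagreement data distillation:
# from a large pool of MD configurations, keep only the points where a
# small committee of surrogate models disagrees most (active learning).
# The linear "energy model" here is a toy stand-in for a real NNP.
import numpy as np

rng = np.random.default_rng(0)

def fit_committee(X, y, n_models=4):
    """Fit a committee of linear surrogates on bootstrap resamples."""
    committee = []
    for _ in range(n_models):
        idx = rng.integers(0, len(X), size=len(X))
        # least-squares fit y ~ X @ w  (toy surrogate, not a real NNP)
        w, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
        committee.append(w)
    return committee

def disagreement(committee, X):
    """Standard deviation of committee predictions per configuration."""
    preds = np.stack([X @ w for w in committee])  # (n_models, pool_size)
    return preds.std(axis=0)

def distill(X_pool, y_pool, n_keep, n_seed=8, batch=4):
    """Greedily keep the configurations the committee is least sure about."""
    keep = list(range(n_seed))                  # small seed set
    while len(keep) < n_keep:
        committee = fit_committee(X_pool[keep], y_pool[keep])
        sigma = disagreement(committee, X_pool)
        sigma[keep] = -np.inf                   # never re-select kept points
        new = np.argsort(sigma)[-batch:]        # most uncertain configurations
        keep.extend(int(i) for i in new)
    return sorted(keep[:n_keep])

# Toy pool: 200 "configurations" with 3 descriptors and noisy energies.
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=200)
selected = distill(X, y, n_keep=32)
print(len(selected))  # 32 configurations retained out of 200
```

In a real workflow the committee would be an ensemble of NNPs, the descriptors would come from the MD trajectories, and labels for the selected configurations would be computed with ab initio methods before retraining.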
Related papers
- chemtrain: Learning Deep Potential Models via Automatic Differentiation and Statistical Physics [0.0]
Neural Networks (NNs) are promising models for refining the accuracy of molecular dynamics.
Chemtrain is a framework to learn sophisticated NN potential models through customizable training routines and advanced training algorithms.
arXiv Detail & Related papers (2024-08-28T15:14:58Z)
- Physics-Informed Neural Networks for Dynamic Process Operations with Limited Physical Knowledge and Data [38.39977540117143]
In chemical engineering, process data are expensive to acquire, and complex phenomena are difficult to fully model.
In particular, we focus on estimating states for which neither direct data nor observational equations are available.
We show that PINNs are capable of modeling processes when relatively few experimental data and only partially known mechanistic descriptions are available.
arXiv Detail & Related papers (2024-06-03T16:58:17Z)
- Towards End-to-End Structure Solutions from Information-Compromised Diffraction Data via Generative Deep Learning [6.617784410952713]
Machine learning (ML) and deep learning (DL) are promising approaches since they augment information in the degraded input signal with prior knowledge learned from large databases of already known structures.
Here we present a novel ML approach, a variational query-based multi-branch deep neural network that has the promise to be a robust but general tool to address this problem end-to-end.
The system achieves up to 93.4% average similarity with the ground truth on unseen materials, both with known and partially-known chemical composition information.
arXiv Detail & Related papers (2023-12-23T02:17:27Z)
- Assessing Neural Network Representations During Training Using Noise-Resilient Diffusion Spectral Entropy [55.014926694758195]
Entropy and mutual information in neural networks provide rich information on the learning process.
We leverage data geometry to access the underlying manifold and reliably compute these information-theoretic measures.
We show that they form noise-resistant measures of intrinsic dimensionality and relationship strength in high-dimensional simulated data.
arXiv Detail & Related papers (2023-12-04T01:32:42Z)
- Scalable Diffusion for Materials Generation [99.71001883652211]
We develop UniMat, a unified crystal representation that can represent any crystal structure.
UniMat can generate high fidelity crystal structures from larger and more complex chemical systems.
We propose additional metrics for evaluating generative models of materials.
arXiv Detail & Related papers (2023-10-18T15:49:39Z)
- On the importance of catalyst-adsorbate 3D interactions for relaxed energy predictions [98.70797778496366]
We investigate whether it is possible to predict a system's relaxed energy in the OC20 dataset while ignoring the relative position of the adsorbate.
We find that while removing binding site information impairs accuracy as expected, modified models are able to predict relaxed energies with remarkably decent MAE.
arXiv Detail & Related papers (2023-10-10T14:57:04Z)
- Neural FIM for learning Fisher Information Metrics from point cloud data [71.07939200676199]
We propose neural FIM, a method for computing the Fisher information metric (FIM) from point cloud data.
We demonstrate its utility in selecting parameters for the PHATE visualization method, as well as its ability to obtain information on local volume, illuminating branching points and cluster centers in embeddings of a toy dataset and two single-cell datasets of iPSC reprogramming and PBMCs (immune cells).
arXiv Detail & Related papers (2023-06-01T17:36:13Z)
- Implicit Geometry and Interaction Embeddings Improve Few-Shot Molecular Property Prediction [53.06671763877109]
We develop molecular embeddings that encode complex molecular characteristics to improve the performance of few-shot molecular property prediction.
Our approach leverages large amounts of synthetic data, namely the results of molecular docking calculations.
On multiple molecular property prediction benchmarks, training from the embedding space substantially improves Multi-Task, MAML, and Prototypical Network few-shot learning performance.
arXiv Detail & Related papers (2023-02-04T01:32:40Z)
- Transfer learning for chemically accurate interatomic neural network potentials [0.0]
We show that pre-training the network parameters on data obtained from density functional calculations improves the sample efficiency of models trained on more accurate ab-initio data.
We provide GM-NN potentials pre-trained and fine-tuned on the ANI-1x and ANI-1ccx data sets, which can easily be fine-tuned for and applied to organic molecules.
arXiv Detail & Related papers (2022-12-07T19:21:01Z)
- Learning neural network potentials from experimental data via Differentiable Trajectory Reweighting [0.0]
Top-down approaches that learn neural network (NN) potentials directly from experimental data have received less attention.
We present the Differentiable Trajectory Reweighting (DiffTRe) method, which bypasses differentiation through the MD simulation for time-independent observables.
We show effectiveness of DiffTRe in learning NN potentials for an atomistic model of diamond and a coarse-grained model of water based on diverse experimental observables.
arXiv Detail & Related papers (2021-06-02T13:10:43Z)
- Graph Neural Network for Hamiltonian-Based Material Property Prediction [56.94118357003096]
We present and compare several different graph convolution networks that are able to predict the band gap for inorganic materials.
The models are developed to incorporate two different features: the information of each orbital itself and the interactions between orbitals.
The results show that our model achieves promising prediction accuracy under cross-validation.
arXiv Detail & Related papers (2020-05-27T13:32:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.