Uncertainty quantification for predictions of atomistic neural networks
- URL: http://arxiv.org/abs/2207.06916v1
- Date: Thu, 14 Jul 2022 13:39:43 GMT
- Title: Uncertainty quantification for predictions of atomistic neural networks
- Authors: Luis Itza Vazquez-Salazar, Eric D. Boittier, and Markus Meuwly
- Abstract summary: This paper explores the value of uncertainty quantification for the predictions of neural networks (NNs) trained on quantum chemical reference data.
The architecture of the PhysNet NN was suitably modified and the resulting model was evaluated with different metrics to quantify calibration, quality of predictions, and whether prediction error and the predicted uncertainty can be correlated.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The value of uncertainty quantification for the predictions of neural
networks (NNs) trained on quantum chemical reference data is quantitatively explored.
For this, the architecture of the PhysNet NN was suitably modified and the
resulting model was evaluated with different metrics to quantify calibration,
quality of predictions, and whether prediction error and the predicted
uncertainty can be correlated. The results from training on the QM9 database
and evaluating data from the test set within and outside the distribution
indicate that error and uncertainty are not linearly related. The results
clarify that noise and redundancy complicate property prediction for molecules
even in cases for which changes - e.g. double bond migration in two otherwise
identical molecules - are small. The model was then applied to a real database
of tautomerization reactions. Analysis of the distance between members in
feature space combined with other parameters shows that redundant information
in the training dataset can lead to large variances and small errors whereas
the presence of similar but unspecific information returns large errors but
small variances. This was, e.g., observed for nitro-containing aliphatic chains
for which predictions were difficult although the training set contained
several examples of nitro groups bound to aromatic molecules. This underlines
the importance of the composition of the training data and provides chemical
insight into how this affects the prediction capabilities of an ML model.
Finally, the approach put forward can be used for information-based improvement
of chemical databases for target applications through active learning
optimization.
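As a concrete illustration of the kind of evaluation described in the abstract (and not code from the paper itself), the sketch below computes two standard diagnostics for a model that returns a mean and a variance per sample: the Spearman rank correlation between absolute error and predicted standard deviation, and the empirical coverage of Gaussian prediction intervals as a simple calibration check. The function names, array shapes, and toy data are illustrative assumptions.

```python
# Illustrative error-uncertainty diagnostics (not the code used in the paper).
# Assumes per-sample reference values, predicted means, and predicted variances.
import numpy as np
from scipy.stats import norm, spearmanr


def error_uncertainty_correlation(y_true, y_pred, y_var):
    """Spearman rank correlation between |error| and predicted standard deviation."""
    abs_err = np.abs(y_true - y_pred)
    rho, _ = spearmanr(abs_err, np.sqrt(y_var))
    return rho


def coverage_curve(y_true, y_pred, y_var, levels=np.linspace(0.1, 0.9, 9)):
    """Empirical coverage of central Gaussian prediction intervals.
    For a well-calibrated model, coverage should track the confidence level."""
    half_width = np.sqrt(y_var)[:, None] * norm.ppf(0.5 + levels / 2.0)[None, :]
    inside = np.abs(y_true - y_pred)[:, None] <= half_width
    return levels, inside.mean(axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y_true = rng.normal(size=1000)
    y_var = rng.uniform(0.05, 0.5, size=1000)            # heteroscedastic toy noise
    y_pred = y_true + rng.normal(scale=np.sqrt(y_var))   # calibrated by construction
    print("Spearman rho:", error_uncertainty_correlation(y_true, y_pred, y_var))
    print("Coverage:", coverage_curve(y_true, y_pred, y_var)[1])
```

A low rank correlation with good coverage, or vice versa, is exactly the kind of mismatch between error and predicted uncertainty that the abstract reports.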
Related papers
- Data-Error Scaling in Machine Learning on Natural Discrete Combinatorial Mutation-prone Sets: Case Studies on Peptides and Small Molecules [0.0]
We investigate trends in the data-error scaling behavior of machine learning (ML) models trained on discrete spaces that are prone to mutation.
In contrast to typical data-error scaling, our results showed discontinuous monotonic phase transitions during learning.
We present an alternative strategy to normalize learning curves and the concept of mutant-based shuffling.
arXiv Detail & Related papers (2024-05-08T16:04:50Z)
- Uncertainty Quantification for Molecular Property Predictions with Graph Neural Architecture Search [2.711812013460678]
We introduce AutoGNNUQ, an automated uncertainty quantification (UQ) approach for molecular property prediction.
Our approach employs variance decomposition to separate data (aleatoric) and model (epistemic) uncertainties, providing valuable insights for reducing them (a generic sketch of this decomposition appears after this list).
AutoGNNUQ has broad applicability in domains such as drug discovery and materials science, where accurate uncertainty quantification is crucial for decision-making.
arXiv Detail & Related papers (2023-07-19T20:03:42Z)
- Transition Role of Entangled Data in Quantum Machine Learning [51.6526011493678]
Entanglement serves as a key resource for quantum computing.
Recent progress has highlighted its positive impact on learning quantum dynamics.
We establish a quantum no-free-lunch (NFL) theorem for learning quantum dynamics using entangled data.
arXiv Detail & Related papers (2023-06-06T08:06:43Z)
- Learning inducing points and uncertainty on molecular data by scalable variational Gaussian processes [0.0]
We show that variational learning of the inducing points in a molecular descriptor space improves the prediction of energies and atomic forces on two molecular dynamics datasets.
We extend our study to a large molecular crystal system, showing that variational GP models perform well for predicting atomic forces by efficiently learning a sparse representation of the dataset.
arXiv Detail & Related papers (2022-07-16T10:41:41Z)
- Analyzing the Effects of Handling Data Imbalance on Learned Features from Medical Images by Looking Into the Models [50.537859423741644]
Training a model on an imbalanced dataset can introduce unique challenges to the learning problem.
We look deeper into the internal units of neural networks to observe how handling data imbalance affects the learned features.
arXiv Detail & Related papers (2022-04-04T09:38:38Z)
- Calibrated Uncertainty for Molecular Property Prediction using Ensembles of Message Passing Neural Networks [11.47132155400871]
We extend a message passing neural network designed specifically for predicting properties of molecules and materials.
We show that our approach results in accurate models for predicting molecular formation energies with calibrated uncertainty.
arXiv Detail & Related papers (2021-07-13T13:28:11Z)
- When in Doubt: Neural Non-Parametric Uncertainty Quantification for Epidemic Forecasting [70.54920804222031]
Most existing forecasting models disregard uncertainty quantification, resulting in mis-calibrated predictions.
Recent works in deep neural models for uncertainty-aware time-series forecasting also have several limitations.
We model the forecasting task as a probabilistic generative process and propose a functional neural process model called EPIFNP.
arXiv Detail & Related papers (2021-06-07T18:31:47Z)
- Improving Uncertainty Calibration via Prior Augmented Data [56.88185136509654]
Neural networks have proven successful at learning from complex data distributions by acting as universal function approximators.
They are often overconfident in their predictions, which leads to inaccurate and miscalibrated probabilistic predictions.
We propose a solution by seeking out regions of feature space where the model is unjustifiably overconfident, and conditionally raising the entropy of those predictions towards that of the prior distribution of the labels.
arXiv Detail & Related papers (2021-02-22T07:02:37Z)
- Bayesian Graph Neural Networks for Molecular Property Prediction [15.160090982544867]
This study benchmarks a set of Bayesian methods applied to a directed MPNN, using the QM9 regression dataset.
We find that capturing uncertainty in both readout and message passing parameters yields enhanced predictive accuracy, calibration, and performance on a downstream molecular search task.
arXiv Detail & Related papers (2020-11-25T22:32:54Z)
- Learning Invariances in Neural Networks [51.20867785006147]
We show how to parameterize a distribution over augmentations and optimize the training loss simultaneously with respect to the network parameters and augmentation parameters.
We can recover the correct set and extent of invariances on image classification, regression, segmentation, and molecular property prediction from a large space of augmentations.
arXiv Detail & Related papers (2020-10-22T17:18:48Z)
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
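The variance decomposition mentioned in the AutoGNNUQ entry above has a standard ensemble-based formulation: the mean of the members' predicted variances approximates the aleatoric (data) part, and the variance of the members' predicted means approximates the epistemic (model) part. The sketch below illustrates this generic decomposition; it is not the AutoGNNUQ implementation, and the function name, array shapes, and toy numbers are assumptions.

```python
# Generic ensemble-based decomposition of predictive uncertainty into
# aleatoric (data) and epistemic (model) parts; illustrative only, not AutoGNNUQ.
import numpy as np


def decompose_uncertainty(member_means, member_vars):
    """member_means, member_vars: arrays of shape (n_members, n_samples).

    Each ensemble member predicts a Gaussian (mean, variance) per sample.
    Returns per-sample aleatoric, epistemic, and total variances."""
    aleatoric = member_vars.mean(axis=0)   # average predicted data noise
    epistemic = member_means.var(axis=0)   # spread of the members' mean predictions
    return aleatoric, epistemic, aleatoric + epistemic


# Toy usage: a 5-member ensemble predicting 3 samples
means = np.array([[1.0, 2.0, 3.0],
                  [1.1, 2.1, 2.9],
                  [0.9, 1.9, 3.2],
                  [1.0, 2.2, 3.1],
                  [1.2, 1.8, 2.8]])
variances = np.full_like(means, 0.05)
print(decompose_uncertainty(means, variances))
```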