Related papers: Single-model uncertainty quantification in neural network potentials does not consistently outperform model ensembles

Single-model uncertainty quantification in neural network potentials does not consistently outperform model ensembles

URL: http://arxiv.org/abs/2305.01754v1
Date: Tue, 2 May 2023 19:41:17 GMT
Title: Single-model uncertainty quantification in neural network potentials does not consistently outperform model ensembles
Authors: Aik Rui Tan, Shingo Urata, Samuel Goldman, Johannes C.B. Dietschreit and Rafael G\'omez-Bombarelli
Abstract summary: Neural networks (NNs) often assign high confidence to their predictions, even for points far out-of-distribution. Uncertainty quantification (UQ) is a challenge when they are employed to model interatomic potentials in materials systems. Differentiable UQ techniques can find new informative data and drive active learning loops for robust potentials.
Score: 0.7499722271664145
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Neural networks (NNs) often assign high confidence to their predictions, even for points far out-of-distribution, making uncertainty quantification (UQ) a challenge. When they are employed to model interatomic potentials in materials systems, this problem leads to unphysical structures that disrupt simulations, or to biased statistics and dynamics that do not reflect the true physics. Differentiable UQ techniques can find new informative data and drive active learning loops for robust potentials. However, a variety of UQ techniques, including newly developed ones, exist for atomistic simulations and there are no clear guidelines for which are most effective or suitable for a given case. In this work, we examine multiple UQ schemes for improving the robustness of NN interatomic potentials (NNIPs) through active learning. In particular, we compare incumbent ensemble-based methods against strategies that use single, deterministic NNs: mean-variance estimation, deep evidential regression, and Gaussian mixture models. We explore three datasets ranging from in-domain interpolative learning to more extrapolative out-of-domain generalization challenges: rMD17, ammonia inversion, and bulk silica glass. Performance is measured across multiple metrics relating model error to uncertainty. Our experiments show that none of the methods consistently outperformed each other across the various metrics. Ensembling remained better at generalization and for NNIP robustness; MVE only proved effective for in-domain interpolation, while GMM was better out-of-domain; and evidential regression, despite its promise, was not the preferable alternative in any of the cases. More broadly, cost-effective, single deterministic models cannot yet consistently match or outperform ensembling for uncertainty quantification in NNIPs.

Related papers

Partially-Supervised Neural Network Model For Quadratic Multiparametric Programming [2.765106384328772]
This study proposes a partially-supervised NN architecture that directly represents the mathematical structure of the global solution function.<n>In contrast to generic NN training approaches, the proposed PSNN method derives a large proportion of model weights directly from the mathematical properties of the optimization problem.
arXiv Detail & Related papers (2025-06-05T20:26:18Z)
Fréchet Cumulative Covariance Net for Deep Nonlinear Sufficient Dimension Reduction with Random Objects [22.156257535146004]
We introduce a new statistical dependence measure termed Fr'echet Cumulative Covariance (FCCov) and develop a novel nonlinear SDR framework based on FCCov. Our approach is not only applicable to complex non-Euclidean data, but also exhibits robustness against outliers. We prove that our method with squared Frobenius norm regularization achieves unbiasedness at the $sigma$-field level.
arXiv Detail & Related papers (2025-02-21T10:55:50Z)
Solving Inverse Problems with Model Mismatch using Untrained Neural Networks within Model-based Architectures [14.551812310439004]
We introduce an untrained forward model residual block within the model-based architecture to match the data consistency in the measurement domain for each instance. Our approach offers a unified solution that is less parameter-sensitive, requires no additional data, and enables simultaneous fitting of the forward model and reconstruction in a single pass.
arXiv Detail & Related papers (2024-03-07T19:02:13Z)
DiffHybrid-UQ: Uncertainty Quantification for Differentiable Hybrid Neural Modeling [4.76185521514135]
We introduce a novel method, DiffHybrid-UQ, for effective and efficient uncertainty propagation and estimation in hybrid neural differentiable models. Specifically, our approach effectively discerns and quantifies both aleatoric uncertainties, arising from data noise, and epistemic uncertainties, resulting from model-form discrepancies and data sparsity.
arXiv Detail & Related papers (2023-12-30T07:40:47Z)
Uncertainty Quantification in Multivariable Regression for Material Property Prediction with Bayesian Neural Networks [37.69303106863453]
We introduce an approach for uncertainty quantification (UQ) within physics-informed BNNs. We present case studies for predicting the creep rupture life of steel alloys. The most promising framework for creep life prediction is BNNs based on Markov Chain Monte Carlo approximation of the posterior distribution of network parameters.
arXiv Detail & Related papers (2023-11-04T19:40:16Z)
MISNN: Multiple Imputation via Semi-parametric Neural Networks [9.594714330925703]
Multiple imputation (MI) has been widely applied to missing value problems in biomedical, social and econometric research. We propose MISNN, a novel and efficient algorithm that incorporates feature selection for MI.
arXiv Detail & Related papers (2023-05-02T21:45:36Z)
Learning Low Dimensional State Spaces with Overparameterized Recurrent Neural Nets [57.06026574261203]
We provide theoretical evidence for learning low-dimensional state spaces, which can also model long-term memory. Experiments corroborate our theory, demonstrating extrapolation via learning low-dimensional state spaces with both linear and non-linear RNNs.
arXiv Detail & Related papers (2022-10-25T14:45:15Z)
A Free Lunch with Influence Functions? Improving Neural Network Estimates with Concepts from Semiparametric Statistics [41.99023989695363]
We explore the potential for semiparametric theory to be used to improve neural networks and machine learning algorithms. We propose a new neural network method MultiNet, which seeks the flexibility and diversity of an ensemble using a single architecture.
arXiv Detail & Related papers (2022-02-18T09:35:51Z)
Uncertainty Modeling for Out-of-Distribution Generalization [56.957731893992495]
We argue that the feature statistics can be properly manipulated to improve the generalization ability of deep learning models. Common methods often consider the feature statistics as deterministic values measured from the learned features. We improve the network generalization ability by modeling the uncertainty of domain shifts with synthesized feature statistics during training.
arXiv Detail & Related papers (2022-02-08T16:09:12Z)
Trustworthy Multimodal Regression with Mixture of Normal-inverse Gamma Distributions [91.63716984911278]
We introduce a novel Mixture of Normal-Inverse Gamma distributions (MoNIG) algorithm, which efficiently estimates uncertainty in principle for adaptive integration of different modalities and produces a trustworthy regression result. Experimental results on both synthetic and different real-world data demonstrate the effectiveness and trustworthiness of our method on various multimodal regression tasks.
arXiv Detail & Related papers (2021-11-11T14:28:12Z)
Post-mortem on a deep learning contest: a Simpson's paradox and the complementary roles of scale metrics versus shape metrics [61.49826776409194]
We analyze a corpus of models made publicly-available for a contest to predict the generalization accuracy of neural network (NN) models. We identify what amounts to a Simpson's paradox: where "scale" metrics perform well overall but perform poorly on sub partitions of the data. We present two novel shape metrics, one data-independent, and the other data-dependent, which can predict trends in the test accuracy of a series of NNs.
arXiv Detail & Related papers (2021-06-01T19:19:49Z)
Attribute-Guided Adversarial Training for Robustness to Natural Perturbations [64.35805267250682]
We propose an adversarial training approach which learns to generate new samples so as to maximize exposure of the classifier to the attributes-space. Our approach enables deep neural networks to be robust against a wide range of naturally occurring perturbations.
arXiv Detail & Related papers (2020-12-03T10:17:30Z)
Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized Structural equation models (SEMs) We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these neural networks using a gradient descent. For the first time we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z)

This list is automatically generated from the titles and abstracts of the papers in this site.