Fast emulation of density functional theory simulations using
approximate Gaussian processes
- URL: http://arxiv.org/abs/2208.11302v1
- Date: Wed, 24 Aug 2022 05:09:36 GMT
- Title: Fast emulation of density functional theory simulations using
approximate Gaussian processes
- Authors: Steven Stetzler, Michael Grosskopf, Earl Lawrence
- Abstract summary: A second statistical model that predicts the simulation output can be used in lieu of the full simulation during model fitting.
We use the emulators to calibrate, in a Bayesian manner, the density functional theory (DFT) model parameters using observed data.
The utility of these DFT models is to make predictions, based on observed data, about the properties of experimentally unobserved nuclides.
- Score: 0.6445605125467573
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Fitting a theoretical model to experimental data in a Bayesian manner using
Markov chain Monte Carlo typically requires one to evaluate the model thousands
(or millions) of times. When the model is a slow-to-compute physics simulation,
Bayesian model fitting becomes infeasible. To remedy this, a second statistical
model that predicts the simulation output -- an "emulator" -- can be used in
lieu of the full simulation during model fitting. A typical emulator of choice
is the Gaussian process (GP), a flexible, non-linear model that provides both a
predictive mean and variance at each input point. Gaussian process regression
works well for small amounts of training data ($n < 10^3$), but becomes slow to
train and use for prediction when the data set size becomes large. Various
methods can be used to speed up the Gaussian process in the medium-to-large
data set regime ($n > 10^5$), trading away predictive accuracy for drastically
reduced runtime. This work examines the accuracy-runtime trade-off of several
approximate Gaussian process models -- the sparse variational GP, stochastic
variational GP, and deep kernel learned GP -- when emulating the predictions of
density functional theory (DFT) models. Additionally, we use the emulators to
calibrate, in a Bayesian manner, the DFT model parameters using observed data,
resolving the computational barrier imposed by the data set size, and compare
calibration results to previous work. The utility of these calibrated DFT
models is to make predictions, based on observed data, about the properties of
experimentally unobserved nuclides of interest, e.g., super-heavy nuclei.
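To make the workflow concrete, here is a minimal sketch of a stochastic variational GP emulator. GPyTorch is assumed as the implementation library; the input dimension, kernel, inducing-point count, toy data, and training schedule are illustrative placeholders, not the paper's configuration.

```python
# Sketch of a stochastic variational GP (SVGP) emulator, assuming GPyTorch.
# Data, kernel, and sizes below are illustrative placeholders.
import torch
import gpytorch
from torch.utils.data import DataLoader, TensorDataset

class SVGPEmulator(gpytorch.models.ApproximateGP):
    def __init__(self, inducing_points):
        q_u = gpytorch.variational.CholeskyVariationalDistribution(
            inducing_points.size(0))
        strategy = gpytorch.variational.VariationalStrategy(
            self, inducing_points, q_u, learn_inducing_locations=True)
        super().__init__(strategy)
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(
            gpytorch.kernels.RBFKernel())

    def forward(self, x):
        return gpytorch.distributions.MultivariateNormal(
            self.mean_module(x), self.covar_module(x))

# Toy stand-in for (DFT parameters -> observable) training pairs.
train_x = torch.rand(20_000, 4)
train_y = torch.sin(2 * torch.pi * train_x.sum(-1))
loader = DataLoader(TensorDataset(train_x, train_y),
                    batch_size=1024, shuffle=True)

model = SVGPEmulator(inducing_points=train_x[:500].clone())
likelihood = gpytorch.likelihoods.GaussianLikelihood()
mll = gpytorch.mlls.VariationalELBO(likelihood, model,
                                    num_data=train_y.numel())
optimizer = torch.optim.Adam(
    list(model.parameters()) + list(likelihood.parameters()), lr=0.01)

model.train(); likelihood.train()
for epoch in range(5):                 # minibatch ELBO maximization
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = -mll(model(xb), yb)     # negative ELBO
        loss.backward()
        optimizer.step()

model.eval(); likelihood.eval()
with torch.no_grad():                  # predictive mean and variance
    pred = likelihood(model(torch.rand(10, 4)))
    mean, var = pred.mean, pred.variance
```

Given a trained emulator, the calibration step can be sketched as a random-walk Metropolis loop that queries the emulator instead of running DFT. The uniform prior, proposal scale, observed value, and error model below are hypothetical.

```python
# Hypothetical random-walk Metropolis calibration using the emulator above
# in place of DFT. Prior, proposal scale, data, and noise are assumptions.
observed_y = torch.tensor([0.25])      # placeholder "experimental" datum
obs_noise = 0.05                       # placeholder measurement error

def log_posterior(theta):
    if ((theta < 0) | (theta > 1)).any():
        return float("-inf")           # uniform prior on [0, 1]^4
    with torch.no_grad():
        pred = likelihood(model(theta.unsqueeze(0)))
    resid = observed_y - pred.mean
    sigma2 = pred.variance + obs_noise**2  # emulator + experimental error
    return float(-0.5 * (resid**2 / sigma2 + sigma2.log()).sum())

theta = torch.full((4,), 0.5)
logp = log_posterior(theta)
samples = []
for _ in range(5_000):
    proposal = theta + 0.05 * torch.randn(4)
    logp_new = log_posterior(proposal)
    if torch.rand(()).log() < logp_new - logp:   # MH accept/reject
        theta, logp = proposal, logp_new
    samples.append(theta.clone())
```

Each posterior step then costs one emulator evaluation rather than one DFT simulation, which is what makes Bayesian calibration by MCMC feasible.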
Related papers
- Computation-Aware Gaussian Processes: Model Selection And Linear-Time Inference [55.150117654242706]
We show that model selection for computation-aware GPs trained on 1.8 million data points can be done within a few hours on a single GPU.
As a result of this work, Gaussian processes can be trained on large-scale datasets without significantly compromising their ability to quantify uncertainty.
arXiv Detail & Related papers (2024-11-01T21:11:48Z)
- Diffusion posterior sampling for simulation-based inference in tall data settings [53.17563688225137]
Simulation-based inference (SBI) can approximate the posterior distribution that relates input parameters to a given observation.
In this work, we consider a tall data extension in which multiple observations are available to better infer the parameters of the model.
We compare our method to recently proposed competing approaches on various numerical experiments and demonstrate its superiority in terms of numerical stability and computational cost.
arXiv Detail & Related papers (2024-04-11T09:23:36Z)
- Fusion of Gaussian Processes Predictions with Monte Carlo Sampling [61.31380086717422]
In science and engineering, we often work with models designed for accurate prediction of variables of interest.
Recognizing that these models are approximations of reality, it becomes desirable to apply multiple models to the same data and integrate their outcomes.
arXiv Detail & Related papers (2024-03-03T04:21:21Z)
- Probabilistic Unrolling: Scalable, Inverse-Free Maximum Likelihood Estimation for Latent Gaussian Models [69.22568644711113]
We introduce probabilistic unrolling, a method that combines Monte Carlo sampling with iterative linear solvers to circumvent matrix inversions.
Our theoretical analyses reveal that unrolling and backpropagation through the iterations of the solver can accelerate gradient estimation for maximum likelihood estimation.
In experiments on simulated and real data, we demonstrate that probabilistic unrolling learns latent Gaussian models up to an order of magnitude faster than gradient EM, with minimal loss in model performance (see the sketch after this list).
arXiv Detail & Related papers (2023-06-05T21:08:34Z)
- Learning Large-scale Subsurface Simulations with a Hybrid Graph Network Simulator [57.57321628587564]
We introduce Hybrid Graph Network Simulator (HGNS) for learning reservoir simulations of 3D subsurface fluid flows.
HGNS consists of a subsurface graph neural network (SGNN) to model the evolution of fluid flows, and a 3D-U-Net to model the evolution of pressure.
Using an industry-standard subsurface flow dataset (SPE-10) with 1.1 million cells, we demonstrate that HGNS reduces inference time by up to a factor of 18 compared to standard subsurface simulators.
arXiv Detail & Related papers (2022-06-15T17:29:57Z)
- Inverting brain grey matter models with likelihood-free inference: a tool for trustable cytoarchitecture measurements [62.997667081978825]
Characterisation of the brain grey matter cytoarchitecture with quantitative sensitivity to soma density and volume remains an unsolved challenge in dMRI.
We propose a new forward model, specifically a new system of equations, requiring a few relatively sparse b-shells.
We then apply modern tools from Bayesian analysis known as likelihood-free inference (LFI) to invert our proposed model.
arXiv Detail & Related papers (2021-11-15T09:08:27Z)
- Transfer learning suppresses simulation bias in predictive models built from sparse, multi-modal data [15.587831925516957]
Many problems in science, engineering, and business require making predictions based on very few observations.
To build a robust predictive model, these sparse data may need to be augmented with simulated data, especially when the design space is multidimensional.
We combine recent developments in deep learning to build more robust predictive models from multimodal data.
arXiv Detail & Related papers (2021-04-19T23:28:32Z)
- Designing Accurate Emulators for Scientific Processes using Calibration-Driven Deep Models [33.935755695805724]
Learn-by-Calibrating (LbC) is a novel deep learning approach for designing emulators in scientific applications.
We show that LbC provides significant improvements in generalization error over widely-adopted loss function choices.
LbC achieves high-quality emulators even in small data regimes and more importantly, recovers the inherent noise structure without any explicit priors.
arXiv Detail & Related papers (2020-05-05T16:54:11Z)
- Scaled Vecchia approximation for fast computer-model emulation [0.6445605125467573]
We adapt and extend a powerful class of GP methods from spatial statistics to enable the scalable analysis and emulation of large computer experiments.
Our methods are highly scalable, enabling estimation, joint prediction and simulation in near-linear time in the number of model runs.
arXiv Detail & Related papers (2020-05-01T14:08:31Z)
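The "Probabilistic Unrolling" entry above names its mechanism, so a minimal sketch may help. The toy covariance model, single scalar parameter, and probe count below are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the inverse-free idea behind "Probabilistic Unrolling":
# unrolled conjugate-gradient (CG) solves replace matrix inversions, and
# Monte Carlo probes estimate the log-determinant gradient. The covariance
# model and scalar parameter below are toy assumptions, not the paper's.
import torch

def cg_solve(matvec, b, num_iters=50, tol=1e-12):
    """Unrolled CG for K x = b; autograd can backpropagate through it."""
    x = torch.zeros_like(b)
    r, p = b.clone(), b.clone()
    rs_old = r @ r
    for _ in range(num_iters):
        Ap = matvec(p)
        alpha = rs_old / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        if rs_new < tol:
            break
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x

torch.manual_seed(0)
n = 200
B = torch.randn(n, 5)                          # low-rank factor (toy)
y = torch.randn(n)                             # observations (toy)
theta = torch.tensor(0.3, requires_grad=True)  # log noise variance

def matvec(v):                     # K(theta) v, with K = e^theta I + B B^T
    return torch.exp(theta) * v + B @ (B.T @ v)

# Gaussian NLL = 0.5 y^T K^{-1} y + 0.5 log det K + const.
# Quadratic term: gradient comes from backprop through the unrolled solve.
quad = y @ cg_solve(matvec, y)

# Trace term: d(log det K)/d(theta) = tr(K^{-1} dK/dtheta), estimated with
# Hutchinson probes; the solve is detached, only K(theta) z carries gradient.
num_probes = 10
logdet_surrogate = torch.zeros(())
for _ in range(num_probes):
    z = torch.randn(n)
    with torch.no_grad():
        Kinv_z = cg_solve(matvec, z)
    logdet_surrogate = logdet_surrogate + (Kinv_z @ matvec(z)) / num_probes

loss = 0.5 * quad + 0.5 * logdet_surrogate
loss.backward()                                # theta.grad ~= dNLL/dtheta

with torch.no_grad():                          # dense check, feasible at n=200
    K = torch.exp(theta) * torch.eye(n) + B @ B.T
    Kinv = torch.linalg.inv(K)
    a = Kinv @ y
    exact = 0.5 * torch.exp(theta) * (Kinv.trace() - a @ a)
print(f"unrolled gradient: {theta.grad:.4f}   exact gradient: {exact:.4f}")
```

Backpropagating through the unrolled CG iterations supplies the quadratic-term gradient, while the Hutchinson loop handles the trace term; more probes reduce the estimator's variance at the cost of extra solves.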
This list is automatically generated from the titles and abstracts of the papers on this site.