Gradient-flow adaptive importance sampling for Bayesian leave one out
cross-validation for sigmoidal classification models
- URL: http://arxiv.org/abs/2402.08151v1
- Date: Tue, 13 Feb 2024 01:03:39 GMT
- Title: Gradient-flow adaptive importance sampling for Bayesian leave one out
cross-validation for sigmoidal classification models
- Authors: Joshua C Chang, Xiangting Li, Shixin Xu, Hao-Ren Yao, Julia Porcino,
Carson Chow
- Abstract summary: We introduce gradient-flow-guided adaptive importance sampling (IS) transformations to stabilize Monte-Carlo approximations of point-wise leave one out cross-validated (LOO) predictions for Bayesian classification models.
We derive closed-form exact formulae for Jacobian determinants in the cases of logistic regression and shallow ReLU-activated artificial neural networks.
- Score: 1.0485739694839669
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce a set of gradient-flow-guided adaptive importance sampling (IS)
transformations to stabilize Monte-Carlo approximations of point-wise leave one
out cross-validated (LOO) predictions for Bayesian classification models. One
can leverage this methodology for assessing model generalizability by, for
instance, computing a LOO analogue to the AIC or computing LOO ROC/PRC curves
and derived metrics like the AUROC and AUPRC. By the calculus of variations and
gradient flow, we derive two simple nonlinear single-step transformations that
utilize gradient information to shift a model's pre-trained full-data posterior
closer to the target LOO posterior predictive distributions. In doing so, the
transformations stabilize importance weights. Because the transformations
involve the gradient of the likelihood function, the resulting Monte Carlo
integral depends on Jacobian determinants with respect to the model Hessian. We
derive closed-form exact formulae for these Jacobian determinants in the cases
of logistic regression and shallow ReLU-activated artificial neural networks,
and provide a simple approximation that sidesteps the need to compute full
Hessian matrices and their spectra. We test the methodology on an $n\ll p$
dataset that is known to produce unstable LOO IS weights.
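The idea in the abstract can be sketched for Bayesian logistic regression. The code below is a minimal illustration, not the paper's implementation: it assumes a Gaussian prior with standard deviation `prior_sd`, assumes `theta` holds draws from the full-data posterior, and uses a single plain gradient step of size `h` as the shift, which need not match either of the two transformations the paper derives. All function names are hypothetical. For this step the Jacobian is the identity plus a rank-one term, so its determinant has a closed form by the matrix determinant lemma.

```python
import numpy as np

def log_sigmoid(z):
    # Numerically stable log(1 / (1 + exp(-z))).
    return -np.logaddexp(0.0, -z)

def log_lik(theta, X, y):
    # Pointwise Bernoulli-logit log-likelihoods.
    # theta: (S, p) samples, X: (n, p), y: (n,) in {0, 1}. Returns (S, n).
    z = theta @ X.T
    return y * log_sigmoid(z) + (1.0 - y) * log_sigmoid(-z)

def log_post(theta, X, y, prior_sd=1.0):
    # Unnormalized log posterior under an assumed N(0, prior_sd^2 I) prior.
    return log_lik(theta, X, y).sum(axis=1) - 0.5 * (theta ** 2).sum(axis=1) / prior_sd ** 2

def normalize(log_w):
    # Self-normalized weights computed stably in log space.
    w = np.exp(log_w - log_w.max())
    return w / w.sum()

def effective_sample_size(w):
    # Kish effective sample size for normalized weights.
    return 1.0 / np.sum(w ** 2)

def loo_weights_plain(theta, x_i, y_i):
    # Standard IS for LOO: w_s is proportional to 1 / l(y_i | theta_s, x_i).
    return normalize(-log_lik(theta, x_i[None, :], np.array([y_i]))[:, 0])

def shift(theta, x_i, y_i, h):
    # One gradient step that lowers the likelihood of the held-out point:
    # T(theta) = theta - h * grad log l, with grad log l = (y_i - sigmoid(z)) x_i.
    p = np.exp(log_sigmoid(theta @ x_i))
    return theta - h * (y_i - p)[:, None] * x_i[None, :]

def log_det_jacobian(theta, x_i, h):
    # J = I + h * p(1 - p) * x_i x_i^T, so det J = 1 + h * p(1 - p) * ||x_i||^2.
    p = np.exp(log_sigmoid(theta @ x_i))
    return np.log1p(h * p * (1.0 - p) * (x_i @ x_i))

def loo_weights_shifted(theta, X, y, i, h, prior_sd=1.0):
    # Weights for the transformed samples phi = T(theta):
    # target(phi) / proposal(phi) = post(phi) / l_i(phi) * |det J(theta)| / post(theta).
    x_i, y_i = X[i], y[i]
    phi = shift(theta, x_i, y_i, h)
    log_w = (log_post(phi, X, y, prior_sd)
             - log_lik(phi, x_i[None, :], np.array([y_i]))[:, 0]
             + log_det_jacobian(theta, x_i, h)
             - log_post(theta, X, y, prior_sd))
    return phi, normalize(log_w)
```

Given weights `w` and shifted samples `phi` for observation `i`, a LOO predictive probability could then be estimated as `np.sum(w * np.exp(log_sigmoid(phi @ X[i])))`, and `effective_sample_size(w)` gives a rough check on whether the shift actually stabilized the weights relative to `loo_weights_plain`.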
Related papers
- von Mises Quasi-Processes for Bayesian Circular Regression [57.88921637944379]
We explore a family of expressive and interpretable distributions over circle-valued random functions.
The resulting probability model has connections with continuous spin models in statistical physics.
For posterior inference, we introduce a new Stratonovich-like augmentation that lends itself to fast Markov Chain Monte Carlo sampling.
arXiv Detail & Related papers (2024-06-19T01:57:21Z)
- Variational Bayesian surrogate modelling with application to robust design optimisation [0.9626666671366836]
Surrogate models provide a quick-to-evaluate approximation to complex computational models.
We consider Bayesian inference for constructing statistical surrogates with input uncertainties and dimensionality reduction.
We demonstrate intrinsic and robust structural optimisation problems where cost functions depend on a weighted sum of the mean and standard deviation of model outputs.
arXiv Detail & Related papers (2024-04-23T09:22:35Z)
- Fusion of Gaussian Processes Predictions with Monte Carlo Sampling [61.31380086717422]
In science and engineering, we often work with models designed for accurate prediction of variables of interest.
Recognizing that these models are approximations of reality, it becomes desirable to apply multiple models to the same data and integrate their outcomes.
arXiv Detail & Related papers (2024-03-03T04:21:21Z)
- Out of the Ordinary: Spectrally Adapting Regression for Covariate Shift [12.770658031721435]
We propose a method for adapting the weights of the last layer of a pre-trained neural regression model to perform better on input data originating from a different distribution.
We demonstrate how this lightweight spectral adaptation procedure can improve out-of-distribution performance for synthetic and real-world datasets.
arXiv Detail & Related papers (2023-12-29T04:15:58Z)
- A probabilistic, data-driven closure model for RANS simulations with aleatoric, model uncertainty [1.8416014644193066]
We propose a data-driven, closure model for Reynolds-averaged Navier-Stokes (RANS) simulations that incorporates aleatoric, model uncertainty.
A fully Bayesian formulation is proposed, combined with a sparsity-inducing prior in order to identify regions in the problem domain where the parametric closure is insufficient.
arXiv Detail & Related papers (2023-07-05T16:53:31Z)
- Monte Carlo inference for semiparametric Bayesian regression [5.488491124945426]
This paper introduces a simple, general, and efficient strategy for joint posterior inference of an unknown transformation and all regression model parameters.
It delivers (1) joint posterior consistency under general conditions, including multiple model misspecifications, and (2) efficient Monte Carlo (not Markov chain Monte Carlo) inference for the transformation and all parameters for important special cases.
arXiv Detail & Related papers (2023-06-08T18:42:42Z)
- Refining Amortized Posterior Approximations using Gradient-Based Summary Statistics [0.9176056742068814]
We present an iterative framework to improve the amortized approximations of posterior distributions in the context of inverse problems.
We validate our method in a controlled setting by applying it to a stylized problem, and observe improved posterior approximations with each iteration.
arXiv Detail & Related papers (2023-05-15T15:47:19Z)
- Variational Laplace Autoencoders [53.08170674326728]
Variational autoencoders employ an amortized inference model to approximate the posterior of latent variables.
We present a novel approach that addresses the limited posterior expressiveness of the fully-factorized Gaussian assumption.
We also present a general framework named Variational Laplace Autoencoders (VLAEs) for training deep generative models.
arXiv Detail & Related papers (2022-11-30T18:59:27Z)
- Inverting brain grey matter models with likelihood-free inference: a tool for trustable cytoarchitecture measurements [62.997667081978825]
Characterisation of the brain grey matter cytoarchitecture with quantitative sensitivity to soma density and volume remains an unsolved challenge in dMRI.
We propose a new forward model, specifically a new system of equations, requiring a few relatively sparse b-shells.
We then apply modern tools from Bayesian analysis known as likelihood-free inference (LFI) to invert our proposed model.
arXiv Detail & Related papers (2021-11-15T09:08:27Z)
- Model Fusion with Kullback--Leibler Divergence [58.20269014662046]
We propose a method to fuse posterior distributions learned from heterogeneous datasets.
Our algorithm relies on a mean field assumption for both the fused model and the individual dataset posteriors.
arXiv Detail & Related papers (2020-07-13T03:27:45Z)
- Path Sample-Analytic Gradient Estimators for Stochastic Binary Networks [78.76880041670904]
In neural networks with binary activations and/or binary weights, training by gradient descent is complicated.
We propose a new method for this estimation problem combining sampling and analytic approximation steps.
We experimentally show higher accuracy in gradient estimation and demonstrate a more stable and better performing training in deep convolutional models.
arXiv Detail & Related papers (2020-06-04T21:51:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.