Statistical and Computational Trade-offs in Variational Inference: A
Case Study in Inferential Model Selection
- URL: http://arxiv.org/abs/2207.11208v2
- Date: Sun, 6 Aug 2023 05:38:40 GMT
- Title: Statistical and Computational Trade-offs in Variational Inference: A
Case Study in Inferential Model Selection
- Authors: Kush Bhatia, Nikki Lijing Kuang, Yi-An Ma, Yixin Wang
- Abstract summary: Variational inference has emerged as a popular alternative to the classical Markov chain Monte Carlo.
We study the statistical and computational trade-offs in variational inference via a case study in inferential model selection.
We prove that, given a fixed computation budget, a lower-rank inferential model produces variational posteriors with a higher statistical approximation error.
- Score: 27.817156428797567
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Variational inference has recently emerged as a popular alternative to the
classical Markov chain Monte Carlo (MCMC) in large-scale Bayesian inference.
The core idea is to trade statistical accuracy for computational efficiency. In
this work, we study these statistical and computational trade-offs in
variational inference via a case study in inferential model selection. Focusing
on Gaussian inferential models (or variational approximating families) with
diagonal plus low-rank precision matrices, we initiate a theoretical study of
the trade-offs in two aspects, Bayesian posterior inference error and
frequentist uncertainty quantification error. From the Bayesian posterior
inference perspective, we characterize the error of the variational posterior
relative to the exact posterior. We prove that, given a fixed computation
budget, a lower-rank inferential model produces variational posteriors with a
higher statistical approximation error, but a lower computational error; it
reduces variance in stochastic optimization and, in turn, accelerates
convergence. From the frequentist uncertainty quantification perspective, we
consider the precision matrix of the variational posterior as an uncertainty
estimate, which involves an additional statistical error originating from the
sampling uncertainty of the data. As a consequence, for small datasets, the
inferential model need not be full-rank to achieve optimal estimation error
(even with unlimited computation budget).
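The statistical side of this trade-off can be illustrated with a small numerical sketch. It is not the authors' procedure: it assumes a zero-mean Gaussian "exact posterior" with a randomly generated precision matrix, and it builds each diagonal-plus-low-rank precision by a simple spectral truncation rather than by optimizing a variational objective. The helper names `low_rank_precision` and `gaussian_kl` are hypothetical. Under these assumptions, the KL divergence from the approximation to the target shrinks as the rank grows, which mirrors the claim that lower-rank families carry a higher statistical approximation error. The computational side (gradient variance in stochastic optimization) is not modeled here.

```python
import numpy as np


def low_rank_precision(target_precision, rank):
    """Build a diagonal-plus-low-rank precision D + U U^T (illustrative heuristic).

    D is the smallest eigenvalue of the target times the identity, and U is
    formed from the top `rank` eigenpairs of the remainder. This is a spectral
    truncation, not a KL-optimal variational fit.
    """
    eigvals, eigvecs = np.linalg.eigh(target_precision)  # ascending eigenvalues
    dim = target_precision.shape[0]
    floor = eigvals[0]
    diagonal_part = floor * np.eye(dim)
    if rank == 0:
        return diagonal_part
    top_vals = np.maximum(eigvals[-rank:] - floor, 0.0)
    top_vecs = eigvecs[:, -rank:]
    U = top_vecs * np.sqrt(top_vals)  # scale each eigenvector column
    return diagonal_part + U @ U.T


def gaussian_kl(precision_q, precision_p):
    """KL(q || p) for zero-mean Gaussians specified by their precision matrices."""
    dim = precision_p.shape[0]
    trace_term = np.trace(np.linalg.solve(precision_q, precision_p))
    _, logdet_q = np.linalg.slogdet(precision_q)
    _, logdet_p = np.linalg.slogdet(precision_p)
    return 0.5 * (trace_term - dim + logdet_q - logdet_p)


# Assumed toy target: a random symmetric positive definite precision matrix.
rng = np.random.default_rng(0)
dim = 8
A = rng.standard_normal((dim, dim))
target = A @ A.T + np.eye(dim)

kl_by_rank = [
    gaussian_kl(low_rank_precision(target, rank), target) for rank in range(dim)
]
for rank, kl in enumerate(kl_by_rank):
    print(f"rank {rank}: KL(q || p) = {kl:.6f}")
```

In this construction the approximation and the target share eigenvectors, so each added rank removes one nonnegative term from the KL divergence, and rank `dim - 1` recovers the target exactly.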
Related papers
- Variational Bayesian surrogate modelling with application to robust design optimisation [0.9626666671366836]
Surrogate models provide a quick-to-evaluate approximation to complex computational models.
We consider Bayesian inference for constructing statistical surrogates with input uncertainties and dimensionality reduction.
We demonstrate intrinsic and robust structural optimisation problems where cost functions depend on a weighted sum of the mean and standard deviation of model outputs.
arXiv Detail & Related papers (2024-04-23T09:22:35Z)
- Scalable Bayesian inference for the generalized linear mixed model [2.45365913654612]
We introduce a statistical inference algorithm at the intersection of AI and Bayesian inference.
Our algorithm is an extension of gradient MCMC with novel contributions that address the treatment of correlated data.
We apply our algorithm to a large electronic health records database.
arXiv Detail & Related papers (2024-03-05T14:35:34Z)
- Calibrating Neural Simulation-Based Inference with Differentiable Coverage Probability [50.44439018155837]
We propose to include a calibration term directly into the training objective of the neural model.
By introducing a relaxation of the classical formulation of calibration error we enable end-to-end backpropagation.
It is directly applicable to existing computational pipelines allowing reliable black-box posterior inference.
arXiv Detail & Related papers (2023-10-20T10:20:45Z)
- Selective Nonparametric Regression via Testing [54.20569354303575]
We develop an abstention procedure via testing the hypothesis on the value of the conditional variance at a given point.
Unlike existing methods, the proposed one accounts not only for the value of the variance itself but also for the uncertainty of the corresponding variance predictor.
arXiv Detail & Related papers (2023-09-28T13:04:11Z)
- The Implicit Delta Method [61.36121543728134]
In this paper, we propose an alternative, the implicit delta method, which works by infinitesimally regularizing the training loss of uncertainty.
We show that the change in the evaluation due to regularization is consistent for the variance of the evaluation estimator, even when the infinitesimal change is approximated by a finite difference.
arXiv Detail & Related papers (2022-11-11T19:34:17Z)
- Theoretical characterization of uncertainty in high-dimensional linear classification [24.073221004661427]
We show that uncertainty for learning from a limited number of samples of high-dimensional input data and labels can be obtained by the approximate message passing algorithm.
We discuss how over-confidence can be mitigated by appropriately regularising, and show that cross-validating with respect to the loss leads to better calibration than with the 0/1 error.
arXiv Detail & Related papers (2022-02-07T15:32:07Z)
- Aleatoric uncertainty for Errors-in-Variables models in deep regression [0.48733623015338234]
We show how the concept of Errors-in-Variables can be used in Bayesian deep regression.
We discuss the approach along various simulated and real examples.
arXiv Detail & Related papers (2021-05-19T12:37:02Z)
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
- Balance-Subsampled Stable Prediction [55.13512328954456]
We propose a novel balance-subsampled stable prediction (BSSP) algorithm based on the theory of fractional factorial design.
A design-theoretic analysis shows that the proposed method can reduce the confounding effects among predictors induced by the distribution shift.
Numerical experiments on both synthetic and real-world data sets demonstrate that our BSSP algorithm significantly outperforms the baseline methods for stable prediction across unknown test data.
arXiv Detail & Related papers (2020-06-08T07:01:38Z)
- Instability, Computational Efficiency and Statistical Accuracy [101.32305022521024]
We develop a framework that yields statistical accuracy based on the interplay between the deterministic convergence rate of the algorithm at the population level and its degree of (in)stability when applied to an empirical object based on $n$ samples.
We provide applications of our general results to several concrete classes of models, including Gaussian mixture estimation, non-linear regression models, and informative non-response models.
arXiv Detail & Related papers (2020-05-22T22:30:52Z)
- Maximum likelihood estimation and uncertainty quantification for Gaussian process approximation of deterministic functions [10.319367855067476]
This article provides one of the first theoretical analyses in the context of Gaussian process regression with a noiseless dataset.
We show that the maximum likelihood estimation of the scale parameter alone provides significant adaptation against misspecification of the Gaussian process model.
arXiv Detail & Related papers (2020-01-29T17:20:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.