Posterior and Computational Uncertainty in Gaussian Processes
- URL: http://arxiv.org/abs/2205.15449v5
- Date: Mon, 9 Oct 2023 20:14:51 GMT
- Title: Posterior and Computational Uncertainty in Gaussian Processes
- Authors: Jonathan Wenger, Geoff Pleiss, Marvin Pförtner, Philipp Hennig, John P. Cunningham
- Abstract summary: Gaussian processes scale prohibitively with the size of the dataset.
Many approximation methods have been developed, which inevitably introduce approximation error.
This additional source of uncertainty, due to limited computation, is entirely ignored when using the approximate posterior.
We develop a new class of methods that provides consistent estimation of the combined uncertainty arising from both the finite number of data observed and the finite amount of computation expended.
- Score: 52.26904059556759
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Gaussian processes scale prohibitively with the size of the dataset. In
response, many approximation methods have been developed, which inevitably
introduce approximation error. This additional source of uncertainty, due to
limited computation, is entirely ignored when using the approximate posterior.
Therefore in practice, GP models are often as much about the approximation
method as they are about the data. Here, we develop a new class of methods that
provides consistent estimation of the combined uncertainty arising from both
the finite number of data observed and the finite amount of computation
expended. The most common GP approximations map to an instance in this class,
such as methods based on the Cholesky factorization, conjugate gradients, and
inducing points. For any method in this class, we prove (i) convergence of its
posterior mean in the associated RKHS, (ii) decomposability of its combined
posterior covariance into mathematical and computational covariances, and (iii)
that the combined variance is a tight worst-case bound for the squared error
between the method's posterior mean and the latent function. Finally, we
empirically demonstrate the consequences of ignoring computational uncertainty
and show how implicitly modeling it improves generalization performance on
benchmark datasets.
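As a concrete illustration of point (ii), the sketch below shows one way a combined posterior covariance can split into a mathematical and a computational part when the exact solve with the kernel matrix is replaced by a low-rank approximation C_i of (K + sigma^2 I)^{-1} built from a small number of actions. The kernel, data, action choice, and all variable names are illustrative assumptions for this sketch, not code from the paper.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    """Squared-exponential kernel matrix between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

# Toy regression problem (sizes, noise level, and kernel are assumptions).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)
Xs = np.linspace(-3, 3, 100)[:, None]
noise = 0.1**2

Khat = rbf_kernel(X, X) + noise * np.eye(len(X))  # K + sigma^2 I
Kxs = rbf_kernel(X, Xs)                           # cross-covariance k(X, x*)
Kss = rbf_kernel(Xs, Xs)                          # prior covariance at test inputs

# Low-rank approximation C_i of Khat^{-1} from i actions (columns of S).
# Unit-vector actions give a partial-Cholesky-style scheme; other choices
# (conjugate-gradient directions, inducing points) fit the same template.
i = 25
S = np.eye(len(X))[:, :i]
C_i = S @ np.linalg.solve(S.T @ Khat @ S, S.T)    # C_i = S (S^T Khat S)^{-1} S^T

mean = Kxs.T @ (C_i @ y)                          # approximate posterior mean
mathematical_cov = Kss - Kxs.T @ np.linalg.solve(Khat, Kxs)           # exact-GP posterior covariance
computational_cov = Kxs.T @ (np.linalg.solve(Khat, Kxs) - C_i @ Kxs)  # uncertainty from limited compute
combined_cov = Kss - Kxs.T @ C_i @ Kxs            # covariance reported by the approximate method

# The combined covariance is exactly the sum of the two components.
print(np.allclose(combined_cov, mathematical_cov + computational_cov))
```

As more actions are processed (i approaching n), C_i approaches (K + sigma^2 I)^{-1}, the computational covariance shrinks to zero, and the combined posterior recovers the exact mathematical posterior.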
Related papers
- Iterative Methods for Full-Scale Gaussian Process Approximations for Large Spatial Data [9.913418444556486]
We show how iterative methods can be used to reduce the computational cost of calculating likelihoods, gradients, and predictive distributions with full-scale approximations (FSAs).
We also present a novel, accurate, and fast way to calculate predictive variances relying on estimation techniques and iterative methods.
All methods are implemented in a free C++ software library with high-level Python and R packages.
arXiv Detail & Related papers (2024-05-23T12:25:22Z)
- Iterative Methods for Vecchia-Laplace Approximations for Latent Gaussian Process Models [11.141688859736805]
We introduce and analyze several preconditioners, derive new convergence results, and propose novel methods for accurately approximating predictive variances.
In particular, we obtain a speed-up of an order of magnitude compared to Cholesky-based calculations.
All methods are implemented in a free C++ software library with high-level Python and R packages.
arXiv Detail & Related papers (2023-10-18T14:31:16Z)
- Improved Convergence Rates for Sparse Approximation Methods in Kernel-Based Learning [48.08663378234329]
Kernel-based models such as kernel ridge regression and Gaussian processes are ubiquitous in machine learning applications.
Existing sparse approximation methods can yield a significant reduction in the computational cost.
We provide novel confidence intervals for the Nyström method and the sparse variational Gaussian process approximation method (a minimal sketch of the Nyström construction follows this entry).
arXiv Detail & Related papers (2022-02-08T17:22:09Z)
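For context, here is a minimal sketch of the standard Nyström construction referenced in the entry above, which approximates an n-by-n kernel matrix with a rank-m surrogate built from m landmark points. The landmark choice, kernel, and sizes are assumptions made for illustration; the paper's confidence intervals are not reproduced here.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    """Squared-exponential kernel matrix between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(1000, 1))                 # n training inputs
Z = X[rng.choice(len(X), size=50, replace=False)]      # m landmark (inducing) points

K_nm = rbf_kernel(X, Z)                                # (n, m) cross-kernel
K_mm = rbf_kernel(Z, Z) + 1e-8 * np.eye(len(Z))        # small jitter for stability

# Nystrom approximation: K is replaced by K_nm K_mm^{-1} K_mn, a rank-m
# surrogate that avoids forming and inverting the full n x n kernel matrix.
K_approx = K_nm @ np.linalg.solve(K_mm, K_nm.T)

K_full = rbf_kernel(X, X)
print("max abs approximation error:", np.abs(K_full - K_approx).max())
```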
- Learning to Estimate Without Bias [57.82628598276623]
The Gauss-Markov theorem states that the weighted least squares estimator is the linear minimum variance unbiased estimator (MVUE) in linear models.
In this paper, we take a first step towards extending this result to nonlinear settings via deep learning with bias constraints.
A second motivation for the resulting bias-constrained estimator (BCE) arises in applications where multiple estimates of the same unknown are averaged for improved performance.
arXiv Detail & Related papers (2021-10-24T10:23:51Z)
- Scalable Variational Gaussian Processes via Harmonic Kernel Decomposition [54.07797071198249]
We introduce a new scalable variational Gaussian process approximation which provides a high fidelity approximation while retaining general applicability.
We demonstrate that, on a range of regression and classification problems, our approach can exploit input space symmetries such as translations and reflections.
Notably, our approach achieves state-of-the-art results on CIFAR-10 among pure GP models.
arXiv Detail & Related papers (2021-06-10T18:17:57Z)
- Optimal oracle inequalities for solving projected fixed-point equations [53.31620399640334]
We study methods that use a collection of random observations to compute approximate solutions by searching over a known low-dimensional subspace of the Hilbert space.
We show how our results precisely characterize the error of a class of temporal difference learning methods for the policy evaluation problem with linear function approximation (a minimal sketch of that setting follows this entry).
arXiv Detail & Related papers (2020-12-09T20:19:32Z)
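As a hedged illustration of the setting those oracle inequalities apply to (not the paper's method itself), the sketch below runs TD(0) policy evaluation with linear function approximation on a small synthetic Markov chain. The chain, features, and step size are assumptions made up for this example.

```python
import numpy as np

# Policy evaluation with linear function approximation: estimate the value
# function as V(s) = phi(s)^T w via stochastic TD(0) updates.
rng = np.random.default_rng(0)
n_states, dim, gamma, alpha = 10, 4, 0.9, 0.01
Phi = rng.standard_normal((n_states, dim))             # feature vectors phi(s) as rows
P = rng.dirichlet(np.ones(n_states), size=n_states)    # transition matrix under a fixed policy
r = rng.uniform(0.0, 1.0, size=n_states)               # expected one-step reward per state

w = np.zeros(dim)
s = 0
for _ in range(100_000):
    s_next = rng.choice(n_states, p=P[s])
    td_error = r[s] + gamma * Phi[s_next] @ w - Phi[s] @ w
    w += alpha * td_error * Phi[s]                      # TD(0) step toward the projected fixed point
    s = s_next

# With suitable step sizes, TD(0) tracks the solution of the projected Bellman
# (fixed-point) equation, i.e. the best value-function approximation in span(Phi).
print("approximate state values:", Phi @ w)
```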
- The Shooting Regressor; Randomized Gradient-Based Ensembles [0.0]
An ensemble method is introduced that utilizes randomization and loss function gradients to compute a prediction.
Multiple weakly-correlated estimators approximate the gradient at randomly sampled points on the error surface and are aggregated into a final solution.
arXiv Detail & Related papers (2020-09-14T03:20:59Z)
- Mean-Field Approximation to Gaussian-Softmax Integral with Application to Uncertainty Estimation [23.38076756988258]
We propose a new single-model based approach to quantify uncertainty in deep neural networks.
We use a mean-field approximation formula to compute an analytically intractable integral.
Empirically, the proposed approach performs competitively when compared to state-of-the-art methods.
arXiv Detail & Related papers (2020-06-13T07:32:38Z)
- Consistent Online Gaussian Process Regression Without the Sample Complexity Bottleneck [14.309243378538012]
We propose an online compression scheme that fixes an error neighborhood with respect to the Hellinger metric centered at the current posterior.
For a constant error radius, the proposed parsimonious online GP (POG) converges to a neighborhood of the population posterior (Theorem 1(ii)), but with finite memory that is at worst determined by the metric entropy of the feature space.
arXiv Detail & Related papers (2020-04-23T11:52:06Z)
- SLEIPNIR: Deterministic and Provably Accurate Feature Expansion for Gaussian Process Regression with Derivatives [86.01677297601624]
We propose a novel approach for scaling GP regression with derivatives based on quadrature Fourier features.
We prove deterministic, non-asymptotic and exponentially fast decaying error bounds which apply for both the approximated kernel as well as the approximated posterior.
arXiv Detail & Related papers (2020-03-05T14:33:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.