Deep Neural Networks as Point Estimates for Deep Gaussian Processes
- URL: http://arxiv.org/abs/2105.04504v1
- Date: Mon, 10 May 2021 16:55:17 GMT
- Title: Deep Neural Networks as Point Estimates for Deep Gaussian Processes
- Authors: Vincent Dutordoir, James Hensman, Mark van der Wilk, Carl Henrik Ek,
Zoubin Ghahramani, Nicolas Durrande
- Abstract summary: We propose a sparse variational approximation for DGPs for which the approximate posterior mean has the same mathematical structure as a Deep Neural Network (DNN).
We make the forward pass through a DGP equivalent to a ReLU DNN by finding an interdomain transformation that represents the GP posterior mean as a sum of ReLU basis functions.
Experiments demonstrate improved accuracy and faster training compared to current DGP methods, while retaining favourable predictive uncertainties.
- Score: 44.585609003513625
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Gaussian processes (DGPs) have struggled for relevance in applications
due to the challenges and cost associated with Bayesian inference. In this
paper we propose a sparse variational approximation for DGPs for which the
approximate posterior mean has the same mathematical structure as a Deep Neural
Network (DNN). We make the forward pass through a DGP equivalent to a ReLU DNN
by finding an interdomain transformation that represents the GP posterior mean
as a sum of ReLU basis functions. This unification enables the initialisation
and training of the DGP as a neural network, leveraging the well established
practice in the deep learning community, and so greatly aiding the inference
task. The experiments demonstrate improved accuracy and faster training
compared to current DGP methods, while retaining favourable predictive
uncertainties.
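As a concrete illustration of the equivalence described in the abstract, the sketch below (an illustrative assumption, not the authors' code; W, b and beta are placeholders for the interdomain ReLU feature directions, offsets and variational mean weights) writes a sparse-GP posterior mean as a weighted sum of ReLU basis functions and checks that it coincides with the forward pass of a one-hidden-layer ReLU network with the same parameters.

```python
# Minimal sketch: a sparse variational GP whose posterior mean is built from
# ReLU basis functions computes the same quantity as a one-hidden-layer ReLU DNN.
# W, b, beta are illustrative placeholders, not values from the paper.
import numpy as np

rng = np.random.default_rng(0)
M, D = 64, 5                       # number of inducing basis functions, input dimension
W = rng.normal(size=(M, D))        # ReLU feature directions
b = rng.normal(size=M)             # ReLU feature offsets
beta = rng.normal(size=M)          # variational weights on the basis functions

def relu(z):
    return np.maximum(z, 0.0)

def gp_posterior_mean(X):
    """Posterior mean written as a sum of M ReLU basis functions."""
    return sum(beta[m] * relu(X @ W[m] + b[m]) for m in range(M))

def relu_dnn_forward(X):
    """One-hidden-layer ReLU network with the same parameters."""
    return relu(X @ W.T + b) @ beta

X = rng.normal(size=(10, D))
assert np.allclose(gp_posterior_mean(X), relu_dnn_forward(X))
```

Under this identification, initialising and training the DGP can reuse standard DNN tooling, which is the practical point made in the abstract.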
Related papers
- Fixing the NTK: From Neural Network Linearizations to Exact Convex Programs [63.768739279562105]
We show that for a particular choice of mask weights that do not depend on the learning targets, this kernel is equivalent to the NTK of the gated ReLU network on the training data.
A consequence of this lack of dependence on the targets is that the NTK cannot perform better than the optimal MKL kernel on the training set.
arXiv Detail & Related papers (2023-09-26T17:42:52Z)
- Linear Time GPs for Inferring Latent Trajectories from Neural Spike Trains [7.936841911281107]
We propose cvHM, a general inference framework for latent GP models leveraging Hida-Matérn kernels and conjugate variational inference (CVI).
We are able to perform variational inference of latent neural trajectories with linear time complexity for arbitrary likelihoods.
arXiv Detail & Related papers (2023-06-01T16:31:36Z)
- Vecchia Gaussian Process Ensembles on Internal Representations of Deep Neural Networks [0.0]
For regression tasks, standard Gaussian processes (GPs) provide natural uncertainty quantification, while deep neural networks (DNNs) excel at representation learning.
We propose to combine these two approaches in a hybrid method consisting of an ensemble of GPs built on the output of hidden layers of a DNN.
arXiv Detail & Related papers (2023-05-26T16:19:26Z)
- Variational Linearized Laplace Approximation for Bayesian Deep Learning [11.22428369342346]
We propose a new method for approximating the Linearized Laplace Approximation (LLA) using a variational sparse Gaussian Process (GP).
Our method is based on the dual RKHS formulation of GPs and retains, as the predictive mean, the output of the original DNN.
It allows for efficient optimization, which results in sub-linear training time in the size of the training dataset.
arXiv Detail & Related papers (2023-02-24T10:32:30Z)
- Guided Deep Kernel Learning [42.53025115287688]
We present a novel approach for learning deep kernels by utilizing infinite-width neural networks.
Our approach harnesses the reliable uncertainty estimation of the NNGPs to adapt the DKL target confidence when it encounters novel data points.
arXiv Detail & Related papers (2023-02-19T13:37:34Z)
- Non-Gaussian Gaussian Processes for Few-Shot Regression [71.33730039795921]
We propose an invertible ODE-based mapping that operates on each component of the random variable vectors and shares the parameters across all of them.
NGGPs outperform the competing state-of-the-art approaches on a diversified set of benchmarks and applications.
arXiv Detail & Related papers (2021-10-26T10:45:25Z)
- Incremental Ensemble Gaussian Processes [53.3291389385672]
We propose an incremental ensemble (IE-) GP framework, where an EGP meta-learner employs an ensemble of GP learners, each having a unique kernel belonging to a prescribed kernel dictionary.
With each GP expert leveraging the random feature-based approximation to perform online prediction and model update with scalability, the EGP meta-learner capitalizes on data-adaptive weights to synthesize the per-expert predictions.
The novel IE-GP is generalized to accommodate time-varying functions by modeling structured dynamics at the EGP meta-learner and within each GP learner.
arXiv Detail & Related papers (2021-10-13T15:11:25Z)
- Differentially private training of neural networks with Langevin dynamics for calibrated predictive uncertainty [58.730520380312676]
We show that differentially private stochastic gradient descent (DP-SGD) can yield poorly calibrated, overconfident deep learning models.
This represents a serious issue for safety-critical applications, e.g. in medical diagnosis.
arXiv Detail & Related papers (2021-07-09T08:14:45Z)
- Exploring the Uncertainty Properties of Neural Networks' Implicit Priors in the Infinite-Width Limit [47.324627920761685]
We use recent theoretical advances that characterize the function-space prior of an ensemble of infinitely-wide NNs as a Gaussian process (see the kernel sketch after this list).
This gives us a better understanding of the implicit prior NNs place on function space.
We also examine the calibration of previous approaches to classification with the NNGP.
arXiv Detail & Related papers (2020-10-14T18:41:54Z)
- Predicting the outputs of finite deep neural networks trained with noisy gradients [1.1470070927586014]
A recent line of work has studied wide deep neural networks (DNNs) by approximating them as Gaussian Processes (GPs).
Here we consider a DNN training protocol involving noise, weight decay and finite width, whose outcome corresponds to a certain non-Gaussian process.
An analytical framework is then introduced to analyze this non-Gaussian process, whose deviation from a GP is controlled by the finite width.
arXiv Detail & Related papers (2020-04-02T18:00:01Z)
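Several of the entries above rely on the Gaussian process induced by an infinitely wide network; the kernel sketch referenced in the infinite-width entry is given below. It is a minimal, generic illustration of the NNGP kernel recursion for fully connected ReLU layers (the arc-cosine expectation), not code from any of the listed papers, and the parameters depth, sigma_w2 and sigma_b2 are illustrative assumptions.

```python
# Minimal sketch of the NNGP kernel recursion for a fully connected ReLU
# network: the prior covariance an infinitely wide network assigns to a pair
# of inputs. Generic textbook construction; hyperparameters are placeholders.
import numpy as np

def relu_nngp_kernel(x1, x2, depth=3, sigma_w2=2.0, sigma_b2=0.0):
    d = x1.shape[0]
    # Input-layer covariances.
    k11 = sigma_b2 + sigma_w2 * (x1 @ x1) / d
    k22 = sigma_b2 + sigma_w2 * (x2 @ x2) / d
    k12 = sigma_b2 + sigma_w2 * (x1 @ x2) / d
    for _ in range(depth):
        # E[relu(u) relu(v)] for jointly Gaussian (u, v): the arc-cosine kernel.
        theta = np.arccos(np.clip(k12 / np.sqrt(k11 * k22), -1.0, 1.0))
        cross = np.sqrt(k11 * k22) * (np.sin(theta) + (np.pi - theta) * np.cos(theta)) / (2 * np.pi)
        k11, k22, k12 = (
            sigma_b2 + sigma_w2 * k11 / 2.0,   # E[relu(u)^2] = Var(u) / 2
            sigma_b2 + sigma_w2 * k22 / 2.0,
            sigma_b2 + sigma_w2 * cross,
        )
    return k12

x, y = np.ones(4), np.array([1.0, -1.0, 1.0, -1.0])
print(relu_nngp_kernel(x, y))  # prior covariance between f(x) and f(y)
```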