Tractable Approximate Gaussian Inference for Bayesian Neural Networks
- URL: http://arxiv.org/abs/2004.09281v3
- Date: Tue, 7 Dec 2021 13:37:54 GMT
- Title: Tractable Approximate Gaussian Inference for Bayesian Neural Networks
- Authors: James-A. Goulet, Luong Ha Nguyen and Saeid Amiri
- Abstract summary: We propose an analytical method for performing tractable approximate Gaussian inference (TAGI) in Bayesian neural networks.
The method has a computational complexity of $\mathcal{O}(n)$ with respect to the number of parameters $n$, and the tests performed on regression and classification benchmarks confirm that, for the same network architecture, it matches the performance of existing methods relying on gradient backpropagation.
- Score: 1.933681537640272
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we propose an analytical method for performing tractable
approximate Gaussian inference (TAGI) in Bayesian neural networks. The method
enables the analytical Gaussian inference of the posterior mean vector and
diagonal covariance matrix for weights and biases. The method proposed has a
computational complexity of $\mathcal{O}(n)$ with respect to the number of
parameters $n$, and the tests performed on regression and classification
benchmarks confirm that, for the same network architecture, it matches the
performance of existing methods relying on gradient backpropagation.
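At its core, TAGI propagates Gaussian means and diagonal (per-parameter) variances through the network and updates the parameters with closed-form linear-Gaussian conditioning rather than gradient backpropagation. The sketch below is a minimal illustration of that kind of diagonal-Gaussian update on a toy two-weight model; the function and variable names are illustrative and are not the paper's notation.

```python
import numpy as np

def gaussian_conditional_update(mu_theta, var_theta, cov_theta_y, mu_y, var_y, y_obs):
    """Closed-form Gaussian conditioning of the kind used layer-wise in TAGI-style inference.

    Given jointly Gaussian parameters theta and a scalar observation y, the posterior
    of theta | y_obs is Gaussian with the moments below. Only diagonal (per-parameter)
    variances are tracked, so the cost is O(n) in the number of parameters.
    """
    gain = cov_theta_y / var_y                     # per-parameter Kalman-style gain
    mu_post = mu_theta + gain * (y_obs - mu_y)     # posterior mean
    var_post = var_theta - gain * cov_theta_y      # posterior (diagonal) variance
    return mu_post, var_post

# Toy example: two weights with unit prior variance, observation y = w1 + w2 + noise.
mu_theta = np.array([0.0, 0.0])
var_theta = np.array([1.0, 1.0])
obs_noise_var = 0.1
mu_y = mu_theta.sum()
var_y = var_theta.sum() + obs_noise_var
cov_theta_y = var_theta                            # cov(w_i, y) = var(w_i) for this model
mu_post, var_post = gaussian_conditional_update(mu_theta, var_theta, cov_theta_y,
                                                mu_y, var_y, 1.0)
print(mu_post, var_post)
```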
Related papers
- Verification of Geometric Robustness of Neural Networks via Piecewise Linear Approximation and Lipschitz Optimisation [57.10353686244835]
We address the problem of verifying neural networks against geometric transformations of the input image, including rotation, scaling, shearing, and translation.
The proposed method computes provably sound piecewise linear constraints for the pixel values by using sampling and linear approximations in combination with branch-and-bound Lipschitz optimisation.
We show that our proposed implementation resolves up to 32% more verification cases than existing approaches.
arXiv Detail & Related papers (2024-08-23T15:02:09Z)
- An Optimal Transport Approach for Network Regression [0.6238182916866519]
We build upon recent developments in generalized regression models on metric spaces based on Fréchet means.
We propose a network regression method using the Wasserstein metric.
arXiv Detail & Related papers (2024-06-18T02:03:07Z)
- A Structure-Preserving Kernel Method for Learning Hamiltonian Systems [3.594638299627404]
A structure-preserving kernel ridge regression method is presented that allows the recovery of potentially high-dimensional and nonlinear Hamiltonian functions.
The paper extends kernel regression methods to problems in which loss functions involving linear functions of gradients are required.
A full error analysis is conducted that provides convergence rates using fixed and adaptive regularization parameters.
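The key step in this kind of structure-preserving kernel regression is fitting a scalar function from observations of linear functionals of its gradient, which stays a linear least-squares problem in the kernel expansion coefficients. Below is a minimal sketch of that step with an RBF kernel and plain gradient observations; the paper's symplectic structure, loss construction, and error analysis are not reproduced, and all names and parameters are illustrative.

```python
import numpy as np

def rbf(x, xp, ell=1.0):
    return np.exp(-np.sum((x - xp) ** 2) / (2 * ell ** 2))

def grad_rbf(x, xp, ell=1.0):
    # Gradient of the RBF kernel with respect to its first argument.
    return -(x - xp) / ell ** 2 * rbf(x, xp, ell)

def fit_scalar_from_gradients(X, G, lam=1e-6, ell=1.0):
    """Ridge fit of H(x) = sum_i alpha_i k(x, X_i) from gradient observations G.

    X: (n, d) training points, G: (n, d) observed gradients of H at X.
    Solves the stacked linear least-squares problem in alpha with Tikhonov
    regularisation, so that grad H(X_j) ~= sum_i alpha_i grad_x k(X_j, X_i).
    """
    n, d = X.shape
    Phi = np.zeros((n * d, n))                 # stacked gradient "features"
    for j in range(n):
        for i in range(n):
            Phi[j * d:(j + 1) * d, i] = grad_rbf(X[j], X[i], ell)
    alpha = np.linalg.solve(Phi.T @ Phi + lam * np.eye(n), Phi.T @ G.reshape(-1))
    return lambda x: sum(a * rbf(x, xi, ell) for a, xi in zip(alpha, X))

# Toy example: H(q, p) = 0.5 * (q^2 + p^2), so grad H = (q, p).
X = np.random.randn(40, 2)
G = X.copy()
H = fit_scalar_from_gradients(X, G)
print(H(np.array([1.0, 0.0])))   # matches H = 0.5*(q^2 + p^2) up to an additive constant
```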
arXiv Detail & Related papers (2024-03-15T07:20:21Z)
- Bayesian Approach to Linear Bayesian Networks [3.8711489380602804]
The proposed approach iteratively estimates each element of the topological ordering, working backward from the last node, together with its parent, using the inverse of a partial covariance matrix.
The proposed method is demonstrated to outperform state-of-the-art frequentist approaches, such as the BHLSM, LISTEN, and TD algorithms, on synthetic data.
arXiv Detail & Related papers (2023-11-27T08:10:53Z)
- Stochastic Gradient Descent for Gaussian Processes Done Right [86.83678041846971]
We show that when done right -- by which we mean using specific insights from optimisation and kernel communities -- gradient descent is highly effective.
We introduce a stochastic dual descent algorithm, explain its design in an intuitive manner and illustrate the design choices.
Our method places Gaussian process regression on par with state-of-the-art graph neural networks for molecular binding affinity prediction.
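The underlying computation is solving the regularised kernel system (K + s^2 I) a = y through the dual quadratic objective rather than by direct factorisation. The sketch below uses a simplified randomised coordinate-descent variant of that dual solve; the paper's actual algorithm adds momentum, iterate averaging, and carefully chosen step sizes, which are not reproduced here, and all names are illustrative.

```python
import numpy as np

def rbf_kernel(A, B, ell=1.0):
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-d2 / (2 * ell**2))

def dual_coordinate_descent(K, y, noise=0.5, steps=20000, seed=0):
    """Solve (K + noise^2 I) a = y by random exact coordinate minimisation of the
    dual objective 0.5 * a'(K + noise^2 I)a - y'a.

    The minimiser a gives the GP posterior mean k(x*, X) @ a at a test point x*.
    """
    rng = np.random.default_rng(seed)
    A = K + noise**2 * np.eye(len(y))
    a = np.zeros(len(y))
    for _ in range(steps):
        i = rng.integers(len(y))
        a[i] -= (A[i] @ a - y[i]) / A[i, i]   # exact minimisation over coordinate i
    return a

# Toy 1-D check against the closed-form GP weights.
X = np.linspace(-3, 3, 200)[:, None]
y = np.sin(X[:, 0]) + 0.1 * np.random.default_rng(1).normal(size=200)
K = rbf_kernel(X, X)
a_iter = dual_coordinate_descent(K, y)
a_exact = np.linalg.solve(K + 0.25 * np.eye(200), y)
print(np.max(np.abs(a_iter - a_exact)))        # the two solutions should be close
```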
arXiv Detail & Related papers (2023-10-31T16:15:13Z)
- Sampling-free Variational Inference for Neural Networks with Multiplicative Activation Noise [51.080620762639434]
We propose a more efficient parameterization of the posterior approximation for sampling-free variational inference.
Our approach yields competitive results for standard regression problems and scales well to large-scale image classification tasks.
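A generic sampling-free step of this kind propagates means and variances in closed form through a layer whose activations carry independent multiplicative Gaussian noise. The sketch below shows such a moment-propagation step under a diagonal-covariance (mean-field) assumption; it is not the paper's exact parameterization, and the names and the noise scale alpha are illustrative.

```python
import numpy as np

def propagate_linear_with_mult_noise(m, v, W, b, alpha=0.1):
    """Sampling-free moment propagation through h_out = W (h * eps) + b,
    with independent multiplicative noise eps ~ N(1, alpha) on each activation.

    m, v : mean and (diagonal) variance of the incoming activations.
    Returns the mean and diagonal variance of the next layer's pre-activations.
    """
    m_noisy = m                                  # E[h * eps] = E[h] since E[eps] = 1
    v_noisy = v * (1 + alpha) + alpha * m**2     # Var[h * eps] for independent h, eps
    m_out = W @ m_noisy + b
    v_out = (W**2) @ v_noisy                     # diagonal-covariance (mean-field) assumption
    return m_out, v_out

# Toy example: propagate a 3-unit activation through a 2-unit linear layer.
rng = np.random.default_rng(0)
W, b = rng.normal(size=(2, 3)), np.zeros(2)
m_out, v_out = propagate_linear_with_mult_noise(np.ones(3), 0.1 * np.ones(3), W, b)
print(m_out, v_out)
```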
arXiv Detail & Related papers (2021-03-15T16:16:18Z)
- Pathwise Conditioning of Gaussian Processes [72.61885354624604]
Conventional approaches for simulating Gaussian process posteriors view samples as draws from marginal distributions of process values at finite sets of input locations.
This distribution-centric characterization leads to generative strategies that scale cubically in the size of the desired random vector.
We show how this pathwise interpretation of conditioning gives rise to a general family of approximations that lend themselves to efficiently sampling Gaussian process posteriors.
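The pathwise view conditions a sample rather than a distribution, via Matheron's rule: a posterior draw is a prior draw plus a data-dependent update. The sketch below, with illustrative names, shows the exact (still cubic) version of that rule; the efficiency gains discussed in the paper come from approximating the prior draw (e.g., with random features) and the update (e.g., with inducing points), which are omitted here.

```python
import numpy as np

def rbf_kernel(A, B, ell=1.0):
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-d2 / (2 * ell**2))

def pathwise_posterior_sample(X, y, Xs, noise=0.1, seed=0):
    """Draw one GP posterior sample at test points Xs via Matheron's rule:
    (f | y)(x) = f(x) + k(x, X) (K + noise^2 I)^{-1} (y - f(X) - eps),
    where f is a prior sample and eps is a draw of the observation noise.
    """
    rng = np.random.default_rng(seed)
    Xall = np.vstack([X, Xs])
    Kall = rbf_kernel(Xall, Xall) + 1e-6 * np.eye(len(Xall))
    f_all = np.linalg.cholesky(Kall) @ rng.normal(size=len(Xall))   # joint prior draw
    f_train, f_test = f_all[:len(X)], f_all[len(X):]
    eps = noise * rng.normal(size=len(X))
    K = rbf_kernel(X, X) + noise**2 * np.eye(len(X))
    update = rbf_kernel(Xs, X) @ np.linalg.solve(K, y - f_train - eps)
    return f_test + update                                          # pathwise-conditioned sample

# Toy example: condition a prior draw on five noisy observations of sin(x).
X = np.linspace(-2, 2, 5)[:, None]
y = np.sin(X[:, 0])
Xs = np.linspace(-3, 3, 50)[:, None]
print(pathwise_posterior_sample(X, y, Xs)[:5])
```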
arXiv Detail & Related papers (2020-11-08T17:09:37Z)
- Improving predictions of Bayesian neural nets via local linearization [79.21517734364093]
We argue that the Gauss-Newton approximation should be understood as a local linearization of the underlying Bayesian neural network (BNN).
Because we use this linearized model for posterior inference, we should also predict using this modified model instead of the original one.
We refer to this modified predictive as "GLM predictive" and show that it effectively resolves common underfitting problems of the Laplace approximation.
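Concretely, the linearized model treats the network as affine in the parameters around the MAP estimate, so a Gaussian (Laplace) posterior on the parameters induces a Gaussian on the outputs. A minimal sketch of those predictive moments is below; the Jacobian and posterior covariance are illustrative placeholders rather than quantities computed by the paper's pipeline.

```python
import numpy as np

def glm_predictive_moments(f_map, J, Sigma):
    """Moments of the linearized ('GLM') predictive for a Laplace posterior.

    Linearizing the network around the MAP estimate,
        f(x, theta) ~= f(x, theta_map) + J(x) (theta - theta_map),
    with theta ~ N(theta_map, Sigma) gives a Gaussian on the outputs:
    mean f(x, theta_map) and covariance J Sigma J'.
    """
    return f_map, J @ Sigma @ J.T

# Toy example: 2 outputs, 4 parameters, illustrative Jacobian and posterior covariance.
rng = np.random.default_rng(0)
J = rng.normal(size=(2, 4))                  # Jacobian of outputs w.r.t. parameters at x
Sigma = np.diag(rng.uniform(0.1, 0.5, 4))    # Laplace posterior covariance (diagonal here)
f_map = np.array([0.3, -1.2])                # network output at the MAP parameters
mean, cov = glm_predictive_moments(f_map, J, Sigma)
print(mean, cov)
```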
arXiv Detail & Related papers (2020-08-19T12:35:55Z)
- Disentangling the Gauss-Newton Method and Approximate Inference for Neural Networks [96.87076679064499]
We disentangle the generalized Gauss-Newton and approximate inference for Bayesian deep learning.
We find that the Gauss-Newton method simplifies the underlying probabilistic model significantly.
The connection to Gaussian processes enables new function-space inference algorithms.
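For reference, the generalized Gauss-Newton (GGN) matrix replaces the network's own curvature with the loss curvature in output space pulled back through per-example Jacobians, G = sum_n J_n' H_n J_n, which is what makes the implied probabilistic model locally linear-Gaussian. A minimal sketch, with illustrative shapes and values:

```python
import numpy as np

def generalized_gauss_newton(jacobians, output_hessians):
    """Assemble the GGN matrix  G = sum_n J_n' H_n J_n,  where J_n is the Jacobian of
    the network outputs w.r.t. the parameters at example n, and H_n is the Hessian of
    the loss w.r.t. the outputs (e.g. diag(p) - p p' for softmax cross-entropy).
    The result is always positive semi-definite, unlike the full Hessian.
    """
    return sum(J.T @ H @ J for J, H in zip(jacobians, output_hessians))

# Toy example: 3 examples, 2 outputs, 5 parameters, illustrative values.
rng = np.random.default_rng(0)
Js = [rng.normal(size=(2, 5)) for _ in range(3)]
ps = [np.array([0.7, 0.3]), np.array([0.2, 0.8]), np.array([0.5, 0.5])]
Hs = [np.diag(p) - np.outer(p, p) for p in ps]    # softmax cross-entropy output Hessians
G = generalized_gauss_newton(Js, Hs)
print(G.shape, np.all(np.linalg.eigvalsh(G) >= -1e-10))
```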
arXiv Detail & Related papers (2020-07-21T17:42:58Z)
- Mean-Field Approximation to Gaussian-Softmax Integral with Application to Uncertainty Estimation [23.38076756988258]
We propose a new single-model based approach to quantify uncertainty in deep neural networks.
We use a mean-field approximation formula to compute an analytically intractable integral.
Empirically, the proposed approach performs competitively when compared to state-of-the-art methods.
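A representative mean-field formula of this kind rescales each logit by its own uncertainty before applying the softmax, in the spirit of the classic probit approximation. The sketch below uses that generic form with lam = pi/8 and compares it to a Monte Carlo estimate; it is an illustration of the approach, not necessarily the paper's exact formula, and the names are illustrative.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def gaussian_softmax_mean_field(mu, var, lam=np.pi / 8):
    """Approximate E_{z ~ N(mu, diag(var))}[softmax(z)] without sampling by scaling
    each logit by its own uncertainty (probit-style mean-field formula):
        E[softmax(z)] ~= softmax(mu / sqrt(1 + lam * var))
    """
    return softmax(mu / np.sqrt(1.0 + lam * var))

# Toy comparison against a Monte Carlo estimate of the same integral.
rng = np.random.default_rng(0)
mu, var = np.array([2.0, 0.5, -1.0]), np.array([1.0, 4.0, 0.25])
mc = np.mean([softmax(mu + np.sqrt(var) * rng.normal(size=3)) for _ in range(20000)], 0)
print(gaussian_softmax_mean_field(mu, var), mc)
```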
arXiv Detail & Related papers (2020-06-13T07:32:38Z)
- Distributed Stochastic Nonconvex Optimization and Learning based on Successive Convex Approximation [26.11677569331688]
We introduce a novel framework for the distributed algorithmic minimization of the sum of the agents' local cost functions in a network.
We show that the proposed method can be applied to distributed neural networks.
arXiv Detail & Related papers (2020-04-30T15:36:46Z)