Bayesian Perceptron: Towards fully Bayesian Neural Networks
- URL: http://arxiv.org/abs/2009.01730v2
- Date: Thu, 10 Sep 2020 05:02:40 GMT
- Title: Bayesian Perceptron: Towards fully Bayesian Neural Networks
- Authors: Marco F. Huber
- Abstract summary: Training and predictions of a perceptron are performed within the Bayesian inference framework in closed-form.
The weights and the predictions of the perceptron are considered Gaussian random variables.
This approach requires no computationally expensive gradient calculations and further allows sequential learning.
- Score: 5.5510642465908715
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Artificial neural networks (NNs) have become the de facto standard in machine
learning. They allow learning highly nonlinear transformations in a plethora of
applications. However, NNs usually only provide point estimates without
systematically quantifying corresponding uncertainties. In this paper a novel
approach towards fully Bayesian NNs is proposed, where training and predictions
of a perceptron are performed within the Bayesian inference framework in
closed-form. The weights and the predictions of the perceptron are considered
Gaussian random variables. Analytical expressions for predicting the
perceptron's output and for learning the weights are provided for commonly used
activation functions like sigmoid or ReLU. This approach requires no
computationally expensive gradient calculations and further allows sequential
learning.
Related papers
- Linearization Turns Neural Operators into Function-Valued Gaussian Processes [23.85470417458593]
We introduce LUNO, a novel framework for approximate Bayesian uncertainty quantification in trained neural operators.
Our approach leverages model linearization to push (Gaussian) weight-space uncertainty forward to the neural operator's predictions.
We show that this can be interpreted as a probabilistic version of the concept of currying from functional programming, yielding a function-valued (Gaussian) random process belief.
arXiv Detail & Related papers (2024-06-07T16:43:54Z) - Approximation with Random Shallow ReLU Networks with Applications to Model Reference Adaptive Control [0.0]
We show that ReLU networks with randomly generated weights and biases achieve $L_infty$ error of $O(m-1/2)$ with high probability.
We show how the result can be used to get approximations of required accuracy in a model reference adaptive control application.
arXiv Detail & Related papers (2024-03-25T19:39:17Z) - Improved uncertainty quantification for neural networks with Bayesian
last layer [0.0]
Uncertainty quantification is an important task in machine learning.
We present a reformulation of the log-marginal likelihood of a NN with BLL which allows for efficient training using backpropagation.
arXiv Detail & Related papers (2023-02-21T20:23:56Z) - Gradient Descent in Neural Networks as Sequential Learning in RKBS [63.011641517977644]
We construct an exact power-series representation of the neural network in a finite neighborhood of the initial weights.
We prove that, regardless of width, the training sequence produced by gradient descent can be exactly replicated by regularized sequential learning.
arXiv Detail & Related papers (2023-02-01T03:18:07Z) - NUQ: Nonparametric Uncertainty Quantification for Deterministic Neural
Networks [151.03112356092575]
We show the principled way to measure the uncertainty of predictions for a classifier based on Nadaraya-Watson's nonparametric estimate of the conditional label distribution.
We demonstrate the strong performance of the method in uncertainty estimation tasks on a variety of real-world image datasets.
arXiv Detail & Related papers (2022-02-07T12:30:45Z) - Kalman Bayesian Neural Networks for Closed-form Online Learning [5.220940151628734]
We propose a novel approach for BNN learning via closed-form Bayesian inference.
The calculation of the predictive distribution of the output and the update of the weight distribution are treated as Bayesian filtering and smoothing problems.
This allows closed-form expressions for training the network's parameters in a sequential/online fashion without gradient descent.
arXiv Detail & Related papers (2021-10-03T07:29:57Z) - Sampling-free Variational Inference for Neural Networks with
Multiplicative Activation Noise [51.080620762639434]
We propose a more efficient parameterization of the posterior approximation for sampling-free variational inference.
Our approach yields competitive results for standard regression problems and scales well to large-scale image classification tasks.
arXiv Detail & Related papers (2021-03-15T16:16:18Z) - Learnable Uncertainty under Laplace Approximations [65.24701908364383]
We develop a formalism to explicitly "train" the uncertainty in a decoupled way to the prediction itself.
We show that such units can be trained via an uncertainty-aware objective, improving standard Laplace approximations' performance.
arXiv Detail & Related papers (2020-10-06T13:43:33Z) - How Neural Networks Extrapolate: From Feedforward to Graph Neural
Networks [80.55378250013496]
We study how neural networks trained by gradient descent extrapolate what they learn outside the support of the training distribution.
Graph Neural Networks (GNNs) have shown some success in more complex tasks.
arXiv Detail & Related papers (2020-09-24T17:48:59Z) - Improving predictions of Bayesian neural nets via local linearization [79.21517734364093]
We argue that the Gauss-Newton approximation should be understood as a local linearization of the underlying Bayesian neural network (BNN)
Because we use this linearized model for posterior inference, we should also predict using this modified model instead of the original one.
We refer to this modified predictive as "GLM predictive" and show that it effectively resolves common underfitting problems of the Laplace approximation.
arXiv Detail & Related papers (2020-08-19T12:35:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.