Bayesian Neural Network via Stochastic Gradient Descent
- URL: http://arxiv.org/abs/2006.08453v4
- Date: Mon, 21 Jun 2021 18:10:42 GMT
- Title: Bayesian Neural Network via Stochastic Gradient Descent
- Authors: Abhinav Sagar
- Abstract summary: We show how SGD can be applied to Bayesian neural networks via gradient estimation techniques.
Our work substantially outperforms the previous state-of-the-art approaches for regression using Bayesian neural networks.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The goal of the Bayesian approach used in variational inference is to
minimize the KL divergence between the variational distribution and the unknown
posterior distribution. This is done by maximizing the Evidence Lower Bound
(ELBO). A neural network is used to parametrize these distributions and is
trained with Stochastic Gradient Descent. This work extends prior work by
deriving the variational inference models. We show how SGD can be applied to
Bayesian neural networks via gradient estimation techniques. For validation, we
tested our model on 5 UCI datasets, with Root Mean Square Error (RMSE) and
negative log likelihood as the evaluation metrics. Our work substantially
outperforms the previous state-of-the-art approaches for regression using
Bayesian neural networks.
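The recipe the abstract describes — maximizing the ELBO with plain SGD, using the reparameterization trick as the gradient estimator — can be sketched for a toy Bayesian linear model. This is an illustrative sketch, not the paper's implementation; all hyperparameters and the manually derived gradients are assumptions for a one-weight, one-bias model with a N(0, 1) prior:

```python
import numpy as np

def softplus(x):
    return np.log1p(np.exp(x))

def dsoftplus(x):  # derivative of softplus is the logistic sigmoid
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)

# Synthetic regression data: y = 2x - 1 + noise
X = rng.uniform(-1.0, 1.0, size=200)
y = 2.0 * X - 1.0 + 0.1 * rng.normal(size=200)
noise_var = 0.1 ** 2

# Variational posterior q = N(mu, softplus(rho)^2) per parameter; prior N(0, 1)
mu_w, rho_w = 0.0, -3.0
mu_b, rho_b = 0.0, -3.0
lr = 5e-5

for step in range(3000):
    # Reparameterization trick: w = mu + sigma * eps is differentiable in (mu, rho)
    eps_w, eps_b = rng.normal(), rng.normal()
    sig_w, sig_b = softplus(rho_w), softplus(rho_b)
    w = mu_w + sig_w * eps_w
    b = mu_b + sig_b * eps_b

    # Gradient of the negative log-likelihood w.r.t. the sampled weights
    r = (w * X + b - y) / noise_var
    d_w, d_b = float(np.sum(r * X)), float(np.sum(r))

    # Add the gradient of KL(q || N(0,1)): d/dmu = mu, d/dsigma = sigma - 1/sigma
    g_mu_w = d_w + mu_w
    g_mu_b = d_b + mu_b
    g_rho_w = (d_w * eps_w + sig_w - 1.0 / sig_w) * dsoftplus(rho_w)
    g_rho_b = (d_b * eps_b + sig_b - 1.0 / sig_b) * dsoftplus(rho_b)

    # Plain SGD step on the negative ELBO
    mu_w -= lr * g_mu_w
    mu_b -= lr * g_mu_b
    rho_w -= lr * g_rho_w
    rho_b -= lr * g_rho_b
```

With enough steps the variational means drift toward the data-generating parameters (w ≈ 2, b ≈ -1) while the posterior scales shrink, since the likelihood term dominates the N(0, 1) prior for 200 low-noise observations.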
Related papers
- Amortizing intractable inference in diffusion models for vision, language, and control [89.65631572949702]
This paper studies amortized sampling of the posterior over data, $\mathbf{x} \sim p^{\rm post}(\mathbf{x}) \propto p(\mathbf{x})\, r(\mathbf{x})$, in a model that consists of a diffusion generative model prior $p(\mathbf{x})$ and a black-box constraint or function $r(\mathbf{x})$.
We prove the correctness of a data-free learning objective, relative trajectory balance, for training a diffusion model that samples from this posterior.
arXiv Detail & Related papers (2024-05-31T16:18:46Z) - Kalman Bayesian Neural Networks for Closed-form Online Learning [5.220940151628734]
We propose a novel approach for BNN learning via closed-form Bayesian inference.
The calculation of the predictive distribution of the output and the update of the weight distribution are treated as Bayesian filtering and smoothing problems.
This allows closed-form expressions for training the network's parameters in a sequential/online fashion without gradient descent.
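The filtering idea above can be illustrated in the one case where it is exact: for a single linear weight with Gaussian noise, each observation is a Kalman measurement update on the weight posterior, with no gradient descent. This scalar sketch is only an illustration of the principle, not the paper's method (which handles full nonlinear networks via filtering and smoothing):

```python
# Online learning of y = x * w + v, v ~ N(0, R), via exact Kalman updates.
mu, P = 0.0, 10.0   # Gaussian prior N(mu, P) over the single weight w
R = 0.01            # observation noise variance

data = [(x, 3.0 * x) for x in (0.5, -1.0, 2.0, 0.1)]  # true w = 3, noiseless

for x, y in data:
    K = P * x / (x * x * P + R)   # Kalman gain
    mu = mu + K * (y - x * mu)    # closed-form posterior mean update
    P = (1.0 - K * x) * P         # closed-form posterior variance update
```

After a handful of observations the posterior mean is close to the true weight and the posterior variance has collapsed, all in closed form.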
arXiv Detail & Related papers (2021-10-03T07:29:57Z) - Differentially private training of neural networks with Langevin dynamics for calibrated predictive uncertainty [58.730520380312676]
We show that differentially private gradient descent (DP-SGD) can yield poorly calibrated, overconfident deep learning models.
This represents a serious issue for safety-critical applications, e.g. in medical diagnosis.
arXiv Detail & Related papers (2021-07-09T08:14:45Z) - Sampling-free Variational Inference for Neural Networks with
Multiplicative Activation Noise [51.080620762639434]
We propose a more efficient parameterization of the posterior approximation for sampling-free variational inference.
Our approach yields competitive results for standard regression problems and scales well to large-scale image classification tasks.
arXiv Detail & Related papers (2021-03-15T16:16:18Z) - Variational Laplace for Bayesian neural networks [25.055754094939527]
Variational Laplace exploits a local approximation of the likelihood to estimate the ELBO without the need for sampling the neural-network weights.
We show that early-stopping can be avoided by increasing the learning rate for the variance parameters.
arXiv Detail & Related papers (2021-02-27T14:06:29Z) - Sparsely constrained neural networks for model discovery of PDEs [0.0]
We present a modular framework that determines the sparsity pattern of a deep-learning based surrogate using any sparse regression technique.
We show how a different network architecture and sparsity estimator improve model discovery accuracy and convergence on several benchmark examples.
arXiv Detail & Related papers (2020-11-09T11:02:40Z) - A Bayesian Perspective on Training Speed and Model Selection [51.15664724311443]
We show that a measure of a model's training speed can be used to estimate its marginal likelihood.
We verify our results in model selection tasks for linear models and for the infinite-width limit of deep neural networks.
Our results suggest a promising new direction towards explaining why neural networks trained with gradient descent are biased towards functions that generalize well.
arXiv Detail & Related papers (2020-10-27T17:56:14Z) - Improving predictions of Bayesian neural nets via local linearization [79.21517734364093]
We argue that the Gauss-Newton approximation should be understood as a local linearization of the underlying Bayesian neural network (BNN)
Because we use this linearized model for posterior inference, we should also predict using this modified model instead of the original one.
We refer to this modified predictive as "GLM predictive" and show that it effectively resolves common underfitting problems of the Laplace approximation.
arXiv Detail & Related papers (2020-08-19T12:35:55Z) - Stochastic Bayesian Neural Networks [0.0]
We build on variational inference techniques for Bayesian neural networks using the original Evidence Lower Bound.
We present a Bayesian neural network in which we maximize the Evidence Lower Bound using a new objective function, which we name the Stochastic Evidence Lower Bound.
arXiv Detail & Related papers (2020-08-12T19:48:34Z) - A Bayesian regularization-backpropagation neural network model for
peeling computations [0.0]
The input data is taken from finite element (FE) peeling results.
The neural network is trained with 75% of the FE dataset.
It is shown that the BR-BPNN model in conjunction with k-fold technique has significant potential to estimate the peeling behavior.
arXiv Detail & Related papers (2020-06-29T21:58:43Z) - Unlabelled Data Improves Bayesian Uncertainty Calibration under
Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.