Related papers: Variational EP with Probabilistic Backpropagation for Bayesian Neural Networks

URL: http://arxiv.org/abs/2303.01540v1
Date: Thu, 2 Mar 2023 19:09:47 GMT
Title: Variational EP with Probabilistic Backpropagation for Bayesian Neural Networks
Authors: Kehinde Olobatuyi
Abstract summary: I propose a novel approach for nonlinear logistic regression using a two-layer neural network (NN) model structure with hierarchical priors on the network weights. I derive a computationally efficient algorithm whose complexity scales similarly to an ensemble of independent sparse logistic models.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: I propose a novel approach for nonlinear logistic regression using a two-layer neural network (NN) model structure with hierarchical priors on the network weights. I present a hybrid of expectation propagation, called the Variational Expectation Propagation (VEP) approach, for approximate integration over the posterior distribution of the weights, the hierarchical scale parameters of the priors, and zeta. Using a factorized posterior approximation, I derive a computationally efficient algorithm whose complexity scales similarly to an ensemble of independent sparse logistic models. The approach can be extended beyond standard activation functions and NN model structures to form flexible nonlinear binary predictors from multiple sparse linear models. I consider a hierarchical Bayesian model with a logistic regression likelihood and a Gaussian prior distribution over the parameters called weights and hyperparameters. I work from the perspective of an E step and an M step, which respectively compute the approximating posterior and update the parameters using the computed posterior.

Related papers

Neural Optimal Transport Meets Multivariate Conformal Prediction [58.43397908730771]
We propose a framework for conditional vector quantile regression (CVQR). CVQR combines neural optimal transport with quantized optimization, and applies it to predictions.
arXiv Detail & Related papers (2025-09-29T19:50:19Z)
Optimal Condition for Initialization Variance in Deep Neural Networks: An SGD Dynamics Perspective [0.0]
Stochastic gradient descent (SGD) is one of the most fundamental optimization algorithms in machine learning (ML). We study the relationship between the quasi-stationary distribution derived from this equation and the initial distribution through the Kullback-Leibler (KL) divergence. We experimentally confirm our theoretical results by using the classical SGD to train fully connected neural networks on the MNIST and Fashion-MNIST datasets.
arXiv Detail & Related papers (2025-08-18T11:18:12Z)
Posterior and variational inference for deep neural networks with heavy-tailed weights [0.0]
We consider deep neural networks in a Bayesian framework with a prior distribution sampling the network weights at random. We show that the corresponding posterior distribution achieves near-optimal minimax contraction rates. We also provide variational Bayes counterparts of the results, which show that mean-field variational approximations still benefit from near-optimal theoretical support.
arXiv Detail & Related papers (2024-06-05T15:24:20Z)
A variational neural Bayes framework for inference on intractable posterior distributions [1.0801976288811024]
Posterior distributions of model parameters are efficiently obtained by feeding observed data into a trained neural network. We show theoretically that our posteriors converge to the true posteriors in Kullback-Leibler divergence.
arXiv Detail & Related papers (2024-04-16T20:40:15Z)
SPDE priors for uncertainty quantification of end-to-end neural data assimilation schemes [4.213142548113385]
Recent advances in the deep learning community make it possible to address this problem as a neural architecture embedding a data assimilation variational framework. In this work, we draw from SPDE-based processes to estimate prior models able to handle non-stationary covariances in both space and time. Our neural variational scheme is modified to embed an augmented state formulation with both state and SPDE parametrization to estimate.
arXiv Detail & Related papers (2024-02-02T19:18:12Z)
A Bayesian Take on Gaussian Process Networks [1.7188280334580197]
This work implements Monte Carlo and Markov Chain Monte Carlo methods to sample from the posterior distribution of network structures. We show that our method outperforms state-of-the-art algorithms in recovering the graphical structure of the network.
arXiv Detail & Related papers (2023-06-20T08:38:31Z)
Joint Bayesian Inference of Graphical Structure and Parameters with a Single Generative Flow Network [59.79008107609297]
We propose in this paper to approximate the joint posterior over the structure of a Bayesian Network. We use a single GFlowNet whose sampling policy follows a two-phase process. Since the parameters are included in the posterior distribution, this leaves more flexibility for the local probability models.
arXiv Detail & Related papers (2023-05-30T19:16:44Z)
Prior Density Learning in Variational Bayesian Phylogenetic Parameters Inference [1.03590082373586]
We propose an approach to relax the rigidity of the prior densities by learning their parameters using a gradient-based method and a neural network-based parameterization. The results of performed simulations show that the approach is powerful in estimating branch lengths and evolutionary model parameters.
arXiv Detail & Related papers (2023-02-06T01:29:15Z)
Variational Laplace Autoencoders [53.08170674326728]
Variational autoencoders employ an amortized inference model to approximate the posterior of latent variables. We present a novel approach that addresses the limited posterior expressiveness of fully-factorized Gaussian assumption. We also present a general framework named Variational Laplace Autoencoders (VLAEs) for training deep generative models.
arXiv Detail & Related papers (2022-11-30T18:59:27Z)
Sparse high-dimensional linear regression with a partitioned empirical Bayes ECM algorithm [62.997667081978825]
We propose a computationally efficient and powerful Bayesian approach for sparse high-dimensional linear regression. Minimal prior assumptions on the parameters are made by using plug-in empirical Bayes estimates. The proposed approach is implemented in the R package probe.
arXiv Detail & Related papers (2022-09-16T19:15:50Z)
Bayesian Neural Network Inference via Implicit Models and the Posterior Predictive Distribution [0.8122270502556371]
We propose a novel approach to perform approximate Bayesian inference in complex models such as Bayesian neural networks. The approach is more scalable to large data than Markov Chain Monte Carlo. We see this being useful in applications such as surrogate and physics-based models.
arXiv Detail & Related papers (2022-09-06T02:43:19Z)
Unfolding Projection-free SDP Relaxation of Binary Graph Classifier via GDPA Linearization [59.87663954467815]
Algorithm unfolding creates an interpretable and parsimonious neural network architecture by implementing each iteration of a model-based algorithm as a neural layer. In this paper, leveraging a recent linear algebraic theorem called Gershgorin disc perfect alignment (GDPA), we unroll a projection-free algorithm for semi-definite programming relaxation (SDR) of a binary graph classifier. Experimental results show that our unrolled network outperformed pure model-based graph classifiers, and achieved comparable performance to pure data-driven networks but using far fewer parameters.
arXiv Detail & Related papers (2021-09-10T07:01:15Z)
Estimation of Switched Markov Polynomial NARX models [75.91002178647165]
We identify a class of models for hybrid dynamical systems characterized by nonlinear autoregressive (NARX) components. The proposed approach is demonstrated on a SMNARX problem composed of three nonlinear sub-models with specific regressors.
arXiv Detail & Related papers (2020-09-29T15:00:47Z)
Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs). We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these neural networks using gradient descent. For the first time we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z)

This list is automatically generated from the titles and abstracts of the papers on this site.