Approximate blocked Gibbs sampling for Bayesian neural networks
- URL: http://arxiv.org/abs/2208.11389v3
- Date: Mon, 24 Jul 2023 15:28:34 GMT
- Title: Approximate blocked Gibbs sampling for Bayesian neural networks
- Authors: Theodore Papamarkou
- Abstract summary: In this work, it is proposed to sample subgroups of parameters via a blocked Gibbs sampling scheme.
It is also possible to alleviate vanishing acceptance rates for increasing depth by reducing the proposal variance in deeper layers.
An open problem is how to perform minibatch MCMC sampling for feedforward neural networks in the presence of augmented data.
- Score: 1.7259824817932292
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, minibatch MCMC sampling for feedforward neural networks is made
more feasible. To this end, it is proposed to sample subgroups of parameters
via a blocked Gibbs sampling scheme. By partitioning the parameter space,
sampling is possible irrespective of layer width. It is also possible to
alleviate vanishing acceptance rates for increasing depth by reducing the
proposal variance in deeper layers. Increasing the length of a non-convergent
chain increases the predictive accuracy in classification tasks, so avoiding
vanishing acceptance rates and consequently enabling longer chain runs have
practical benefits. Moreover, non-convergent chain realizations aid in the
quantification of predictive uncertainty. An open problem is how to perform
minibatch MCMC sampling for feedforward neural networks in the presence of
augmented data.
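To make the proposed scheme concrete, below is a minimal sketch of Metropolis-within-Gibbs sampling over layer-wise parameter blocks, with the random-walk proposal scale shrunk in deeper layers as the abstract suggests. The small tanh MLP, the full-data log posterior, and the geometric `depth_decay` schedule are illustrative assumptions, not the paper's exact construction (in particular, the paper targets minibatch settings, which this sketch omits).

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_logits(params, X):
    # forward pass of a small MLP; params is a list of (W, b) pairs, one per layer
    h = X
    for W, b in params[:-1]:
        h = np.tanh(h @ W + b)
    W, b = params[-1]
    return h @ W + b

def log_posterior(params, X, y, prior_var=1.0):
    # Gaussian prior over all weights plus a categorical (softmax) likelihood
    logits = mlp_logits(params, X)
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    log_lik = log_probs[np.arange(len(y)), y].sum()
    log_prior = sum(-(W ** 2).sum() - (b ** 2).sum() for W, b in params) / (2 * prior_var)
    return log_lik + log_prior

def blocked_gibbs_sweep(params, X, y, base_step=0.05, depth_decay=0.5):
    # one sweep: a Metropolis-within-Gibbs update of each layer block in turn,
    # with the proposal std shrunk geometrically for deeper blocks to keep
    # acceptance rates from vanishing with depth
    lp = log_posterior(params, X, y)
    for l, (W, b) in enumerate(params):
        step = base_step * depth_decay ** l
        proposal = list(params)
        proposal[l] = (W + step * rng.standard_normal(W.shape),
                       b + step * rng.standard_normal(b.shape))
        lp_new = log_posterior(proposal, X, y)
        if np.log(rng.uniform()) < lp_new - lp:   # symmetric proposal: plain MH ratio
            params, lp = proposal, lp_new
    return params
```

Running many such sweeps and averaging the softmax predictions over the visited parameter states yields the predictive estimates whose accuracy, per the abstract, improves with chain length even before convergence.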
Related papers
- Collapsed Inference for Bayesian Deep Learning [36.1725075097107]
We introduce a novel collapsed inference scheme that performs Bayesian model averaging using collapsed samples.
A collapsed sample represents uncountably many models drawn from the approximate posterior.
Our proposed use of collapsed samples achieves a balance between scalability and accuracy.
arXiv Detail & Related papers (2023-06-16T08:34:42Z)
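As a toy illustration of why collapsing helps (an analogy, not the paper's actual scheme), the Rao-Blackwellized estimator below integrates one variable out analytically, so each retained sample stands in for an entire conditional distribution, i.e. uncountably many draws:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

x = rng.standard_normal(n)           # samples of the kept variable
y = x + rng.standard_normal(n)       # the naive approach also samples y | x ~ N(x, 1)

naive = (y ** 2).mean()              # plain Monte Carlo estimate of E[y^2]
collapsed = (x ** 2 + 1).mean()      # y integrated out analytically: E[y^2 | x] = x^2 + 1

print(naive, collapsed)              # both approach 2; the collapsed estimate has lower variance
```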
arXiv Detail & Related papers (2023-06-16T08:34:42Z) - Provably Convergent Subgraph-wise Sampling for Fast GNN Training [63.530816506578674]
We propose a novel subgraph-wise sampling method with a convergence guarantee, namely Local Message Compensation (LMC).
LMC retrieves the messages discarded during backward passes, based on a message-passing formulation of the backward pass.
Experiments on large-scale benchmarks demonstrate that LMC is significantly faster than state-of-the-art subgraph-wise sampling methods.
arXiv Detail & Related papers (2023-03-17T05:16:49Z)
- GFlowOut: Dropout with Generative Flow Networks [76.59535235717631]
Monte Carlo Dropout has been widely used as a relatively cheap way to perform approximate inference.
Recent works show that the dropout mask can be viewed as a latent variable, which can be inferred with variational inference.
GFlowOut leverages the recently proposed probabilistic framework of Generative Flow Networks (GFlowNets) to learn the posterior distribution over dropout masks.
arXiv Detail & Related papers (2022-10-24T03:00:01Z)
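For reference, the Monte Carlo Dropout baseline that GFlowOut builds on amounts to keeping the masks stochastic at test time and averaging predictions over them. A minimal numpy sketch (the two-layer ReLU network is assumed for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)

def forward(x, W1, W2, p=0.5):
    # one hidden ReLU layer with a freshly sampled dropout mask on the hidden units
    h = np.maximum(x @ W1, 0.0)
    mask = rng.random(h.shape) < (1.0 - p)   # the mask is the latent variable
    return (h * mask / (1.0 - p)) @ W2       # inverted-dropout scaling

def mc_dropout_predict(x, W1, W2, T=100):
    # average over T sampled masks; the spread approximates predictive uncertainty
    preds = np.stack([forward(x, W1, W2) for _ in range(T)])
    return preds.mean(axis=0), preds.std(axis=0)
```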
- Data Subsampling for Bayesian Neural Networks [0.0]
Penalty Bayesian Neural Networks (PBNNs) achieve good predictive performance for a given mini-batch size.
Varying the size of the mini-batches enables a natural calibration of the predictive distribution.
We expect PBNN to be particularly suited for cases when data sets are distributed across multiple decentralized devices.
arXiv Detail & Related papers (2022-10-17T14:43:35Z)
- On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias [50.84569563188485]
We show that gradient flow converges in direction when labels are determined by the sign of a target network with $r$ neurons.
Our result may already hold for mild over-parameterization, where the width is $\tilde{\mathcal{O}}(r)$ and independent of the sample size.
arXiv Detail & Related papers (2022-05-18T16:57:10Z)
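The quantity in question can be computed directly for a shallow univariate network f(x) = sum_i v_i relu(w_i x + b_i): breakpoints sit at x = -b_i / w_i, and a breakpoint only contributes a new region when the slope actually changes across it. A small sketch (the random test network is illustrative):

```python
import numpy as np

def slope_at(x, w, b, v):
    # analytic slope of f away from breakpoints:
    # each active neuron (w_i * x + b_i > 0) contributes v_i * w_i
    return float((v * w * ((w * x + b) > 0)).sum())

def effective_linear_regions(w, b, v, tol=1e-9):
    # count the linear pieces f actually realizes; adjacent intervals whose
    # slopes coincide merge, so the count can be far below width + 1
    active = np.abs(w) > tol
    knots = np.sort(np.unique(-b[active] / w[active]))
    if len(knots) == 0:
        return 1
    probes = np.concatenate([[knots[0] - 1.0],
                             (knots[:-1] + knots[1:]) / 2.0,
                             [knots[-1] + 1.0]])
    slopes = [slope_at(x, w, b, v) for x in probes]
    return 1 + sum(abs(s1 - s0) > tol for s0, s1 in zip(slopes, slopes[1:]))

rng = np.random.default_rng(3)
n = 50
w, b, v = rng.standard_normal(n), rng.standard_normal(n), rng.standard_normal(n)
print(effective_linear_regions(w, b, v))   # at most n + 1, often fewer after merging
```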
- Sampling-free Variational Inference for Neural Networks with Multiplicative Activation Noise [51.080620762639434]
We propose a more efficient parameterization of the posterior approximation for sampling-free variational inference.
Our approach yields competitive results for standard regression problems and scales well to large-scale image classification tasks.
arXiv Detail & Related papers (2021-03-15T16:16:18Z)
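A sampling-free update of this flavor can be sketched via moment matching: with multiplicative noise eps ~ N(1, alpha) on activations whose elementwise mean is m and variance is s, z = h * eps has E[z] = m and Var[z] = (1 + alpha) s + alpha m^2, which a linear layer then propagates in closed form. The independence assumptions and the `alpha` value below are illustrative, not the paper's exact parameterization:

```python
import numpy as np

def noisy_linear_moments(m, s, W, b, alpha=0.5):
    # z = h * eps with eps ~ N(1, alpha):  E[z] = m,  Var[z] = (1 + alpha)*s + alpha*m^2
    var_z = (1.0 + alpha) * s + alpha * m ** 2
    mean_out = m @ W + b                 # means pass through the linear layer directly
    var_out = var_z @ (W ** 2)           # assuming independent units, variances add
    return mean_out, var_out
```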
- Provable Generalization of SGD-trained Neural Networks of Any Width in the Presence of Adversarial Label Noise [85.59576523297568]
We consider a one-hidden-layer leaky ReLU network of arbitrary width trained by gradient descent.
We prove that SGD produces neural networks that have classification accuracy competitive with that of the best halfspace over the distribution.
arXiv Detail & Related papers (2021-01-04T18:32:49Z)
- Salvage Reusable Samples from Noisy Data for Robust Learning [70.48919625304]
We propose a reusable sample selection and correction approach, termed as CRSSC, for coping with label noise in training deep FG models with web images.
Our key idea is to additionally identify and correct reusable samples, and then leverage them together with clean examples to update the networks.
arXiv Detail & Related papers (2020-08-06T02:07:21Z)
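A common way to instantiate this kind of selection is the small-loss criterion sketched below: treat low-loss samples as clean, relabel confidently predicted higher-loss ones for reuse, and drop the rest. The quantile and confidence thresholds are illustrative placeholders, not CRSSC's actual criteria:

```python
import numpy as np

def select_and_correct(losses, probs, clean_q=0.5, conf_thresh=0.9):
    # losses: per-sample loss values; probs: the model's softmax outputs per sample
    cut = np.quantile(losses, clean_q)
    clean = losses <= cut                         # small-loss samples treated as clean
    reusable = (~clean) & (probs.max(axis=1) >= conf_thresh)
    pseudo_labels = probs.argmax(axis=1)          # corrected labels for reusable samples
    return clean, reusable, pseudo_labels
```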
- Nonconvex regularization for sparse neural networks [0.0]
A nonconvex regularization method is investigated in the context of shallow ReLU networks.
We show that the approximation guarantees and existing bounds on the network size for finite data are maintained.
arXiv Detail & Related papers (2020-04-24T03:03:21Z)
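To make the setting concrete, a generic nonconvex sparsity penalty on the outer weights of a shallow ReLU network looks as follows; the l_p surrogate with p = 0.5 is a stand-in for illustration, not necessarily the regularizer the paper analyzes:

```python
import numpy as np

def lp_penalty(v, p=0.5, eps=1e-8):
    # nonconvex surrogate: sum_i |v_i|^p with 0 < p < 1 promotes exact sparsity
    # in the outer weights more aggressively than the convex l1 norm
    return ((np.abs(v) + eps) ** p).sum()

def regularized_loss(params, X, y, lam=1e-3, p=0.5):
    W, b, v = params                              # shallow net: f(x) = relu(xW + b) @ v
    pred = np.maximum(X @ W + b, 0.0) @ v
    return ((pred - y) ** 2).mean() + lam * lp_penalty(v, p)
```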