Approximate blocked Gibbs sampling for Bayesian neural networks
- URL: http://arxiv.org/abs/2208.11389v3
- Date: Mon, 24 Jul 2023 15:28:34 GMT
- Title: Approximate blocked Gibbs sampling for Bayesian neural networks
- Authors: Theodore Papamarkou
- Abstract summary: In this work, it is proposed to sample subgroups of parameters via a blocked Gibbs sampling scheme.
It is also possible to alleviate vanishing acceptance rates for increasing depth by reducing the proposal variance in deeper layers.
An open problem is how to perform minibatch MCMC sampling for feedforward neural networks in the presence of augmented data.
- Score: 1.7259824817932292
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, minibatch MCMC sampling for feedforward neural networks is made
more feasible. To this end, it is proposed to sample subgroups of parameters
via a blocked Gibbs sampling scheme. By partitioning the parameter space,
sampling is possible irrespective of layer width. It is also possible to
alleviate vanishing acceptance rates for increasing depth by reducing the
proposal variance in deeper layers. Increasing the length of a non-convergent
chain increases the predictive accuracy in classification tasks, so avoiding
vanishing acceptance rates and consequently enabling longer chain runs have
practical benefits. Moreover, non-convergent chain realizations aid in the
quantification of predictive uncertainty. An open problem is how to perform
minibatch MCMC sampling for feedforward neural networks in the presence of
augmented data.
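To make the proposed scheme concrete, below is a minimal sketch of Metropolis-within-Gibbs sampling over layer-wise parameter blocks, with the random-walk proposal scale shrunk in deeper layers as the abstract suggests. The small tanh MLP, the full-data log posterior, and the geometric `depth_decay` schedule are illustrative assumptions, not the paper's exact construction (in particular, the paper targets minibatch settings, which this sketch omits).

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_logits(params, X):
    # forward pass of a small MLP; params is a list of (W, b) pairs, one per layer
    h = X
    for W, b in params[:-1]:
        h = np.tanh(h @ W + b)
    W, b = params[-1]
    return h @ W + b

def log_posterior(params, X, y, prior_var=1.0):
    # Gaussian prior over all weights plus a categorical (softmax) likelihood
    logits = mlp_logits(params, X)
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    log_lik = log_probs[np.arange(len(y)), y].sum()
    log_prior = sum(-(W ** 2).sum() - (b ** 2).sum() for W, b in params) / (2 * prior_var)
    return log_lik + log_prior

def blocked_gibbs_sweep(params, X, y, base_step=0.05, depth_decay=0.5):
    # one sweep: a Metropolis-within-Gibbs update of each layer block in turn,
    # with the proposal std shrunk geometrically for deeper blocks to keep
    # acceptance rates from vanishing with depth
    lp = log_posterior(params, X, y)
    for l, (W, b) in enumerate(params):
        step = base_step * depth_decay ** l
        proposal = list(params)
        proposal[l] = (W + step * rng.standard_normal(W.shape),
                       b + step * rng.standard_normal(b.shape))
        lp_new = log_posterior(proposal, X, y)
        if np.log(rng.uniform()) < lp_new - lp:   # symmetric proposal: plain MH ratio
            params, lp = proposal, lp_new
    return params
```

Running many such sweeps and averaging the softmax predictions over the visited parameter states yields the predictive estimates whose accuracy, per the abstract, improves with chain length even before convergence.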
Related papers
- Collapsed Inference for Bayesian Deep Learning [36.1725075097107]
We introduce a novel collapsed inference scheme that performs Bayesian model averaging using collapsed samples.
A collapsed sample represents uncountably many models drawn from the approximate posterior.
Our proposed use of collapsed samples achieves a balance between scalability and accuracy.
arXiv Detail & Related papers (2023-06-16T08:34:42Z)
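As a toy illustration of why collapsing helps (an analogy, not the paper's actual scheme), the Rao-Blackwellized estimator below integrates one variable out analytically, so each retained sample stands in for an entire conditional distribution, i.e. uncountably many draws:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

x = rng.standard_normal(n)           # samples of the kept variable
y = x + rng.standard_normal(n)       # the naive approach also samples y | x ~ N(x, 1)

naive = (y ** 2).mean()              # plain Monte Carlo estimate of E[y^2]
collapsed = (x ** 2 + 1).mean()      # y integrated out analytically: E[y^2 | x] = x^2 + 1

print(naive, collapsed)              # both approach 2; the collapsed estimate has lower variance
```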
arXiv Detail & Related papers (2023-06-16T08:34:42Z) - Provably Convergent Subgraph-wise Sampling for Fast GNN Training [63.530816506578674]
We propose a novel subgraph-wise sampling method with a convergence guarantee, namely Local Message Compensation (LMC).
LMC retrieves the messages discarded during backward passes, based on a message-passing formulation of the backward pass.
Experiments on large-scale benchmarks demonstrate that LMC is significantly faster than state-of-the-art subgraph-wise sampling methods.
arXiv Detail & Related papers (2023-03-17T05:16:49Z)
- GFlowOut: Dropout with Generative Flow Networks [76.59535235717631]
Monte Carlo Dropout has been widely used as a relatively cheap way to perform approximate inference.
Recent works show that the dropout mask can be viewed as a latent variable, which can be inferred with variational inference.
GFlowOut leverages the recently proposed probabilistic framework of Generative Flow Networks (GFlowNets) to learn the posterior distribution over dropout masks.
arXiv Detail & Related papers (2022-10-24T03:00:01Z)
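For reference, the Monte Carlo Dropout baseline that GFlowOut builds on amounts to keeping the masks stochastic at test time and averaging predictions over them. A minimal numpy sketch (the two-layer ReLU network is assumed for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)

def forward(x, W1, W2, p=0.5):
    # one hidden ReLU layer with a freshly sampled dropout mask on the hidden units
    h = np.maximum(x @ W1, 0.0)
    mask = rng.random(h.shape) < (1.0 - p)   # the mask is the latent variable
    return (h * mask / (1.0 - p)) @ W2       # inverted-dropout scaling

def mc_dropout_predict(x, W1, W2, T=100):
    # average over T sampled masks; the spread approximates predictive uncertainty
    preds = np.stack([forward(x, W1, W2) for _ in range(T)])
    return preds.mean(axis=0), preds.std(axis=0)
```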
- Data Subsampling for Bayesian Neural Networks [0.0]
Penalty Bayesian Neural Networks (PBNNs) achieve good predictive performance for a given mini-batch size.
Varying the size of the mini-batches enables a natural calibration of the predictive distribution.
We expect PBNN to be particularly suited for cases when data sets are distributed across multiple decentralized devices.
arXiv Detail & Related papers (2022-10-17T14:43:35Z)
- On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias [50.84569563188485]
We show that gradient flow converges in direction when labels are determined by the sign of a target network with $r$ neurons.
Our result may already hold for mild over-parameterization, where the width is $\tilde{\mathcal{O}}(r)$ and independent of the sample size.
arXiv Detail & Related papers (2022-05-18T16:57:10Z)
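The quantity in question can be computed directly for a shallow univariate network f(x) = sum_i v_i relu(w_i x + b_i): breakpoints sit at x = -b_i / w_i, and a breakpoint only contributes a new region when the slope actually changes across it. A small sketch (the random test network is illustrative):

```python
import numpy as np

def slope_at(x, w, b, v):
    # analytic slope of f away from breakpoints:
    # each active neuron (w_i * x + b_i > 0) contributes v_i * w_i
    return float((v * w * ((w * x + b) > 0)).sum())

def effective_linear_regions(w, b, v, tol=1e-9):
    # count the linear pieces f actually realizes; adjacent intervals whose
    # slopes coincide merge, so the count can be far below width + 1
    active = np.abs(w) > tol
    knots = np.sort(np.unique(-b[active] / w[active]))
    if len(knots) == 0:
        return 1
    probes = np.concatenate([[knots[0] - 1.0],
                             (knots[:-1] + knots[1:]) / 2.0,
                             [knots[-1] + 1.0]])
    slopes = [slope_at(x, w, b, v) for x in probes]
    return 1 + sum(abs(s1 - s0) > tol for s0, s1 in zip(slopes, slopes[1:]))

rng = np.random.default_rng(3)
n = 50
w, b, v = rng.standard_normal(n), rng.standard_normal(n), rng.standard_normal(n)
print(effective_linear_regions(w, b, v))   # at most n + 1, often fewer after merging
```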
- Sampling-free Variational Inference for Neural Networks with Multiplicative Activation Noise [51.080620762639434]
We propose a more efficient parameterization of the posterior approximation for sampling-free variational inference.
Our approach yields competitive results for standard regression problems and scales well to large-scale image classification tasks.
arXiv Detail & Related papers (2021-03-15T16:16:18Z)
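A sampling-free update of this flavor can be sketched via moment matching: with multiplicative noise eps ~ N(1, alpha) on activations whose elementwise mean is m and variance is s, z = h * eps has E[z] = m and Var[z] = (1 + alpha) s + alpha m^2, which a linear layer then propagates in closed form. The independence assumptions and the `alpha` value below are illustrative, not the paper's exact parameterization:

```python
import numpy as np

def noisy_linear_moments(m, s, W, b, alpha=0.5):
    # z = h * eps with eps ~ N(1, alpha):  E[z] = m,  Var[z] = (1 + alpha)*s + alpha*m^2
    var_z = (1.0 + alpha) * s + alpha * m ** 2
    mean_out = m @ W + b                 # means pass through the linear layer directly
    var_out = var_z @ (W ** 2)           # assuming independent units, variances add
    return mean_out, var_out
```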
- Provable Generalization of SGD-trained Neural Networks of Any Width in the Presence of Adversarial Label Noise [85.59576523297568]
We consider a one-hidden-layer leaky ReLU network of arbitrary width trained by gradient descent.
We prove that SGD produces neural networks that have classification accuracy competitive with that of the best halfspace over the distribution.
arXiv Detail & Related papers (2021-01-04T18:32:49Z)
- Salvage Reusable Samples from Noisy Data for Robust Learning [70.48919625304]
We propose a reusable sample selection and correction approach, termed as CRSSC, for coping with label noise in training deep FG models with web images.
Our key idea is to additionally identify and correct reusable samples, and then leverage them together with clean examples to update the networks.
arXiv Detail & Related papers (2020-08-06T02:07:21Z)
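A common way to instantiate this kind of selection is the small-loss criterion sketched below: treat low-loss samples as clean, relabel confidently predicted higher-loss ones for reuse, and drop the rest. The quantile and confidence thresholds are illustrative placeholders, not CRSSC's actual criteria:

```python
import numpy as np

def select_and_correct(losses, probs, clean_q=0.5, conf_thresh=0.9):
    # losses: per-sample loss values; probs: the model's softmax outputs per sample
    cut = np.quantile(losses, clean_q)
    clean = losses <= cut                         # small-loss samples treated as clean
    reusable = (~clean) & (probs.max(axis=1) >= conf_thresh)
    pseudo_labels = probs.argmax(axis=1)          # corrected labels for reusable samples
    return clean, reusable, pseudo_labels
```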
- Nonconvex regularization for sparse neural networks [0.0]
A nonconvex regularization method is investigated in the context of shallow ReLU networks.
We show that the approximation guarantees and existing bounds on the network size for finite data are maintained.
arXiv Detail & Related papers (2020-04-24T03:03:21Z)
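To make the setting concrete, a generic nonconvex sparsity penalty on the outer weights of a shallow ReLU network looks as follows; the l_p surrogate with p = 0.5 is a stand-in for illustration, not necessarily the regularizer the paper analyzes:

```python
import numpy as np

def lp_penalty(v, p=0.5, eps=1e-8):
    # nonconvex surrogate: sum_i |v_i|^p with 0 < p < 1 promotes exact sparsity
    # in the outer weights more aggressively than the convex l1 norm
    return ((np.abs(v) + eps) ** p).sum()

def regularized_loss(params, X, y, lam=1e-3, p=0.5):
    W, b, v = params                              # shallow net: f(x) = relu(xW + b) @ v
    pred = np.maximum(X @ W + b, 0.0) @ v
    return ((pred - y) ** 2).mean() + lam * lp_penalty(v, p)
```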