Function Space Bayesian Pseudocoreset for Bayesian Neural Networks
- URL: http://arxiv.org/abs/2310.17852v1
- Date: Fri, 27 Oct 2023 02:04:31 GMT
- Title: Function Space Bayesian Pseudocoreset for Bayesian Neural Networks
- Authors: Balhae Kim, Hyungi Lee, Juho Lee
- Abstract summary: A Bayesian pseudocoreset is a compact synthetic dataset summarizing essential information of a large-scale dataset.
In this paper, we propose a novel Bayesian pseudocoreset construction method that operates on a function space.
By working directly in function space, our method can bypass several challenges that arise when working in weight space.
- Score: 16.952160718249292
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A Bayesian pseudocoreset is a compact synthetic dataset summarizing essential
information of a large-scale dataset and thus can be used as a proxy dataset
for scalable Bayesian inference. Typically, a Bayesian pseudocoreset is
constructed by minimizing a divergence measure between the posterior
conditioning on the pseudocoreset and the posterior conditioning on the full
dataset. However, evaluating the divergence can be challenging, particularly
for models such as deep neural networks with high-dimensional parameters. In
this paper, we propose a novel Bayesian pseudocoreset construction method that
operates on a function space. Unlike previous methods, which construct and
match the coreset and full data posteriors in the space of model parameters
(weights), our method constructs variational approximations to the coreset
posterior on a function space and matches it to the full data posterior in the
function space. By working directly in function space, our method can
bypass several challenges that arise when working in weight space,
including limited scalability and multi-modality issues. Through various
experiments, we demonstrate that the Bayesian pseudocoresets constructed by
our method enjoy enhanced uncertainty quantification and better robustness
across various model architectures.
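In symbols, and using the notation standard in the Bayesian pseudocoreset literature (an assumption, since the abstract gives no formulas), the construction solves

    u^\ast = \operatorname*{argmin}_{u}\; D\big(\pi_u(\theta),\, \pi_x(\theta)\big),

where \pi_u and \pi_x denote the posteriors conditioned on the pseudocoreset u and on the full dataset x, and D is a divergence such as the KL.
The abstract's key move is to compare these posteriors through the functions they induce rather than through weights. Below is a minimal, illustrative sketch of what matching in function space can look like; it is a stand-in under stated assumptions, not the paper's actual algorithm. It assumes we already hold function samples (network outputs evaluated on shared probe inputs) from each posterior, and it uses a simple moment-matching divergence; the function name, tensor shapes, and the specific divergence are all assumptions.

    import torch

    def function_space_match_loss(coreset_fn_samples: torch.Tensor,
                                  full_fn_samples: torch.Tensor) -> torch.Tensor:
        # coreset_fn_samples: (S1, B, C) outputs on B shared probe inputs from
        #   S1 networks sampled from the posterior conditioned on the coreset.
        # full_fn_samples: (S2, B, C) outputs on the same probe inputs from
        #   networks sampled from the full-data posterior.
        # Compare the first two moments of the predictive functions: a simple
        # moment-matching stand-in for the divergence used in the paper.
        mean_gap = (coreset_fn_samples.mean(0) - full_fn_samples.mean(0)).pow(2).mean()
        var_gap = (coreset_fn_samples.var(0) - full_fn_samples.var(0)).pow(2).mean()
        return mean_gap + var_gap

The comparison's dimensionality is set by the number of probe points and outputs rather than by the number of weights, which is the scalability point the abstract makes; it also sidesteps weight-space multi-modality, since distinct weight vectors realizing the same function are treated as equal.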
Related papers
- Diffusion posterior sampling for simulation-based inference in tall data settings [53.17563688225137]
Simulation-based inference (SBI) approximates the posterior distribution that relates input parameters to a given observation.
In this work, we consider a tall data extension in which multiple observations are available to better infer the parameters of the model.
We compare our method to recently proposed competing approaches on various numerical experiments and demonstrate its superiority in terms of numerical stability and computational cost.
arXiv Detail & Related papers (2024-04-11T09:23:36Z) - VTAE: Variational Transformer Autoencoder with Manifolds Learning [144.0546653941249]
Deep generative models have demonstrated successful applications in learning non-linear data distributions through a number of latent variables.
The nonlinearity of the generator implies that the latent space provides an unsatisfactory projection of the data space, which results in poor representation learning.
We show that geodesics and accurate computation can substantially improve the performance of deep generative models.
arXiv Detail & Related papers (2023-04-03T13:13:19Z) - Bayesian Pseudo-Coresets via Contrastive Divergence [5.479797073162603]
We introduce a novel approach for constructing pseudo-coresets by utilizing contrastive divergence.
It eliminates the need for approximations in the pseudo-coreset construction process.
We conduct extensive experiments on multiple datasets, demonstrating its superiority over existing BPC techniques.
arXiv Detail & Related papers (2023-03-20T17:13:50Z) - Provable Data Subset Selection For Efficient Neural Network Training [73.34254513162898]
We introduce the first algorithm to construct coresets for RBFNNs, i.e., small weighted subsets that approximate the loss of the input data on any radial basis function network.
We then perform empirical evaluations on function approximation and dataset subset selection on popular network architectures and data sets.
arXiv Detail & Related papers (2023-03-09T10:08:34Z) - Bayesian Interpolation with Deep Linear Networks [92.1721532941863]
Characterizing how neural network depth, width, and dataset size jointly impact model quality is a central problem in deep learning theory.
We show that linear networks make provably optimal predictions at infinite depth.
We also show that with data-agnostic priors, Bayesian model evidence in wide linear networks is maximized at infinite depth.
arXiv Detail & Related papers (2022-12-29T20:57:46Z) - Black-box Coreset Variational Inference [13.892427580424444]
We present a black-box variational inference framework for coresets to enable principled application of variational coresets to intractable models.
We apply our techniques to supervised learning problems, and compare them with existing approaches in the literature for data summarization and inference.
arXiv Detail & Related papers (2022-11-04T11:12:09Z) - On Divergence Measures for Bayesian Pseudocoresets [28.840995981326028]
A Bayesian pseudocoreset is a small synthetic dataset for which the posterior over parameters approximates that of the original dataset.
This paper casts two representative dataset distillation algorithms as approximations to methods for constructing pseudocoresets.
We provide a unifying view of such divergence measures in Bayesian pseudocoreset construction.
arXiv Detail & Related papers (2022-10-12T13:45:36Z) - Feature Space Particle Inference for Neural Network Ensembles [13.392254060510666]
Particle-based inference methods offer a promising Bayesian approach to neural network ensembles.
We propose optimizing particles in the feature space where the activation of a specific intermediate layer lies.
Our method encourages each member to capture distinct features, which is expected to improve ensemble prediction robustness.
arXiv Detail & Related papers (2022-06-02T09:16:26Z) - $\beta$-Cores: Robust Large-Scale Bayesian Data Summarization in the
Presence of Outliers [14.918826474979587]
The quality of classic Bayesian inference depends critically on whether observations conform with the assumed data generating model.
We propose a variational inference method that, in a principled way, can simultaneously scale to large datasets and remain robust to outliers.
We illustrate the applicability of our approach in diverse simulated and real datasets, and various statistical models.
arXiv Detail & Related papers (2020-08-31T13:47:12Z) - Model Fusion with Kullback--Leibler Divergence [58.20269014662046]
We propose a method to fuse posterior distributions learned from heterogeneous datasets.
Our algorithm relies on a mean-field assumption for both the fused model and the individual dataset posteriors; a minimal mean-field fusion sketch appears after this list.
arXiv Detail & Related papers (2020-07-13T03:27:45Z) - Spatially Adaptive Inference with Stochastic Feature Sampling and
Interpolation [72.40827239394565]
We propose to compute features only at sparsely sampled locations.
We then densely reconstruct the feature map with an efficient procedure.
The presented network is experimentally shown to save substantial computation while maintaining accuracy over a variety of computer vision tasks.
arXiv Detail & Related papers (2020-03-19T15:36:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.