On Divergence Measures for Bayesian Pseudocoresets
- URL: http://arxiv.org/abs/2210.06205v1
- Date: Wed, 12 Oct 2022 13:45:36 GMT
- Title: On Divergence Measures for Bayesian Pseudocoresets
- Authors: Balhae Kim, Jungwon Choi, Seanie Lee, Yoonho Lee, Jung-Woo Ha, Juho Lee
- Abstract summary: A Bayesian pseudocoreset is a small synthetic dataset for which the posterior over parameters approximates that of the original dataset.
This paper casts two representative dataset distillation algorithms as approximations to methods for constructing pseudocoresets.
We provide a unifying view of such divergence measures in Bayesian pseudocoreset construction.
- Score: 28.840995981326028
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A Bayesian pseudocoreset is a small synthetic dataset for which the posterior
over parameters approximates that of the original dataset. While promising, the
scalability of Bayesian pseudocoresets has not yet been validated in realistic
problems such as image classification with deep neural networks. On the other
hand, dataset distillation methods similarly construct a small dataset such
that the optimization using the synthetic dataset converges to a solution with
performance competitive with optimization using the full data. Although dataset
distillation has been empirically verified in large-scale settings, the
framework is restricted to point estimates, and its adaptation to Bayesian
inference has not been explored. This paper casts two representative dataset
distillation algorithms as approximations to methods for constructing
pseudocoresets by minimizing specific divergence measures: reverse KL
divergence and Wasserstein distance. Furthermore, we provide a unifying view of
such divergence measures in Bayesian pseudocoreset construction. Finally, we
propose a novel Bayesian pseudocoreset algorithm based on minimizing forward KL
divergence. Our empirical results demonstrate that the pseudocoresets
constructed from these methods reflect the true posterior even in
high-dimensional Bayesian inference problems.
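For concreteness, the objectives the abstract refers to can be sketched as follows. This is a paraphrase of the standard pseudocoreset setup in my own notation, not notation taken from the paper; the paper's exact parameterization and gradient estimators may differ.

```latex
% Posteriors induced by the full data x = (x_1, ..., x_N) and by a
% pseudocoreset u = (u_1, ..., u_M) with M << N:
\pi_x(\theta) \propto \pi_0(\theta) \prod_{n=1}^{N} p(x_n \mid \theta),
\qquad
\pi_u(\theta) \propto \pi_0(\theta) \prod_{m=1}^{M} p(u_m \mid \theta).

% Pseudocoreset construction: pick the synthetic points u so that \pi_u is
% close to \pi_x under some divergence D,
u^\star = \operatorname*{arg\,min}_{u} \; D(\pi_x, \pi_u),

% where the abstract's three choices are the reverse KL, the Wasserstein
% distance, and the proposed forward KL:
D \in \bigl\{ \mathrm{KL}(\pi_u \,\|\, \pi_x),\; W(\pi_x, \pi_u),\;
              \mathrm{KL}(\pi_x \,\|\, \pi_u) \bigr\}.

% For the forward KL, a standard manipulation (differentiating through the
% normalizing constant of \pi_u) gives a gradient that only needs expectations
% under the two posteriors, with \log p(u \mid \theta) = \sum_m \log p(u_m \mid \theta):
\nabla_u \, \mathrm{KL}(\pi_x \,\|\, \pi_u)
= \mathbb{E}_{\pi_u}\!\left[ \nabla_u \log p(u \mid \theta) \right]
- \mathbb{E}_{\pi_x}\!\left[ \nabla_u \log p(u \mid \theta) \right].
```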
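As a self-contained numerical illustration (a toy sketch of the forward-KL idea, not the paper's algorithm, which targets deep networks where the posteriors and the expectations above must be approximated): in a conjugate Gaussian-mean model both posteriors are Gaussian in closed form, so one can run gradient descent on the exact forward KL and watch the pseudocoreset posterior line up with the full-data posterior.

```python
# Toy illustration (my own construction): build a Bayesian pseudocoreset for a
# conjugate Gaussian-mean model by gradient descent on the forward KL between
# the full-data posterior and the pseudocoreset posterior. Both posteriors are
# Gaussian in closed form here, so the divergence and its gradient are exact.
import numpy as np

rng = np.random.default_rng(0)
d, N, M = 2, 1000, 10          # parameter dim, full-data size, pseudocoreset size
sigma2, tau2 = 1.0, 4.0        # known likelihood variance, prior variance

theta_true = rng.normal(size=d)
x = theta_true + np.sqrt(sigma2) * rng.normal(size=(N, d))   # full dataset

def posterior(data):
    """Closed-form Gaussian posterior N(mu, I/prec) for the unknown mean."""
    n = data.shape[0]
    prec = 1.0 / tau2 + n / sigma2
    mu = data.sum(axis=0) / sigma2 / prec
    return mu, prec

def gaussian_kl(mu_p, prec_p, mu_q, prec_q, dim):
    """KL(N(mu_p, I/prec_p) || N(mu_q, I/prec_q)) for isotropic Gaussians."""
    return 0.5 * (dim * prec_q / prec_p
                  + prec_q * np.sum((mu_q - mu_p) ** 2)
                  - dim + dim * np.log(prec_p / prec_q))

mu_x, prec_x = posterior(x)
u = rng.normal(size=(M, d))            # initialize the pseudocoreset points

for step in range(500):                # gradient descent on the forward KL
    mu_u, prec_u = posterior(u)
    # d KL / d u_m = (mu_u - mu_x) / sigma2  (only the posterior mean depends on u)
    grad = (mu_u - mu_x) / sigma2
    u = u - 0.5 * grad                 # same gradient for every pseudocoreset point

mu_u, prec_u = posterior(u)
print("full posterior mean         ", mu_x)
print("pseudocoreset posterior mean", mu_u)
print("forward KL after optimization:",
      gaussian_kl(mu_x, prec_x, mu_u, prec_u, d))
```

With M = 10 synthetic points against N = 1000 observations, the posterior means match after optimization, while a residual KL remains because the pseudocoreset posterior's precision is capped at 1/tau2 + M/sigma2; in this toy model only the posterior mean can be controlled through u.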
Related papers
- Total Uncertainty Quantification in Inverse PDE Solutions Obtained with Reduced-Order Deep Learning Surrogate Models [50.90868087591973]
We propose an approximate Bayesian method for quantifying the total uncertainty in inverse PDE solutions obtained with machine learning surrogate models.
We test the proposed framework by comparing it with the iterative ensemble smoother and deep ensembling methods for a non-linear diffusion equation.
arXiv Detail & Related papers (2024-08-20T19:06:02Z)
- Function Space Bayesian Pseudocoreset for Bayesian Neural Networks [16.952160718249292]
A Bayesian pseudocoreset is a compact synthetic dataset summarizing essential information of a large-scale dataset.
In this paper, we propose a novel Bayesian pseudocoreset construction method that operates on a function space.
By working directly on the function space, our method could bypass several challenges that may arise when working on a weight space.
arXiv Detail & Related papers (2023-10-27T02:04:31Z)
- Bayesian Pseudo-Coresets via Contrastive Divergence [5.479797073162603]
We introduce a novel approach for constructing pseudo-coresets by utilizing contrastive divergence.
It eliminates the need for approximations in the pseudo-coreset construction process.
We conduct extensive experiments on multiple datasets, demonstrating its superiority over existing Bayesian pseudo-coreset (BPC) techniques.
arXiv Detail & Related papers (2023-03-20T17:13:50Z)
- Minimizing the Accumulated Trajectory Error to Improve Dataset Distillation [151.70234052015948]
We propose a novel approach that encourages the optimization algorithm to seek a flat trajectory.
We show that weights trained on synthetic data are robust against accumulated-error perturbations when regularized towards a flat trajectory.
Our method, called Flat Trajectory Distillation (FTD), is shown to boost the performance of gradient-matching methods by up to 4.7%.
arXiv Detail & Related papers (2022-11-20T15:49:11Z)
- Bayesian inference via sparse Hamiltonian flows [16.393322369105864]
A Bayesian coreset is a small, weighted subset of data that replaces the full dataset during Bayesian inference (a generic form of this weighted-coreset posterior is sketched after this list).
Current methods tend to be slow, require a secondary inference step after coreset construction, and do not provide bounds on the data marginal evidence.
We introduce a new method -- sparse Hamiltonian flows -- that addresses all three of these challenges.
arXiv Detail & Related papers (2022-03-11T02:36:59Z)
- Semiparametric Bayesian Networks [5.205440005969871]
We introduce semiparametric Bayesian networks that combine parametric and nonparametric conditional probability distributions.
Their aim is to incorporate the bounded complexity of parametric models and the flexibility of nonparametric ones.
arXiv Detail & Related papers (2021-09-07T11:47:32Z)
- Differentiable Annealed Importance Sampling and the Perils of Gradient Noise [68.44523807580438]
Annealed importance sampling (AIS) and related algorithms are highly effective tools for marginal likelihood estimation.
Differentiability is a desirable property as it would admit the possibility of optimizing marginal likelihood as an objective.
We propose a differentiable algorithm by abandoning Metropolis-Hastings steps, which further unlocks mini-batch computation (a toy AIS estimator with unadjusted transitions is sketched after this list).
arXiv Detail & Related papers (2021-07-21T17:10:14Z)
- Evaluating State-of-the-Art Classification Models Against Bayes Optimality [106.50867011164584]
We show that we can compute the exact Bayes error of generative models learned using normalizing flows.
We use our approach to conduct a thorough investigation of state-of-the-art classification models.
arXiv Detail & Related papers (2021-06-07T06:21:20Z)
- Model Fusion with Kullback--Leibler Divergence [58.20269014662046]
We propose a method to fuse posterior distributions learned from heterogeneous datasets.
Our algorithm relies on a mean field assumption for both the fused model and the individual dataset posteriors.
arXiv Detail & Related papers (2020-07-13T03:27:45Z)
- Bayesian Coresets: Revisiting the Nonconvex Optimization Perspective [30.963638533636352]
We propose and analyze a novel algorithm for coreset selection.
We provide explicit convergence rate guarantees and present an empirical evaluation on a variety of benchmark datasets.
arXiv Detail & Related papers (2020-07-01T19:34:59Z)
- Spatially Adaptive Inference with Stochastic Feature Sampling and Interpolation [72.40827239394565]
We propose to compute features only at sparsely sampled locations.
We then densely reconstruct the feature map with an efficient procedure.
The presented network is experimentally shown to save substantial computation while maintaining accuracy over a variety of computer vision tasks.
arXiv Detail & Related papers (2020-03-19T15:36:31Z)
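For the "Bayesian inference via sparse Hamiltonian flows" entry above, the weighted-coreset posterior it starts from can be written as below. This is the generic Bayesian coreset formulation in my own notation, not that paper's specific construction.

```latex
% Generic weighted Bayesian coreset: keep M << N of the original points
% x_{i_1}, ..., x_{i_M} with nonnegative weights w and use the reweighted posterior
\pi_w(\theta) \;\propto\; \pi_0(\theta)\,
\exp\!\Bigl( \textstyle\sum_{m=1}^{M} w_m \log p(x_{i_m} \mid \theta) \Bigr),
\qquad w_m \ge 0,
% with the subset and weights chosen so that \pi_w approximates the full-data
% posterior. A Bayesian pseudocoreset (main paper above) instead frees the
% points themselves: its u_m are synthetic and need not come from the dataset.
```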
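For the "Differentiable Annealed Importance Sampling and the Perils of Gradient Noise" entry: a toy AIS estimator of the marginal likelihood for a conjugate Gaussian model, written for illustration here and not taken from that paper. It uses unadjusted Langevin transitions in place of Metropolis-Hastings steps, which keeps every operation differentiable but introduces a small bias; the exact log evidence is available in closed form for this model, so the two can be compared directly.

```python
# Toy illustration (my own construction): annealed importance sampling (AIS)
# for the marginal likelihood of a conjugate Gaussian-mean model, using
# unadjusted Langevin transitions instead of Metropolis-Hastings corrections.
import numpy as np

rng = np.random.default_rng(1)
N, sigma2, tau2 = 20, 1.0, 1.0
x = 0.7 + np.sqrt(sigma2) * rng.normal(size=N)      # observations, true mean 0.7

# Closed-form log evidence of the conjugate model, for reference.
post_prec = 1.0 / tau2 + N / sigma2
post_mean = x.sum() / sigma2 / post_prec
log_Z_exact = (-0.5 * N * np.log(2 * np.pi * sigma2)
               - 0.5 * np.log(tau2 * post_prec)
               + 0.5 * post_prec * post_mean ** 2
               - 0.5 * np.sum(x ** 2) / sigma2)

def log_lik(theta):
    """Log likelihood sum_n log N(x_n; theta, sigma2), vectorized over chains."""
    return (-0.5 * np.sum((x[None, :] - theta[:, None]) ** 2, axis=1) / sigma2
            - 0.5 * N * np.log(2 * np.pi * sigma2))

K, chains, eps = 300, 400, 0.02                      # annealing steps, chains, step size
betas = np.linspace(0.0, 1.0, K + 1)

theta = np.sqrt(tau2) * rng.normal(size=chains)      # theta_0 drawn from the prior
log_w = np.zeros(chains)
for k in range(1, K + 1):
    # accumulate the AIS incremental weight gamma_k / gamma_{k-1} at theta_{k-1}
    log_w += (betas[k] - betas[k - 1]) * log_lik(theta)
    # one unadjusted Langevin step targeting gamma_k = prior * likelihood^beta_k
    grad = -theta / tau2 + betas[k] * np.sum(x[None, :] - theta[:, None], axis=1) / sigma2
    theta = theta + 0.5 * eps * grad + np.sqrt(eps) * rng.normal(size=chains)

log_Z_ais = np.logaddexp.reduce(log_w) - np.log(chains)
print("exact log evidence:", log_Z_exact)
print("AIS estimate      :", log_Z_ais)
```

Shrinking the Langevin step size reduces the bias that comes from skipping the accept/reject correction, at the cost of slower mixing along the annealing path.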