Coreset Markov Chain Monte Carlo
- URL: http://arxiv.org/abs/2310.17063v2
- Date: Sat, 9 Mar 2024 04:42:02 GMT
- Title: Coreset Markov Chain Monte Carlo
- Authors: Naitong Chen, Trevor Campbell
- Abstract summary: State-of-the-art methods for tuning coreset weights are expensive, require nontrivial user input, and impose constraints on the model.
We propose a new method -- Coreset MCMC -- that simulates a Markov chain targeting the coreset posterior, while simultaneously updating the coreset weights.
- Score: 15.310842498680483
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A Bayesian coreset is a small, weighted subset of data that replaces the full dataset during inference in order to reduce computational cost. However, state-of-the-art methods for tuning coreset weights are expensive, require nontrivial user input, and impose constraints on the model. In this work, we propose a new method -- Coreset MCMC -- that simulates a Markov chain targeting the coreset posterior, while simultaneously updating the coreset weights using those same draws. Coreset MCMC is simple to implement and tune, and can be used with any existing MCMC kernel. We analyze Coreset MCMC in a representative setting to obtain key insights about the convergence behaviour of the method. Empirical results demonstrate that Coreset MCMC provides higher-quality posterior approximations and reduced computational cost compared with other coreset construction methods. Further, compared with other general subsampling MCMC methods, we find that Coreset MCMC has higher sampling efficiency with competitively accurate posterior approximations.
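The abstract describes the core loop at a high level: interleave MCMC steps targeting the coreset posterior under the current weights with weight updates computed from those same draws. The sketch below illustrates only that alternating structure; the 1-D Gaussian model, random-walk Metropolis kernel, minibatch size, learning rate, and least-squares surrogate for the weight update are illustrative assumptions, not the estimator from the paper (which uses draws from multiple chains to estimate the gradient of its objective).

```python
# Minimal sketch of the Coreset MCMC loop (illustrative assumptions throughout):
# alternate (1) one MCMC step targeting the coreset posterior pi_w with
# (2) a stochastic-gradient update of the coreset weights using the same draw.
import numpy as np

rng = np.random.default_rng(0)
N, M = 10_000, 20                      # full data size, coreset size
data = rng.normal(1.0, 1.0, size=N)    # toy model: y_i ~ N(theta, 1), prior theta ~ N(0, 1)
core = data[rng.choice(N, size=M, replace=False)]
w = np.full(M, N / M)                  # uniform initial weights summing to N

def loglik(theta, y):
    # per-datum Gaussian log-likelihood, up to an additive constant
    return -0.5 * (y - theta) ** 2

def coreset_logpost(theta):
    # log coreset posterior: log prior + weighted coreset log-likelihood
    return -0.5 * theta ** 2 + np.dot(w, loglik(theta, core))

theta, step, lr = 0.0, 0.02, 1e-4
for t in range(5_000):
    # (1) one random-walk Metropolis step targeting the current coreset posterior
    prop = theta + step * rng.normal()
    if np.log(rng.uniform()) < coreset_logpost(prop) - coreset_logpost(theta):
        theta = prop
    # (2) weight update from the same draw: a simple surrogate that pushes the
    # weighted coreset log-likelihood toward a minibatch estimate of the
    # full-data log-likelihood at theta (gradient descent on 0.5 * resid**2)
    batch = data[rng.choice(N, size=200, replace=False)]
    resid = (N / 200) * loglik(theta, batch).sum() - np.dot(w, loglik(theta, core))
    w += lr * resid * loglik(theta, core)
    w = np.maximum(w, 0.0)             # keep weights nonnegative

print(f"final draw = {theta:.3f}, total coreset weight = {w.sum():.1f}")
```

Because the weights are adapted from the chain's own draws, the Metropolis step above could be swapped for any existing MCMC kernel, which is the practical appeal the abstract highlights.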
Related papers
- Tuning-free coreset Markov chain Monte Carlo [14.360996967498]
A Bayesian coreset is a small, weighted subset of a data set that replaces the full data during inference to reduce computational cost.
Coreset Markov chain Monte Carlo (Coreset MCMC) uses draws from an adaptive Markov chain targeting the coreset to train the coreset weights.
We propose a learning-rate-free gradient optimization procedure, Hot-start Distance over Gradient (Hot DoG), for training the coreset weights.
Empirical results demonstrate that Hot DoG provides higher quality posterior approximations than other learning-rate-free gradient methods.
arXiv Detail & Related papers (2024-10-24T17:59:23Z) - MCMC-driven learning [64.94438070592365]
This paper is intended to appear as a chapter for the Handbook of Markov Chain Monte Carlo.
The goal of this paper is to unify various problems at the intersection of Markov chain Monte Carlo and machine learning.
arXiv Detail & Related papers (2024-02-14T22:10:42Z) - Refined Coreset Selection: Towards Minimal Coreset Size under Model Performance Constraints [69.27190330994635]
Coreset selection is powerful in reducing computational costs and accelerating data processing for deep learning algorithms.
We propose a method that maintains a priority order over model performance and coreset size during optimization.
Empirically, extensive experiments confirm its superiority, often yielding better model performance with smaller coreset sizes.
arXiv Detail & Related papers (2023-11-15T03:43:04Z) - Tangent Model Composition for Ensembling and Continual Fine-tuning [69.92177580782929]
Tangent Model Composition (TMC) is a method to combine component models independently fine-tuned around a pre-trained point.
TMC improves accuracy by 4.2% compared to ensembling non-linearly fine-tuned models.
arXiv Detail & Related papers (2023-07-16T17:45:33Z) - Knowledge Removal in Sampling-based Bayesian Inference [86.14397783398711]
When even a single data deletion request arrives, companies may need to discard entire models learned with massive resources.
Existing works propose methods to remove knowledge learned from data for explicitly parameterized models.
In this paper, we propose the first machine unlearning algorithm for MCMC.
arXiv Detail & Related papers (2022-03-24T10:03:01Z) - DG-LMC: A Turn-key and Scalable Synchronous Distributed MCMC Algorithm [21.128416842467132]
We derive a user-friendly centralised distributed MCMC algorithm with provable scaling in high-dimensional settings.
We illustrate the relevance of the proposed methodology on both synthetic and real data experiments.
arXiv Detail & Related papers (2021-06-11T10:37:14Z) - Stochastic Gradient MCMC with Multi-Armed Bandit Tuning [2.2559617939136505]
We propose a novel bandit-based algorithm that tunes SGMCMC hyperparameters to maximize the accuracy of the posterior approximation.
We support our results with experiments on both simulated and real datasets, and find that this method is practical for a wide range of application areas.
arXiv Detail & Related papers (2021-05-27T11:00:31Z) - What Are Bayesian Neural Network Posteriors Really Like? [63.950151520585024]
We show that Hamiltonian Monte Carlo can achieve significant performance gains over standard training and deep ensembles.
We also show that deep ensemble predictive distributions are similarly close to HMC as standard SGLD, and closer than standard variational inference.
arXiv Detail & Related papers (2021-04-29T15:38:46Z) - Scaling Hamiltonian Monte Carlo Inference for Bayesian Neural Networks with Symmetric Splitting [6.684193501969829]
Hamiltonian Monte Carlo (HMC) is a Markov chain Monte Carlo approach that exhibits favourable exploration properties in high-dimensional models such as neural networks.
We introduce a new symmetric integration scheme for split HMC that does not rely on stochastic gradients.
Our approach demonstrates HMC as a feasible option when considering inference schemes for large-scale machine learning problems.
arXiv Detail & Related papers (2020-10-14T01:58:34Z) - Non-convex Learning via Replica Exchange Stochastic Gradient MCMC [25.47669573608621]
We propose an adaptive replica exchange SGMCMC (reSGMCMC) to automatically correct the bias and study the corresponding properties.
Empirically, we test the algorithm through extensive experiments on various setups and obtain state-of-the-art results.
arXiv Detail & Related papers (2020-08-12T15:02:59Z) - On Coresets for Support Vector Machines [61.928187390362176]
A coreset is a small, representative subset of the original data points.
We show that our algorithm can be used to extend the applicability of any off-the-shelf SVM solver to streaming, distributed, and dynamic data settings.
arXiv Detail & Related papers (2020-02-15T23:25:12Z)