Stochastic Gradient MCMC with Multi-Armed Bandit Tuning
- URL: http://arxiv.org/abs/2105.13059v2
- Date: Fri, 28 May 2021 13:49:38 GMT
- Title: Stochastic Gradient MCMC with Multi-Armed Bandit Tuning
- Authors: Jeremie Coullon, Leah South, Christopher Nemeth
- Abstract summary: We propose a novel bandit-based algorithm that tunes SGMCMC hyperparameters to maximize the accuracy of the posterior approximation.
We support our results with experiments on both simulated and real datasets, and find that this method is practical for a wide range of application areas.
- Score: 2.2559617939136505
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Stochastic gradient Markov chain Monte Carlo (SGMCMC) is a popular class of
algorithms for scalable Bayesian inference. However, these algorithms include
hyperparameters such as step size or batch size that influence the accuracy of
estimators based on the obtained samples. As a result, these hyperparameters
must be tuned by the practitioner and currently no principled and automated way
to tune them exists. Standard MCMC tuning methods based on acceptance rates
cannot be used for SGMCMC, thus requiring alternative tools and diagnostics. We
propose a novel bandit-based algorithm that tunes SGMCMC hyperparameters to
maximize the accuracy of the posterior approximation by minimizing the kernel
Stein discrepancy (KSD). We provide theoretical results supporting this
approach and assess alternative metrics to KSD. We support our results with
experiments on both simulated and real datasets, and find that this method is
practical for a wide range of application areas.
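The abstract's idea of treating SGMCMC hyperparameters as bandit arms scored by kernel Stein discrepancy can be sketched in a few lines. This is a minimal illustration, not the paper's algorithm: it assumes a 1-D standard normal target, an epsilon-greedy bandit over three illustrative step sizes for SGLD, and the IMQ Stein kernel; the constants `c`, `beta`, chain length, and arm values are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def score(x):
    # Score function of the N(0, 1) target: grad log p(x) = -x
    return -x

def sgld_chain(step, n=500, x0=0.0):
    # Langevin dynamics with step size `step` (full gradients here,
    # standing in for stochastic gradients)
    x, xs = x0, []
    for _ in range(n):
        x = x + 0.5 * step * score(x) + np.sqrt(step) * rng.standard_normal()
        xs.append(x)
    return np.array(xs)

def ksd(xs, c=1.0, beta=-0.5):
    # V-statistic estimate of KSD with the IMQ kernel
    # k(x, y) = (c^2 + (x - y)^2)^beta; k0 is the Stein kernel
    s = score(xs)
    d = xs[:, None] - xs[None, :]
    u = c**2 + d**2
    k0 = (s[:, None] * s[None, :] * u**beta
          - s[:, None] * 2 * beta * d * u**(beta - 1)
          + s[None, :] * 2 * beta * d * u**(beta - 1)
          - 2 * beta * u**(beta - 1)
          - 4 * beta * (beta - 1) * d**2 * u**(beta - 2))
    return np.sqrt(max(k0.mean(), 0.0))

def tune_step(arms, rounds=20, eps=0.2):
    # Epsilon-greedy bandit: the reward for pulling an arm is the
    # negative KSD of a fresh short SGLD run at that step size
    rewards = {a: [] for a in arms}
    for t in range(rounds):
        if t < len(arms) or rng.random() < eps:
            arm = arms[t % len(arms)]  # explore (each arm tried at least once)
        else:
            arm = max(arms, key=lambda a: np.mean(rewards[a]))  # exploit
        rewards[arm].append(-ksd(sgld_chain(arm)))
    return max(arms, key=lambda a: np.mean(rewards[a]))

best = tune_step([1e-4, 1e-2, 1.0])
```

The paper's method is more sophisticated (and tunes multiple hyperparameters), but the structure is the same: each bandit round runs the sampler under a candidate configuration and scores the resulting samples by KSD, which requires only the score function, not acceptance rates.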
Related papers
- Tuning diagonal scale matrices for HMC [0.0]
Three approaches for adaptively tuning diagonal scale matrices for HMC are discussed and compared.
The common practice of scaling according to estimated marginal standard deviations is taken as a benchmark.
Alternatives include scaling according to the mean log-target gradient (ISG) and a method that targets the frequency with which the underlying Hamiltonian dynamics crosses the respective medians.
arXiv Detail & Related papers (2024-03-12T10:35:40Z) - Online Variational Sequential Monte Carlo [49.97673761305336]
We build upon the variational sequential Monte Carlo (VSMC) method, which provides computationally efficient and accurate model parameter estimation and Bayesian latent-state inference.
Online VSMC performs both parameter estimation and particle proposal adaptation efficiently and entirely on-the-fly.
arXiv Detail & Related papers (2023-12-19T21:45:38Z) - Online Probabilistic Model Identification using Adaptive Recursive MCMC [8.465242072268019]
We suggest the Adaptive Recursive Markov Chain Monte Carlo (ARMCMC) method.
It eliminates the shortcomings of conventional online techniques while computing the entire probability density function of model parameters.
We demonstrate our approach using parameter estimation in a soft bending actuator and the Hunt-Crossley dynamic model.
arXiv Detail & Related papers (2022-10-23T02:06:48Z) - Optimization of Annealed Importance Sampling Hyperparameters [77.34726150561087]
Annealed Importance Sampling (AIS) is a popular algorithm used to estimate the intractable marginal likelihood of deep generative models.
We present a parametric AIS process with flexible intermediary distributions and optimize the bridging distributions to use fewer steps for sampling.
We assess the performance of our optimized AIS for marginal likelihood estimation of deep generative models and compare it to other estimators.
arXiv Detail & Related papers (2022-09-27T07:58:25Z) - Sparse high-dimensional linear regression with a partitioned empirical
Bayes ECM algorithm [62.997667081978825]
We propose a computationally efficient and powerful Bayesian approach for sparse high-dimensional linear regression.
Minimal prior assumptions on the parameters are used through the use of plug-in empirical Bayes estimates.
The proposed approach is implemented in the R package probe.
arXiv Detail & Related papers (2022-09-16T19:15:50Z) - Scalable Variational Gaussian Processes via Harmonic Kernel
Decomposition [54.07797071198249]
We introduce a new scalable variational Gaussian process approximation which provides a high fidelity approximation while retaining general applicability.
We demonstrate that, on a range of regression and classification problems, our approach can exploit input space symmetries such as translations and reflections.
Notably, our approach achieves state-of-the-art results on CIFAR-10 among pure GP models.
arXiv Detail & Related papers (2021-06-10T18:17:57Z) - An adaptive Hessian approximated stochastic gradient MCMC method [12.93317525451798]
We present an adaptive Hessian approximated gradient MCMC method to incorporate local geometric information while sampling from the posterior.
We adopt a magnitude-based weight pruning method to enforce the sparsity of the network.
arXiv Detail & Related papers (2020-10-03T16:22:15Z) - Non-convex Learning via Replica Exchange Stochastic Gradient MCMC [25.47669573608621]
We propose an adaptive replica exchange SGMCMC (reSGMCMC) to automatically correct the bias and study the corresponding properties.
Empirically, we test the algorithm through extensive experiments on a variety of setups.
arXiv Detail & Related papers (2020-08-12T15:02:59Z) - A Kernel-Based Approach to Non-Stationary Reinforcement Learning in
Metric Spaces [53.47210316424326]
KeRNS is an algorithm for episodic reinforcement learning in non-stationary Markov Decision Processes.
We prove a regret bound that scales with the covering dimension of the state-action space and the total variation of the MDP with time.
arXiv Detail & Related papers (2020-07-09T21:37:13Z) - Improving Sampling Accuracy of Stochastic Gradient MCMC Methods via
Non-uniform Subsampling of Gradients [54.90670513852325]
We propose a non-uniform subsampling scheme to improve the sampling accuracy.
EWSG is designed so that a non-uniform gradient-MCMC method mimics the statistical behavior of a batch-gradient-MCMC method.
In our practical implementation of EWSG, the non-uniform subsampling is performed efficiently via a Metropolis-Hastings chain on the data index.
arXiv Detail & Related papers (2020-02-20T18:56:18Z)
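The EWSG entry above describes running a Metropolis-Hastings chain on the data index to draw non-uniform subsamples. A minimal sketch of that mechanism, assuming a uniform proposal over indices and arbitrary example weights (not the paper's gradient-based weights):

```python
import numpy as np

rng = np.random.default_rng(1)

def mh_index_chain(weights, n_steps, i0=0):
    # MH chain on {0, ..., N-1}: propose j uniformly at random and
    # accept with probability min(1, w_j / w_i), so the chain visits
    # index i with stationary probability proportional to weights[i]
    counts = np.zeros(len(weights), dtype=int)
    i = i0
    for _ in range(n_steps):
        j = rng.integers(len(weights))
        if rng.random() < weights[j] / weights[i]:
            i = j
        counts[i] += 1
    return counts

# Heavily weighted last index should dominate the visit counts
counts = mh_index_chain(np.array([1.0, 1.0, 1.0, 10.0]), 20000)
```

The appeal of this construction is that each step costs O(1) regardless of dataset size, avoiding the normalization over all N weights that exact non-uniform sampling would require.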
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.