Related papers: AdamMCMC: Combining Metropolis Adjusted Langevin with Momentum-based Optimization

AdamMCMC: Combining Metropolis Adjusted Langevin with Momentum-based Optimization

URL: http://arxiv.org/abs/2312.14027v2
Date: Thu, 15 Aug 2024 18:00:14 GMT
Title: AdamMCMC: Combining Metropolis Adjusted Langevin with Momentum-based Optimization
Authors: Sebastian Bieringer, Gregor Kasieczka, Maximilian F. Steffen, Mathias Trabs,
Abstract summary: Uncertainty estimation is a key issue when considering the application of deep neural network methods in science and engineering. We introduce a novel algorithm that quantifies uncertainty via Monte Carlo sampling from a tempered posterior distribution.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Uncertainty estimation is a key issue when considering the application of deep neural network methods in science and engineering. In this work, we introduce a novel algorithm that quantifies epistemic uncertainty via Monte Carlo sampling from a tempered posterior distribution. It combines the well established Metropolis Adjusted Langevin Algorithm (MALA) with momentum-based optimization using Adam and leverages a prolate proposal distribution, to efficiently draw from the posterior. We prove that the constructed chain admits the Gibbs posterior as an invariant distribution and converges to this Gibbs posterior in total variation distance. Furthermore, we demonstrate the efficiency of the resulting algorithm and the merit of the proposed changes on a state-of-the-art classifier from high-energy particle physics.

Related papers

Neural Network Approach to Stochastic Dynamics for Smooth Multimodal Density Estimation [0.0]
We extent Metropolis-Adjusted Langevin Diffusion algorithm by modelling the Eigenity of precondition matrix as a random matrix. The proposed method provides fully adaptation mechanisms to tune proposal densities to exploits and adapts the geometry of local structures of statistical models.
arXiv Detail & Related papers (2025-03-22T16:17:12Z)
Improving sample efficiency of high dimensional Bayesian optimization with MCMC [7.241485121318798]
We propose a new method based on Markov Chain Monte Carlo to efficiently sample from an approximated posterior. We show experimentally that both the Metropolis-Hastings and the Langevin Dynamics version of our algorithm outperform state-of-the-art methods in high-dimensional sequential optimization and reinforcement learning benchmarks.
arXiv Detail & Related papers (2024-01-05T05:56:42Z)
Symmetric Mean-field Langevin Dynamics for Distributional Minimax Problems [78.96969465641024]
We extend mean-field Langevin dynamics to minimax optimization over probability distributions for the first time with symmetric and provably convergent updates. We also study time and particle discretization regimes and prove a new uniform-in-time propagation of chaos result.
arXiv Detail & Related papers (2023-12-02T13:01:29Z)
Proximal Algorithms for Accelerated Langevin Dynamics [57.08271964961975]
We develop a novel class of MCMC algorithms based on a stochastized Nesterov scheme. We show superior performance of the proposed method over typical Langevin samplers for different models in statistics and image processing.
arXiv Detail & Related papers (2023-11-24T19:56:01Z)
Maximum Weight Entropy [6.821961232645206]
This paper deals with uncertainty quantification and out-of-distribution detection in deep learning using Bayesian and ensemble methods. Considering neural networks, a practical optimization is derived to build such a distribution, defined as a trade-off between the average empirical risk and the weight distribution entropy.
arXiv Detail & Related papers (2023-09-27T14:46:10Z)
Noise-Free Sampling Algorithms via Regularized Wasserstein Proximals [3.4240632942024685]
We consider the problem of sampling from a distribution governed by a potential function. This work proposes an explicit score based MCMC method that is deterministic, resulting in a deterministic evolution for particles.
arXiv Detail & Related papers (2023-08-28T23:51:33Z)
Variational Linearized Laplace Approximation for Bayesian Deep Learning [11.22428369342346]
We propose a new method for approximating Linearized Laplace Approximation (LLA) using a variational sparse Gaussian Process (GP) Our method is based on the dual RKHS formulation of GPs and retains, as the predictive mean, the output of the original DNN. It allows for efficient optimization, which results in sub-linear training time in the size of the training dataset.
arXiv Detail & Related papers (2023-02-24T10:32:30Z)
Variational Laplace Autoencoders [53.08170674326728]
Variational autoencoders employ an amortized inference model to approximate the posterior of latent variables. We present a novel approach that addresses the limited posterior expressiveness of fully-factorized Gaussian assumption. We also present a general framework named Variational Laplace Autoencoders (VLAEs) for training deep generative models.
arXiv Detail & Related papers (2022-11-30T18:59:27Z)
On the optimization and pruning for Bayesian deep learning [1.0152838128195467]
We propose a new adaptive variational Bayesian algorithm to train neural networks on weight space. The EM-MCMC algorithm allows us to perform optimization and model pruning within one-shot. Our dense model can reach the state-of-the-art performance and our sparse model perform very well compared to previously proposed pruning schemes.
arXiv Detail & Related papers (2022-10-24T05:18:08Z)
The split Gibbs sampler revisited: improvements to its algorithmic structure and augmented target distribution [1.1279808969568252]
Current state-of-the-art methods often address these difficulties by replacing the posterior density with a smooth approximation. An alternative approach is based on data augmentation and relaxation, where auxiliary variables are introduced in order to construct an approximate augmented posterior distribution. This paper proposes a new accelerated proximal MCMC method called latent space SK-ROCK, which tightly combines the benefits of the two strategies.
arXiv Detail & Related papers (2022-06-28T11:21:41Z)
Langevin Monte Carlo for Contextual Bandits [72.00524614312002]
Langevin Monte Carlo Thompson Sampling (LMC-TS) is proposed to directly sample from the posterior distribution in contextual bandits. We prove that the proposed algorithm achieves the same sublinear regret bound as the best Thompson sampling algorithms for a special case of contextual bandits.
arXiv Detail & Related papers (2022-06-22T17:58:23Z)
What Are Bayesian Neural Network Posteriors Really Like? [63.950151520585024]
We show that Hamiltonian Monte Carlo can achieve significant performance gains over standard and deep ensembles. We also show that deep distributions are similarly close to HMC as standard SGLD, and closer than standard variational inference.
arXiv Detail & Related papers (2021-04-29T15:38:46Z)
Sampling in Combinatorial Spaces with SurVAE Flow Augmented MCMC [83.48593305367523]
Hybrid Monte Carlo is a powerful Markov Chain Monte Carlo method for sampling from complex continuous distributions. We introduce a new approach based on augmenting Monte Carlo methods with SurVAE Flows to sample from discrete distributions. We demonstrate the efficacy of our algorithm on a range of examples from statistics, computational physics and machine learning, and observe improvements compared to alternative algorithms.
arXiv Detail & Related papers (2021-02-04T02:21:08Z)

This list is automatically generated from the titles and abstracts of the papers in this site.