A Non-asymptotic Analysis for Learning and Applying a Preconditioner in MCMC
- URL: http://arxiv.org/abs/2602.10714v1
- Date: Wed, 11 Feb 2026 10:19:56 GMT
- Title: A Non-asymptotic Analysis for Learning and Applying a Preconditioner in MCMC
- Authors: Max Hird, Florian Maire, Jeffrey Negrea
- Abstract summary: We analyse and compare the finite-time computational costs of schemes which learn a preconditioner. We establish non-asymptotic guarantees on the time taken to collect $N$ approximately independent samples from the target for schemes that learn their preconditioners.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Preconditioning is a common method applied to modify Markov chain Monte Carlo algorithms with the goal of making them more efficient. In practice it is often extremely effective, even when the preconditioner is learned from the chain. We analyse and compare the finite-time computational costs of schemes which learn a preconditioner based on the target covariance or the expected Hessian of the target potential with that of a corresponding scheme that does not use preconditioning. We apply our results to the Unadjusted Langevin Algorithm (ULA) for an appropriately regular target, establishing non-asymptotic guarantees for preconditioned ULA which learns its preconditioner. Our results are also applied to the unadjusted underdamped Langevin algorithm in the supplementary material. To do so, we establish non-asymptotic guarantees on the time taken to collect $N$ approximately independent samples from the target for schemes that learn their preconditioners under the assumption that the underlying Markov chain satisfies a contraction condition in the Wasserstein-2 distance. This approximate independence condition, that we formalize, allows us to bridge the non-asymptotic bounds of modern MCMC theory and classical heuristics of effective sample size and mixing time, and is needed to amortise the costs of learning a preconditioner across the many samples it will be used to produce.
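The "learn then apply" idea described in the abstract can be illustrated with a minimal sketch (under simplifying assumptions, and not the paper's exact scheme or its guarantees): run a pilot ULA chain, estimate the target covariance from its samples, and then use that estimate as a fixed preconditioner in a second run. All names and settings below are hypothetical.

```python
import numpy as np

def preconditioned_ula(grad_potential, x0, n_steps, step_size, rng, precond=None):
    """Unadjusted Langevin Algorithm with an optional fixed preconditioner M.

    Update (a standard preconditioned form; notation is illustrative):
        x_{k+1} = x_k - h * M @ grad U(x_k) + sqrt(2h) * M^{1/2} @ xi_k,
    where U is the target potential and xi_k ~ N(0, I).
    """
    d = len(x0)
    M = np.eye(d) if precond is None else precond
    M_sqrt = np.linalg.cholesky(M)            # M^{1/2} via a Cholesky factor
    x = np.array(x0, dtype=float)
    samples = np.empty((n_steps, d))
    for k in range(n_steps):
        noise = rng.standard_normal(d)
        x = x - step_size * M @ grad_potential(x) \
              + np.sqrt(2 * step_size) * M_sqrt @ noise
        samples[k] = x
    return samples

# Learn the preconditioner from a pilot run's empirical covariance,
# then re-run with it (one simple "learn then apply" scheme).
rng = np.random.default_rng(0)
Sigma = np.array([[4.0, 0.0], [0.0, 0.25]])   # ill-conditioned Gaussian target
Sigma_inv = np.linalg.inv(Sigma)
grad_U = lambda x: Sigma_inv @ x              # U(x) = x^T Sigma^{-1} x / 2

pilot = preconditioned_ula(grad_U, np.zeros(2), 5000, 0.05, rng)
M_hat = np.cov(pilot.T)                       # estimated target covariance
samples = preconditioned_ula(grad_U, np.zeros(2), 5000, 0.05, rng, precond=M_hat)
```

With `M_hat` close to the target covariance, the preconditioned drift is roughly isotropic, which is the intuition behind covariance-based preconditioning; the paper's contribution is quantifying when the cost of the pilot phase is amortised.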
Related papers
- Towards Anytime-Valid Statistical Watermarking [63.02116925616554]
We develop the first e-value-based watermarking framework, Anchored E-Watermarking, that unifies optimal sampling with anytime-valid inference. Our framework can significantly enhance sample efficiency, reducing the average token budget required for detection by 13-15% relative to state-of-the-art baselines.
arXiv Detail & Related papers (2026-02-19T18:32:26Z) - Stochastic Approximation with Unbounded Markovian Noise: A General-Purpose Theorem [4.769747792846004]
We consider average-reward Reinforcement Learning with unbounded state space and reward function. Recent works studied this problem in the actor-critic framework. We study Temporal Difference (TD) learning with linear function approximation.
arXiv Detail & Related papers (2024-10-29T03:40:53Z) - PROMISE: Preconditioned Stochastic Optimization Methods by Incorporating Scalable Curvature Estimates [17.777466668123886]
We introduce PROMISE ($\textbf{Pr}$econditioned $\textbf{O}$ptimization $\textbf{M}$ethods by $\textbf{I}$ncorporating $\textbf{S}$calable Curvature $\textbf{E}$stimates), a suite of sketching-based preconditioned gradient algorithms.
PROMISE includes preconditioned versions of SVRG, SAGA, and Katyusha.
arXiv Detail & Related papers (2023-09-05T07:49:10Z) - When Does Confidence-Based Cascade Deferral Suffice? [69.28314307469381]
Cascades are a classical strategy to enable inference cost to vary adaptively across samples.
A deferral rule determines whether to invoke the next classifier in the sequence, or to terminate prediction.
Despite being oblivious to the structure of the cascade, confidence-based deferral often works remarkably well in practice.
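The confidence-based deferral rule described above is simple enough to sketch directly (an illustrative implementation, not the paper's experimental setup; the two-model cascade and thresholds below are hypothetical):

```python
import numpy as np

def cascade_predict(models, x, thresholds):
    """Confidence-based deferral: run models cheapest-first and stop as soon
    as the current model's top class probability clears its threshold;
    otherwise defer to the next model in the cascade."""
    probs = None
    for model, tau in zip(models, thresholds):
        probs = model(x)
        if probs.max() >= tau:
            break                    # confident enough: terminate prediction here
    return int(probs.argmax()), probs

# Hypothetical two-model cascade: a cheap model and an expensive one.
cheap = lambda x: np.array([0.55, 0.45])        # low confidence
expensive = lambda x: np.array([0.10, 0.90])    # high confidence
label, probs = cascade_predict([cheap, expensive], x=None, thresholds=[0.8, 0.0])
# cheap's top probability (0.55) misses the 0.8 threshold, so the call defers
```

Note the rule inspects only the current model's output probabilities, never the downstream models, which is the sense in which it is "oblivious to the structure of the cascade."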
arXiv Detail & Related papers (2023-07-06T04:13:57Z) - Optimal Preconditioning and Fisher Adaptive Langevin Sampling [8.122270502556374]
We derive a computationally efficient adaptive MCMC scheme that learns the preconditioning from the history of gradients produced as the algorithm runs.
We show in several experiments that the proposed algorithm is very robust in high dimensions and significantly outperforms other methods.
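One way to realize the idea of learning a preconditioner from the chain's own gradient history is to maintain a running empirical-Fisher estimate and precondition with its regularized inverse. The sketch below illustrates that pattern only; it is not the authors' algorithm, and every name and setting in it is an assumption.

```python
import numpy as np

def fisher_adaptive_langevin(grad_log_target, x0, n_steps, step_size, rng,
                             damping=1e-3):
    """Illustrative sketch: precondition each Langevin step with the
    regularized inverse of a running empirical-Fisher estimate built from
    the gradients the chain itself produces."""
    d = len(x0)
    F = np.eye(d)        # identity pseudo-observation keeps F well-conditioned early
    x = np.array(x0, dtype=float)
    samples = np.empty((n_steps, d))
    for k in range(n_steps):
        g = grad_log_target(x)                          # = -grad of the potential
        F = (F * (k + 1) + np.outer(g, g)) / (k + 2)    # running average of g g^T
        M = np.linalg.inv(F + damping * np.eye(d))      # regularized inverse Fisher
        L = np.linalg.cholesky(M)                       # M^{1/2} for the noise term
        x = x + step_size * M @ g \
              + np.sqrt(2 * step_size) * L @ rng.standard_normal(d)
        samples[k] = x
    return samples

# Toy usage on a standard Gaussian target (hypothetical settings).
rng = np.random.default_rng(1)
chain = fisher_adaptive_langevin(lambda x: -x, np.zeros(2), 1000, 0.05, rng)
```

The recomputed inverse and Cholesky factor at every step are for clarity; a practical adaptive scheme would update these factors incrementally.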
arXiv Detail & Related papers (2023-05-23T18:07:44Z) - Improving Adaptive Conformal Prediction Using Self-Supervised Learning [72.2614468437919]
We train an auxiliary model with a self-supervised pretext task on top of an existing predictive model and use the self-supervised error as an additional feature to estimate nonconformity scores.
We empirically demonstrate the benefit of the additional information using both synthetic and real data on the efficiency (width), deficit, and excess of conformal prediction intervals.
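The general pattern of folding an auxiliary error estimate into a nonconformity score can be sketched with a normalized split-conformal procedure (an illustration of the pattern, not the paper's method; the data and names below are synthetic and hypothetical):

```python
import numpy as np

def split_conformal_interval(preds_cal, y_cal, aux_err_cal,
                             pred_test, aux_err_test, alpha=0.1):
    """Split-conformal sketch: normalize absolute residuals by an auxiliary
    error estimate so that interval widths adapt to the estimated
    difficulty of each input."""
    scores = np.abs(y_cal - preds_cal) / aux_err_cal      # normalized nonconformity
    n = len(scores)
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)  # finite-sample correction
    q = np.quantile(scores, level, method="higher")
    return pred_test - q * aux_err_test, pred_test + q * aux_err_test

# Toy calibration data (synthetic, for illustration only).
preds_cal = np.zeros(9)
y_cal = np.arange(1.0, 10.0)       # residuals 1..9
aux_err_cal = np.ones(9)           # a flat auxiliary error estimate
lo, hi = split_conformal_interval(preds_cal, y_cal, aux_err_cal,
                                  pred_test=0.0, aux_err_test=1.0)
```

Inputs with a larger auxiliary error estimate `aux_err_test` receive proportionally wider intervals, which is how the extra self-supervised signal would translate into adaptivity.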
arXiv Detail & Related papers (2023-02-23T18:57:14Z) - Reframed GES with a Neural Conditional Dependence Measure [20.47061693587848]
We revisit the Greedy Equivalence Search (GES) algorithm, which is widely cited as a score-based algorithm for learning the Markov equivalence class (MEC).
We present a reframing of the GES algorithm, which is more flexible than the standard score-based version.
We propose a neural conditional dependence measure, which utilizes the expressive power of deep neural networks.
arXiv Detail & Related papers (2022-06-17T03:29:08Z) - Learning to Estimate Without Bias [57.82628598276623]
The Gauss-Markov theorem states that the weighted least squares estimator is the linear minimum variance unbiased estimator (MVUE) in linear models.
In this paper, we take a first step towards extending this result to nonlinear settings via deep learning with bias constraints.
A second motivation for the bias-constrained estimator (BCE) is in applications where multiple estimates of the same unknown are averaged for improved performance.
arXiv Detail & Related papers (2021-10-24T10:23:51Z) - Minimally Entangled Typical Thermal States Algorithms for Finite Temperature Matsubara Green Functions [0.0]
We extend finite-temperature tensor network methods to compute Matsubara imaginary-time correlation functions.
As a benchmark, we study the single-band Anderson impurity model.
Results are competitive with state-of-the-art continuous time Monte Carlo.
arXiv Detail & Related papers (2021-07-29T13:02:25Z) - Amortized Conditional Normalized Maximum Likelihood: Reliable Out of Distribution Uncertainty Estimation [99.92568326314667]
We propose the amortized conditional normalized maximum likelihood (ACNML) method as a scalable general-purpose approach for uncertainty estimation.
Our algorithm builds on the conditional normalized maximum likelihood (CNML) coding scheme, which has minimax optimal properties according to the minimum description length principle.
We demonstrate that ACNML compares favorably to a number of prior techniques for uncertainty estimation in terms of calibration on out-of-distribution inputs.
arXiv Detail & Related papers (2020-11-05T08:04:34Z) - Autoregressive Score Matching [113.4502004812927]
We propose autoregressive conditional score models (AR-CSM), where we parameterize the joint distribution in terms of the derivatives of univariate log-conditionals (scores).
For AR-CSM models, the divergence between data and model distributions can be computed and optimized efficiently, requiring no expensive sampling or adversarial training.
We show with extensive experimental results that it can be applied to density estimation on synthetic data, image generation, image denoising, and training latent variable models with implicit encoders.
arXiv Detail & Related papers (2020-10-24T07:01:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.