AdaAnn: Adaptive Annealing Scheduler for Probability Density
Approximation
- URL: http://arxiv.org/abs/2202.00792v1
- Date: Tue, 1 Feb 2022 22:26:18 GMT
- Title: AdaAnn: Adaptive Annealing Scheduler for Probability Density
Approximation
- Authors: Emma R. Cobian, Jonathan D. Hauenstein, Fang Liu and Daniele E.
Schiavazzi
- Abstract summary: Annealing can be used to facilitate the approximation of probability distributions supported over regions of high geometrical complexity.
We introduce AdaAnn, an adaptive scheduler that automatically adjusts the temperature increments based on the expected change in the Kullback-Leibler divergence.
AdaAnn is easy to implement and can be integrated into existing sampling approaches such as normalizing flows for variational inference and Markov chain Monte Carlo.
- Score: 3.1370892256881255
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Approximating probability distributions can be a challenging task,
particularly when they are supported over regions of high geometrical
complexity or exhibit multiple modes. Annealing can be used to facilitate this
task and is often combined with constant, a priori selected increments in
inverse temperature. However, constant increments limit the computational
efficiency due to the inability to adapt to situations where smooth changes in
the annealed density could be handled equally well with larger increments. We
introduce AdaAnn, an adaptive annealing scheduler that automatically adjusts
the temperature increments based on the expected change in the Kullback-Leibler
divergence between two distributions with a sufficiently close annealing
temperature. AdaAnn is easy to implement and can be integrated into existing
sampling approaches such as normalizing flows for variational inference and
Markov chain Monte Carlo. We demonstrate the computational efficiency of the
AdaAnn scheduler for variational inference with normalizing flows on a number
of examples, including density approximation and parameter estimation for
dynamical systems.
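
A minimal sketch of how such a scheduler can work, assuming the annealed family $p_t(x) \propto p(x)^t$: a second-order expansion of the KL divergence between two nearby temperatures gives $\mathrm{KL}(p_t \| p_{t+\epsilon}) \approx \frac{\epsilon^2}{2} \mathrm{Var}_{p_t}[\log p(x)]$, so holding the expected KL change at a tolerance $\tau$ suggests the increment $\epsilon = \sqrt{2\tau} / \mathrm{std}_{p_t}[\log p(x)]$. The code below illustrates this rule; the bimodal toy target, the random-walk Metropolis sampler, and the tolerance value are illustrative choices, not the paper's reference implementation or experimental setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_p(x):
    """Unnormalized log-density of a bimodal toy target (two unit-variance Gaussians)."""
    return np.logaddexp(-0.5 * (x - 3.0) ** 2, -0.5 * (x + 3.0) ** 2)

def sample_tempered(t, n=2000, step=1.0, x0=0.0):
    """Random-walk Metropolis targeting the tempered density p_t(x) ~ p(x)^t."""
    x, out = x0, np.empty(n)
    for i in range(n):
        prop = x + step * rng.standard_normal()
        # Accept with probability min(1, [p(prop)/p(x)]^t).
        if np.log(rng.random()) < t * (log_p(prop) - log_p(x)):
            x = prop
        out[i] = x
    return out

def adaptive_increment(log_p_vals, tau):
    """KL-based increment: eps = sqrt(2*tau) / std(log p) under the current p_t."""
    return np.sqrt(2.0 * tau) / max(np.std(log_p_vals), 1e-12)

# Anneal from a small initial inverse temperature up to t = 1 (the target density).
t, tau = 0.01, 0.005
schedule = [t]
while t < 1.0:
    samples = sample_tempered(t)
    t = min(1.0, t + adaptive_increment(log_p(samples), tau))
    schedule.append(t)

print(f"{len(schedule) - 1} adaptive steps; first increments:",
      np.round(np.diff(schedule[:6]), 4))
```

The increment is self-adjusting: where $\log p(x)$ varies little under the current tempered distribution, its variance is small and the scheduler takes large temperature steps; where the annealed density changes rapidly with $t$, the variance grows and the steps shrink, which is the behavior the abstract motivates relative to constant increments.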
Related papers
- Asymptotically Optimal Change Detection for Unnormalized Pre- and Post-Change Distributions [65.38208224389027]
This paper addresses the problem of detecting changes when only unnormalized pre- and post-change distributions are accessible.
Our approach is based on estimating the cumulative sum (CUSUM) statistic, which is known to produce optimal performance.
arXiv Detail & Related papers (2024-10-18T17:13:29Z)
- Robust scalable initialization for Bayesian variational inference with
multi-modal Laplace approximations [0.0]
Variational mixtures with full-covariance structures suffer from quadratic growth in the number of variational parameters as the number of model parameters increases.
We propose a method for constructing an initial Gaussian model approximation that can be used to warm-start variational inference.
arXiv Detail & Related papers (2023-07-12T19:30:04Z)
- Adaptive Annealed Importance Sampling with Constant Rate Progress [68.8204255655161]
Annealed Importance Sampling (AIS) synthesizes weighted samples from an intractable distribution.
We propose the Constant Rate AIS algorithm and its efficient implementation for $\alpha$-divergences.
arXiv Detail & Related papers (2023-06-27T08:15:28Z)
- Differentiating Metropolis-Hastings to Optimize Intractable Densities [51.16801956665228]
We develop an algorithm for automatic differentiation of Metropolis-Hastings samplers.
We apply gradient-based optimization to objectives expressed as expectations over intractable target densities.
arXiv Detail & Related papers (2023-06-13T17:56:02Z)
- Numerically Stable Sparse Gaussian Processes via Minimum Separation
using Cover Trees [57.67528738886731]
We study the numerical stability of scalable sparse approximations based on inducing points.
For low-dimensional tasks such as geospatial modeling, we propose an automated method for computing inducing points satisfying these conditions.
arXiv Detail & Related papers (2022-10-14T15:20:17Z)
- Efficient CDF Approximations for Normalizing Flows [64.60846767084877]
We build upon the diffeomorphic properties of normalizing flows to estimate the cumulative distribution function (CDF) over a closed region.
Our experiments on popular flow architectures and UCI datasets show a marked improvement in sample efficiency as compared to traditional estimators.
arXiv Detail & Related papers (2022-02-23T06:11:49Z)
- Renormalization group for open quantum systems using environment
temperature as flow parameter [0.0]
We present the $T$-flow renormalization group method, which computes the memory kernel for the density-operator evolution of an open quantum system.
We benchmark the method in the stationary limit, which is readily accessible in real time for voltages on the order of the coupling or larger.
We analytically show that the short-time dynamics of both local and non-local observables follow a universal temperature-independent behaviour.
arXiv Detail & Related papers (2021-11-14T11:52:27Z)
- Autoregressive Score Matching [113.4502004812927]
We propose autoregressive conditional score models (AR-CSM), where we parameterize the joint distribution in terms of the derivatives of univariate log-conditionals (scores).
For AR-CSM models, this divergence between data and model distributions can be computed and optimized efficiently, requiring no expensive sampling or adversarial training.
We show with extensive experimental results that it can be applied to density estimation on synthetic data, image generation, image denoising, and training latent variable models with implicit encoders.
arXiv Detail & Related papers (2020-10-24T07:01:24Z)
- Online Stochastic Convex Optimization: Wasserstein Distance Variation [15.313864176694832]
We consider an online proximal-gradient method to track the minimizers of expectations of smooth convex functions.
We revisit the concepts of estimation error and tracking error, inspired by the systems and control literature.
We provide bounds for them under strong convexity, Lipschitzness of the gradient, and bounds on the probability distribution drift.
arXiv Detail & Related papers (2020-06-02T05:23:22Z)
- Amortized variance reduction for doubly stochastic objectives [17.064916635597417]
Approximate inference in complex probabilistic models requires optimisation of doubly stochastic objective functions.
Current approaches do not take into account how mini-batching affects sampling, resulting in sub-optimal variance reduction.
We propose a new approach in which we use a recognition network to cheaply approximate the optimal control variate for each mini-batch, with no additional gradient computations.
arXiv Detail & Related papers (2020-03-09T13:23:14Z)