Adaptive importance sampling for heavy-tailed distributions via
$\alpha$-divergence minimization
- URL: http://arxiv.org/abs/2310.16653v1
- Date: Wed, 25 Oct 2023 14:07:08 GMT
- Title: Adaptive importance sampling for heavy-tailed distributions via
$\alpha$-divergence minimization
- Authors: Thomas Guilmeau and Nicola Branchini and Emilie Chouzenoux and
Víctor Elvira
- Abstract summary: We propose an AIS algorithm that approximates the target by Student-t proposal distributions.
We adapt location and scale parameters by matching the escort moments of the target and the proposal.
These updates minimize the $\alpha$-divergence between the target and the proposal, thereby connecting with variational inference.
- Score: 2.879807093604632
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Adaptive importance sampling (AIS) algorithms are widely used to approximate
expectations with respect to complicated target probability distributions. When
the target has heavy tails, existing AIS algorithms can provide inconsistent
estimators or exhibit slow convergence, as they often neglect the target's tail
behaviour. To avoid this pitfall, we propose an AIS algorithm that approximates
the target by Student-t proposal distributions. We adapt location and scale
parameters by matching the escort moments - which are defined even for
heavy-tailed distributions - of the target and the proposal. These updates
minimize the $\alpha$-divergence between the target and the proposal, thereby
connecting with variational inference. We then show that the
$\alpha$-divergence can be approximated by a generalized notion of effective
sample size and leverage this new perspective to adapt the tail parameter with
Bayesian optimization. We demonstrate the efficacy of our approach through
applications to synthetic targets and a Bayesian Student-t regression task on a
real example with clinical trial data.
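To make the adaptation loop in the abstract concrete, here is a minimal numerical sketch. It is an illustration under simplifying assumptions, not the authors' implementation: the target is a one-dimensional heavy-tailed density known up to a constant, the proposal is a Student-t whose location and scale are updated from escort-style weights (self-normalized importance weights raised to the power $\alpha$), the tail parameter is held fixed rather than tuned by Bayesian optimization, and the effective-sample-size diagnostic is one simple variant of the generalized notion mentioned above.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Unnormalized heavy-tailed target: here a Student-t with 3 degrees of freedom,
# standing in for a target density known only up to a normalizing constant.
def log_target(x):
    return stats.t.logpdf(x, df=3.0, loc=4.0, scale=2.0)

# Student-t proposal parameters: location mu, scale sigma, tail (dof) nu.
mu, sigma, nu = 0.0, 1.0, 5.0
alpha = 2.0          # order of the alpha-divergence driving the escort weights
n_samples = 5_000

for it in range(15):
    # 1) Draw from the current Student-t proposal.
    x = stats.t.rvs(df=nu, loc=mu, scale=sigma, size=n_samples, random_state=rng)

    # 2) Log importance weights (target over proposal), kept in log-space.
    log_w = log_target(x) - stats.t.logpdf(x, df=nu, loc=mu, scale=sigma)

    # 3) Escort-style weights: raise the weights to the power alpha, renormalize.
    log_we = alpha * log_w
    we = np.exp(log_we - log_we.max())
    we /= we.sum()

    # 4) Moment matching: move the proposal's location and scale toward the
    #    escort-weighted mean and spread (tail-dependent correction factors
    #    from the paper are ignored in this sketch).
    mu = float(np.sum(we * x))
    sigma = float(np.sqrt(np.sum(we * (x - mu) ** 2)))

    # 5) A simple generalized effective sample size as adaptation diagnostic,
    #    acting as a proxy for the alpha-divergence between target and proposal.
    ess = 1.0 / np.sum(we ** 2)
    print(f"iter {it:2d}  mu={mu:6.3f}  sigma={sigma:6.3f}  ESS={ess:8.1f}")
```

In this kind of loop, a diagnostic that grows toward the number of samples signals a proposal whose tails and scale match the target; a collapsing value signals a poor choice and is what the paper's Bayesian-optimization step over the tail parameter is designed to correct.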
Related papers
- Semiparametric conformal prediction [79.6147286161434]
Risk-sensitive applications require well-calibrated prediction sets over multiple, potentially correlated target variables.
We treat the scores as random vectors and aim to construct the prediction set accounting for their joint correlation structure.
We report desired coverage and competitive efficiency on a range of real-world regression problems.
arXiv Detail & Related papers (2024-11-04T14:29:02Z)
- Rejection via Learning Density Ratios [50.91522897152437]
Classification with rejection emerges as a learning paradigm which allows models to abstain from making predictions.
We propose a different distributional perspective, where we seek to find an idealized data distribution which maximizes a pretrained model's performance.
Our framework is tested empirically over clean and noisy datasets.
arXiv Detail & Related papers (2024-05-29T01:32:17Z)
- Smoothing the Edges: Smooth Optimization for Sparse Regularization using Hadamard Overparametrization [10.009748368458409]
We present a framework for smooth optimization of explicitly regularized objectives for (structured) sparsity.
Our method enables fully differentiable, approximation-free optimization and is thus compatible with the ubiquitous gradient descent paradigm in deep learning (a minimal sketch of the Hadamard reparametrization appears after this list).
arXiv Detail & Related papers (2023-07-07T13:06:12Z)
- Optimization of Annealed Importance Sampling Hyperparameters [77.34726150561087]
Annealed Importance Sampling (AIS) is a popular algorithm used to estimate the intractable marginal likelihood of deep generative models.
We present a parametric AIS process with flexible intermediary distributions and optimize the bridging distributions to use fewer sampling steps.
We assess the performance of our optimized AIS for marginal likelihood estimation of deep generative models and compare it to other estimators (a generic AIS reference sketch appears after this list).
arXiv Detail & Related papers (2022-09-27T07:58:25Z)
- Variational Refinement for Importance Sampling Using the Forward Kullback-Leibler Divergence [77.06203118175335]
Variational Inference (VI) is a popular alternative to exact sampling in Bayesian inference.
Importance sampling (IS) is often used to fine-tune and de-bias the estimates of approximate Bayesian inference procedures.
We propose a novel combination of optimization and sampling techniques for approximate Bayesian inference.
arXiv Detail & Related papers (2021-06-30T11:00:24Z)
- Local policy search with Bayesian optimization [73.0364959221845]
Reinforcement learning aims to find an optimal policy by interaction with an environment.
Policy gradients for local search are often obtained from random perturbations.
We develop an algorithm utilizing a probabilistic model of the objective function and its gradient.
arXiv Detail & Related papers (2021-06-22T16:07:02Z)
- KL Guided Domain Adaptation [88.19298405363452]
Domain adaptation is an important problem and often needed for real-world applications.
A common approach in the domain adaptation literature is to learn a representation of the input that has the same distributions over the source and the target domain.
We show that with a probabilistic representation network, the KL term can be estimated efficiently via minibatch samples.
arXiv Detail & Related papers (2021-06-14T22:24:23Z)
- Sequential Domain Adaptation by Synthesizing Distributionally Robust Experts [14.656957226255628]
Supervised domain adaptation aims to improve the predictive accuracy by exploiting additional labeled training samples from a source distribution close to the target distribution.
We use the Bernstein online aggregation algorithm on the proposed family of robust experts to generate predictions for the sequential stream of target samples.
arXiv Detail & Related papers (2021-06-01T08:51:55Z)
- Amortized variance reduction for doubly stochastic objectives [17.064916635597417]
Approximate inference in complex probabilistic models requires the optimisation of doubly stochastic objective functions.
Current approaches do not take into account how mini-batch stochasticity interacts with sampling stochasticity, resulting in sub-optimal variance reduction.
We propose a new approach in which we use a recognition network to cheaply approximate the optimal control variate for each mini-batch, with no additional gradient computations (a bare-bones control-variate example appears after this list).
arXiv Detail & Related papers (2020-03-09T13:23:14Z)
- Scalable Approximate Inference and Some Applications [2.6541211006790983]
In this thesis, we propose a new framework for approximate inference.
The four algorithms we propose are motivated by recent computational progress on Stein's method.
Results on simulated and real datasets indicate the statistical efficiency and wide applicability of our algorithm.
arXiv Detail & Related papers (2020-03-07T04:33:27Z)
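The Hadamard-overparametrization entry above relies on a classical reparametrization: writing a weight vector as an elementwise product w = u * v turns a non-smooth L1 penalty on w into a smooth squared-L2 penalty on (u, v), so plain gradient descent can be used without proximal or thresholding steps. The snippet below is a generic illustration of that trick on a lasso-style regression problem; it is not the paper's framework, and the problem sizes and step sizes are arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic sparse regression: y = X @ w_true + noise, with 5 active features.
n, d = 200, 50
X = rng.normal(size=(n, d))
w_true = np.zeros(d)
w_true[:5] = 3.0 * rng.normal(size=5)
y = X @ w_true + 0.1 * rng.normal(size=n)

lam = 0.5     # strength of the would-be L1 penalty on w
lr = 0.01     # plain gradient-descent step size

# Hadamard overparametrization: w = u * v. At a minimizer |u_i| = |v_i|, so
# lam/2 * (||u||^2 + ||v||^2) plays the role of lam * ||w||_1, but is smooth.
u = 0.1 * rng.normal(size=d)
v = 0.1 * rng.normal(size=d)

for step in range(5_000):
    w = u * v
    grad_w = X.T @ (X @ w - y) / n       # gradient of the smooth data-fit term
    grad_u = grad_w * v + lam * u        # chain rule through w = u * v
    grad_v = grad_w * u + lam * v
    u -= lr * grad_u
    v -= lr * grad_v

w_hat = u * v
print("largest recovered coefficients:", np.round(np.sort(np.abs(w_hat))[-5:], 2))
print("largest true coefficients     :", np.round(np.sort(np.abs(w_true))[-5:], 2))
```

Inactive coordinates decay to zero under the smooth penalty, so the sparsity pattern emerges from ordinary gradient steps rather than from a soft-thresholding operator.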
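The annealed-importance-sampling entry above refers to the standard AIS estimator of a marginal likelihood. For reference, here is a minimal, generic AIS implementation with a geometric bridging path and random-walk Metropolis transitions on a toy conjugate model where the exact answer is available; the evenly spaced temperature schedule used here is exactly the kind of fixed choice that the cited paper replaces with optimized bridging distributions. The code is illustrative and not taken from that paper.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Toy model: prior N(0, 1), likelihood N(y | x, 0.5^2), observation y = 1.5.
# The exact marginal likelihood is N(y | 0, 1 + 0.25), so the estimate can be checked.
y = 1.5

def log_prior(x):
    return stats.norm.logpdf(x, loc=0.0, scale=1.0)

def log_lik(x):
    return stats.norm.logpdf(y, loc=x, scale=0.5)

n_chains, n_steps = 2_000, 50
betas = np.linspace(0.0, 1.0, n_steps + 1)   # geometric bridging temperatures

x = rng.normal(size=n_chains)                # draws from the prior (beta = 0)
log_w = np.zeros(n_chains)                   # accumulated log importance weights

for k in range(1, n_steps + 1):
    # Weight update: ratio of successive bridging densities at the current points.
    log_w += (betas[k] - betas[k - 1]) * log_lik(x)

    # One random-walk Metropolis step targeting pi_k(x) ~ prior(x) * lik(x)**beta_k.
    prop = x + 0.5 * rng.normal(size=n_chains)
    log_acc = (log_prior(prop) + betas[k] * log_lik(prop)
               - log_prior(x) - betas[k] * log_lik(x))
    accept = np.log(rng.uniform(size=n_chains)) < log_acc
    x = np.where(accept, prop, x)

# Log marginal likelihood estimate: log-mean-exp of the accumulated weights.
log_Z = np.logaddexp.reduce(log_w) - np.log(n_chains)
print("AIS estimate:", log_Z, " exact:", stats.norm.logpdf(y, 0.0, np.sqrt(1.25)))
```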
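Finally, the amortized-variance-reduction entry above builds on the classical control-variate identity. The toy example below shows only that underlying mechanism, with the optimal coefficient estimated from the samples themselves; the paper's contribution, a recognition network that predicts this correction per mini-batch, is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(3)

# Monte Carlo estimate of E[f(Z)] with Z ~ N(mu, 1) and f(z) = z**2 + sin(z).
# Control variate: c(Z) = Z, whose expectation E[Z] = mu is known exactly.
mu = 2.0
n = 200

z = mu + rng.normal(size=n)
f = z**2 + np.sin(z)
c = z

# Optimal scaling a* = Cov(f, c) / Var(c), estimated from the same samples.
a = np.cov(f, c)[0, 1] / np.var(c, ddof=1)

plain = f.mean()
controlled = (f - a * (c - mu)).mean()   # unbiased, since E[c - mu] = 0

print("plain estimate      :", plain)
print("control-variate est.:", controlled)
print("variance reduction  :", 1 - np.var(f - a * c, ddof=1) / np.var(f, ddof=1))
```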