Nested Variational Inference
- URL: http://arxiv.org/abs/2106.11302v1
- Date: Mon, 21 Jun 2021 17:56:59 GMT
- Title: Nested Variational Inference
- Authors: Heiko Zimmermann, Hao Wu, Babak Esmaeili, Jan-Willem van de Meent
- Abstract summary: We develop a family of methods that learn proposals for nested importance samplers by minimizing a KL divergence at each level of nesting.
We observe that optimizing nested objectives leads to improved sample quality in terms of log average weight and effective sample size.
- Score: 8.610608901689577
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We develop nested variational inference (NVI), a family of methods that learn
proposals for nested importance samplers by minimizing an forward or reverse KL
divergence at each level of nesting. NVI is applicable to many commonly-used
importance sampling strategies and provides a mechanism for learning
intermediate densities, which can serve as heuristics to guide the sampler. Our
experiments apply NVI to (a) sample from a multimodal distribution using a
learned annealing path (b) learn heuristics that approximate the likelihood of
future observations in a hidden Markov model and (c) to perform amortized
inference in hierarchical deep generative models. We observe that optimizing
nested objectives leads to improved sample quality in terms of log average
weight and effective sample size.
Related papers
- Learning from Different Samples: A Source-free Framework for Semi-supervised Domain Adaptation [20.172605920901777]
This paper focuses on designing a framework to use different strategies for comprehensively mining different target samples.
We propose a novel source-free framework (SOUF) to achieve semi-supervised fine-tuning of the source pre-trained model on the target domain.
arXiv Detail & Related papers (2024-11-11T02:09:32Z) - Variational Learning of Gaussian Process Latent Variable Models through Stochastic Gradient Annealed Importance Sampling [22.256068524699472]
In this work, we propose an Annealed Importance Sampling (AIS) approach to address these issues.
We combine the strengths of Sequential Monte Carlo samplers and VI to explore a wider range of posterior distributions and gradually approach the target distribution.
Experimental results on both toy and image datasets demonstrate that our method outperforms state-of-the-art methods in terms of tighter variational bounds, higher log-likelihoods, and more robust convergence.
arXiv Detail & Related papers (2024-08-13T08:09:05Z) - Self-Guided Generation of Minority Samples Using Diffusion Models [57.319845580050924]
We present a novel approach for generating minority samples that live on low-density regions of a data manifold.
Our framework is built upon diffusion models, leveraging the principle of guided sampling.
Experiments on benchmark real datasets demonstrate that our approach can greatly improve the capability of creating realistic low-likelihood minority instances.
arXiv Detail & Related papers (2024-07-16T10:03:29Z) - Entropy-based Training Methods for Scalable Neural Implicit Sampler [15.978655106034113]
Efficiently sampling from un-normalized target distributions is a fundamental problem in scientific computing and machine learning.
In this paper, we propose an efficient and scalable neural implicit sampler that overcomes these limitations.
Our sampler can generate large batches of samples with low computational costs by leveraging a neural transformation that directly maps easily sampled latent vectors to target samples.
arXiv Detail & Related papers (2023-06-08T05:56:05Z) - Variational Classification [51.2541371924591]
We derive a variational objective to train the model, analogous to the evidence lower bound (ELBO) used to train variational auto-encoders.
Treating inputs to the softmax layer as samples of a latent variable, our abstracted perspective reveals a potential inconsistency.
We induce a chosen latent distribution, instead of the implicit assumption found in a standard softmax layer.
arXiv Detail & Related papers (2023-05-17T17:47:19Z) - POODLE: Improving Few-shot Learning via Penalizing Out-of-Distribution
Samples [19.311470287767385]
We propose to use out-of-distribution samples, i.e., unlabeled samples coming from outside the target classes, to improve few-shot learning.
Our approach is simple to implement, agnostic to feature extractors, lightweight without any additional cost for pre-training, and applicable to both inductive and transductive settings.
arXiv Detail & Related papers (2022-06-08T18:59:21Z) - Regularizing Variational Autoencoder with Diversity and Uncertainty
Awareness [61.827054365139645]
Variational Autoencoder (VAE) approximates the posterior of latent variables based on amortized variational inference.
We propose an alternative model, DU-VAE, for learning a more Diverse and less Uncertain latent space.
arXiv Detail & Related papers (2021-10-24T07:58:13Z) - Jo-SRC: A Contrastive Approach for Combating Noisy Labels [58.867237220886885]
We propose a noise-robust approach named Jo-SRC (Joint Sample Selection and Model Regularization based on Consistency)
Specifically, we train the network in a contrastive learning manner. Predictions from two different views of each sample are used to estimate its "likelihood" of being clean or out-of-distribution.
arXiv Detail & Related papers (2021-03-24T07:26:07Z) - Oops I Took A Gradient: Scalable Sampling for Discrete Distributions [53.3142984019796]
We show that this approach outperforms generic samplers in a number of difficult settings.
We also demonstrate the use of our improved sampler for training deep energy-based models on high dimensional discrete data.
arXiv Detail & Related papers (2021-02-08T20:08:50Z) - Relaxed-Responsibility Hierarchical Discrete VAEs [3.976291254896486]
We introduce textitRelaxed-Responsibility Vector-Quantisation, a novel way to parameterise discrete latent variables.
We achieve state-of-the-art bits-per-dim results for various standard datasets.
arXiv Detail & Related papers (2020-07-14T19:10:05Z) - Learning Diverse Representations for Fast Adaptation to Distribution
Shift [78.83747601814669]
We present a method for learning multiple models, incorporating an objective that pressures each to learn a distinct way to solve the task.
We demonstrate our framework's ability to facilitate rapid adaptation to distribution shift.
arXiv Detail & Related papers (2020-06-12T12:23:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.