Forward $\chi^2$ Divergence Based Variational Importance Sampling
- URL: http://arxiv.org/abs/2311.02516v2
- Date: Fri, 2 Feb 2024 09:46:20 GMT
- Title: Forward $\chi^2$ Divergence Based Variational Importance Sampling
- Authors: Chengrui Li, Yule Wang, Weihan Li and Anqi Wu
- Abstract summary: We introduce a novel variational importance sampling (VIS) approach that directly estimates and maximizes the log-likelihood.
We apply VIS to various popular latent variable models, including mixture models, variational auto-encoders, and partially observable generalized linear models.
- Score: 2.841087763205822
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Maximizing the log-likelihood is a crucial aspect of learning latent variable
models, and variational inference (VI) stands as the commonly adopted method.
However, VI can encounter challenges in achieving a high log-likelihood when
dealing with complicated posterior distributions. In response to this
limitation, we introduce a novel variational importance sampling (VIS) approach
that directly estimates and maximizes the log-likelihood. VIS leverages the
optimal proposal distribution, achieved by minimizing the forward $\chi^2$
divergence, to enhance log-likelihood estimation. We apply VIS to various
popular latent variable models, including mixture models, variational
auto-encoders, and partially observable generalized linear models. Results
demonstrate that our approach consistently outperforms state-of-the-art
baselines, both in terms of log-likelihood and model parameter estimation.
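To make the approach concrete, below is a minimal, hypothetical sketch of the VIS recipe on a one-dimensional toy Gaussian latent variable model: the proposal is updated by minimizing a Monte Carlo surrogate of the forward $\chi^2$ divergence (a logsumexp of twice the log importance weights), and the model parameters are updated by maximizing the importance-sampling estimate of $\log p(x)$. The toy model and all names are illustrative assumptions, not the authors' code.

```python
import math
import torch

torch.manual_seed(0)

# Toy model: z ~ N(mu_z, 1), x | z ~ N(z, sigma_x^2); theta = (mu_z, sigma_x).
mu_z = torch.tensor(0.5, requires_grad=True)
log_sigma_x = torch.tensor(0.0, requires_grad=True)

# Proposal q_phi(z) = N(m, s^2); phi = (m, s).
m = torch.tensor(0.0, requires_grad=True)
log_s = torch.tensor(0.0, requires_grad=True)

opt_theta = torch.optim.Adam([mu_z, log_sigma_x], lr=1e-2)
opt_phi = torch.optim.Adam([m, log_s], lr=1e-2)

x = torch.tensor(1.2)   # a single observation
K = 256                 # number of importance samples

def log_joint(z):
    log_pz = torch.distributions.Normal(mu_z, 1.0).log_prob(z)
    log_px = torch.distributions.Normal(z, log_sigma_x.exp()).log_prob(x)
    return log_pz + log_px

for step in range(2000):
    # Proposal step: minimize log E_q[w^2], a stable Monte Carlo surrogate
    # for the forward chi^2 divergence from the posterior to q.
    q = torch.distributions.Normal(m, log_s.exp())
    z = q.rsample((K,))
    log_w = log_joint(z) - q.log_prob(z)
    chi2_loss = torch.logsumexp(2.0 * log_w, dim=0)
    opt_phi.zero_grad(); opt_theta.zero_grad()
    chi2_loss.backward()
    opt_phi.step()

    # Model step: maximize the importance-sampling estimate of log p(x).
    q = torch.distributions.Normal(m.detach(), log_s.detach().exp())
    z = q.sample((K,))
    log_w = log_joint(z) - q.log_prob(z)
    log_px_hat = torch.logsumexp(log_w, dim=0) - math.log(K)
    opt_theta.zero_grad()
    (-log_px_hat).backward()
    opt_theta.step()

print(f"final log p(x) estimate: {log_px_hat.item():.3f}")
```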
Related papers
- Variational Learning of Gaussian Process Latent Variable Models through Stochastic Gradient Annealed Importance Sampling [22.256068524699472]
In this work, we propose an Annealed Importance Sampling (AIS) approach to address the challenges variational inference faces with complex posterior distributions.
We combine the strengths of Sequential Monte Carlo samplers and VI to explore a wider range of posterior distributions and gradually approach the target distribution.
Experimental results on both toy and image datasets demonstrate that our method outperforms state-of-the-art methods in terms of tighter variational bounds, higher log-likelihoods, and more robust convergence.
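The entry above combines SMC samplers with VI; as a baseline illustration, here is a minimal sketch of plain annealed importance sampling with a geometric bridge $\pi_\beta \propto q^{1-\beta} p^\beta$ and one random-walk Metropolis move per temperature. The one-dimensional densities are toy placeholders, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(0)

log_q = lambda z: -0.5 * z**2 - 0.5 * np.log(2 * np.pi)  # N(0,1) proposal
log_p = lambda z: -0.5 * (z - 3.0)**2 / 0.25             # unnormalized target

betas = np.linspace(0.0, 1.0, 50)                        # annealing schedule
n = 1000                                                 # number of particles

z = rng.standard_normal(n)                               # z ~ q
log_w = np.zeros(n)

for b_prev, b in zip(betas[:-1], betas[1:]):
    # Incremental importance weight between consecutive bridge densities.
    log_w += (b - b_prev) * (log_p(z) - log_q(z))
    # One Metropolis step targeting pi_b keeps the particles well spread.
    log_pi = lambda u: (1 - b) * log_q(u) + b * log_p(u)
    prop = z + 0.5 * rng.standard_normal(n)
    accept = np.log(rng.random(n)) < log_pi(prop) - log_pi(z)
    z = np.where(accept, prop, z)

# Estimate of log(Z_p / Z_q); here log Z_q = 0, so this estimates log Z_p.
log_Z = np.logaddexp.reduce(log_w) - np.log(n)
print(f"AIS estimate of log Z: {log_Z:.3f}")             # true value ~ 0.226
```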
arXiv Detail & Related papers (2024-08-13T08:09:05Z)
- Rényi Neural Processes [14.11793373584558]
We propose Rényi Neural Processes (RNP) to ameliorate the impact of prior misspecification.
We scale the density ratio $\frac{p}{q}$ by a power of $(1-\alpha)$ in the divergence gradients with respect to the posterior.
Our experiments show consistent log-likelihood improvements over state-of-the-art NP family models.
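A toy illustration of the reweighting described above, assuming stand-in log density ratios: raising $p/q$ to the power $(1-\alpha)$ before normalizing interpolates between near-uniform ELBO-style weights and heavily concentrated importance weights.

```python
import numpy as np

rng = np.random.default_rng(1)
log_w = rng.normal(size=8)              # stand-in log density ratios log(p/q)

def renyi_weights(log_w, alpha):
    scaled = (1.0 - alpha) * log_w      # (p/q)^(1-alpha), in log space
    scaled -= scaled.max()              # stabilize before exponentiating
    w = np.exp(scaled)
    return w / w.sum()                  # normalized per-sample gradient weights

for alpha in (0.0, 0.5, 2.0):           # alpha -> 1 recovers uniform weights
    print(alpha, np.round(renyi_weights(log_w, alpha), 3))
```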
arXiv Detail & Related papers (2024-05-25T00:14:55Z)
- Latent Variable Representation for Reinforcement Learning [131.03944557979725]
It remains unclear theoretically and empirically how latent variable models may facilitate learning, planning, and exploration to improve the sample efficiency of model-based reinforcement learning.
We provide a representation view of latent variable models for state-action value functions, which enables both a tractable variational learning algorithm and an effective implementation of the optimism/pessimism principle.
In particular, we propose a computationally efficient planning algorithm with UCB exploration by incorporating kernel embeddings of latent variable models.
arXiv Detail & Related papers (2022-12-17T00:26:31Z)
- Lazy Estimation of Variable Importance for Large Neural Networks [22.95405462638975]
We propose a fast and flexible method for approximating the reduced model with important inferential guarantees.
We show that our method is fast and accurate under several data-generating regimes, and we demonstrate its real-world applicability on a seasonal climate forecasting example.
arXiv Detail & Related papers (2022-07-19T06:28:17Z)
- Generalised Gaussian Process Latent Variable Models (GPLVM) with Stochastic Variational Inference [9.468270453795409]
We study the doubly stochastic formulation of the Bayesian GPLVM model, which is amenable to minibatch training.
We show how this framework is compatible with different latent variable formulations and perform experiments to compare a suite of models.
We demonstrate how we can train in the presence of massively missing data and obtain high-fidelity reconstructions.
arXiv Detail & Related papers (2022-02-25T21:21:51Z)
- Time varying regression with hidden linear dynamics [74.9914602730208]
We revisit a model for time-varying linear regression that assumes the unknown parameters evolve according to a linear dynamical system.
Counterintuitively, we show that when the underlying dynamics are stable, the parameters of this model can be estimated from data by combining just two ordinary least squares estimates.
arXiv Detail & Related papers (2021-12-29T23:37:06Z)
- Regularizing Variational Autoencoder with Diversity and Uncertainty Awareness [61.827054365139645]
The Variational Autoencoder (VAE) approximates the posterior over latent variables via amortized variational inference.
We propose an alternative model, DU-VAE, for learning a more Diverse and less Uncertain latent space.
arXiv Detail & Related papers (2021-10-24T07:58:13Z)
- Improved Prediction and Network Estimation Using the Monotone Single Index Multi-variate Autoregressive Model [34.529641317832024]
We develop a semi-parametric approach based on the monotone single-index multi-variate autoregressive model (SIMAM).
We provide theoretical guarantees for dependent data and an alternating projected gradient descent algorithm.
We demonstrate superior performance on both simulated data and two real data examples.
arXiv Detail & Related papers (2021-06-28T12:32:29Z)
- Loss function based second-order Jensen inequality and its application to particle variational inference [112.58907653042317]
Particle variational inference (PVI) uses an ensemble of models as an empirical approximation for the posterior distribution.
PVI iteratively updates each model with a repulsion force to ensure the diversity of the optimized models.
We derive a novel generalization error bound and show that it can be reduced by enhancing the diversity of models.
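To illustrate the repulsion mechanism, here is a minimal particle update in the spirit of Stein variational gradient descent, a well-known particle VI method; the paper's exact update and its Jensen-inequality analysis differ. The one-dimensional Gaussian target, kernel bandwidth, and step size are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def grad_log_p(x):                   # score of the target N(2, 1)
    return -(x - 2.0)

def rbf(x, h=0.5):
    diff = x[:, None] - x[None, :]   # diff[j, i] = x_j - x_i
    k = np.exp(-diff**2 / (2 * h**2))
    gk = -diff / h**2 * k            # d k(x_j, x_i) / d x_j
    return k, gk

x = 0.1 * rng.standard_normal(20)    # particles, initialized near 0

for _ in range(500):
    k, gk = rbf(x)
    # Driving term (kernel-smoothed score) plus repulsion term per particle.
    phi = (k @ grad_log_p(x) + gk.sum(axis=0)) / len(x)
    x += 0.1 * phi

print(f"particle mean {x.mean():.2f}, std {x.std():.2f}")  # should approach (2, 1)
```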
arXiv Detail & Related papers (2021-06-09T12:13:51Z)
- Probabilistic Circuits for Variational Inference in Discrete Graphical Models [101.28528515775842]
Inference in discrete graphical models with variational methods is difficult.
Many sampling-based methods have been proposed for estimating the Evidence Lower Bound (ELBO).
We propose a new approach that leverages the tractability of probabilistic circuit models, such as Sum Product Networks (SPNs).
We show that selective-SPNs are suitable as an expressive variational distribution, and prove that when the log-density of the target model is a polynomial, the corresponding ELBO can be computed analytically.
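For contrast with the analytic computation above, here is a naive sampling-based ELBO estimate for a tiny discrete model: a three-variable Ising-style target with a mean-field Bernoulli variational distribution. The couplings and variational probabilities are made-up illustrative values.

```python
import numpy as np

rng = np.random.default_rng(0)

J = np.array([[0.0, 0.8, 0.0],       # pairwise couplings of the unnormalized
              [0.8, 0.0, -0.5],      # target: log p~(z) = z' J z / 2
              [0.0, -0.5, 0.0]])

theta = np.array([0.6, 0.5, 0.4])    # mean-field q: P(z_i = +1)

ones = rng.random((10000, 3)) < theta            # Bernoulli draws
z = np.where(ones, 1.0, -1.0)                    # spins in {-1, +1}
log_q = (ones * np.log(theta) + ~ones * np.log(1 - theta)).sum(axis=1)
log_pt = 0.5 * np.einsum('ni,ij,nj->n', z, J, z)

elbo = (log_pt - log_q).mean()                   # E_q[log p~(z) - log q(z)]
print(f"sampled ELBO estimate: {elbo:.3f}")      # lower-bounds log Z
```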
arXiv Detail & Related papers (2020-04-01T11:49:30Z)
- SUMO: Unbiased Estimation of Log Marginal Probability for Latent Variable Models [80.22609163316459]
We introduce an unbiased estimator of the log marginal likelihood and its gradients for latent variable models based on randomized truncation of infinite series.
We show that models trained using our estimator give better test-set likelihoods than a standard importance-sampling based approach for the same average computational cost.
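A minimal sketch of the randomized-truncation idea behind SUMO: Russian-roulette reweighting of the telescoping series of increasingly tight importance-weighted bounds gives an unbiased estimate of $\log \mathbb{E}[w]$. The lognormal toy weights (for which the true value is 0) stand in for $\log p(x,z) - \log q(z|x)$; the geometric truncation distribution is one convenient choice, not necessarily the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)

def iwae(log_w):
    # k-sample importance-weighted bound: log(mean of the first k weights).
    return np.logaddexp.reduce(log_w) - np.log(len(log_w))

def draw_log_w(n):
    # Toy stand-in for log p(x, z_i) - log q(z_i | x); true log E[w] = 0.
    return rng.normal(loc=-0.5, scale=1.0, size=n)

def sumo(r=0.6):
    K = rng.geometric(1 - r)          # random truncation; P(K >= k) = r**(k-1)
    log_w = draw_log_w(K + 1)
    est = iwae(log_w[:1])             # 1-sample bound
    for k in range(1, K + 1):
        delta = iwae(log_w[:k + 1]) - iwae(log_w[:k])
        est += delta / r ** (k - 1)   # reweight by inverse tail probability
    return est

# Noisy but unbiased: the average should land near the true value 0.
print(np.mean([sumo() for _ in range(20000)]))
```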
arXiv Detail & Related papers (2020-04-01T11:49:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.