Scaling Bayesian inference of mixed multinomial logit models to very
large datasets
- URL: http://arxiv.org/abs/2004.05426v1
- Date: Sat, 11 Apr 2020 15:30:47 GMT
- Title: Scaling Bayesian inference of mixed multinomial logit models to very
large datasets
- Authors: Filipe Rodrigues
- Abstract summary: We propose an Amortized Variational Inference approach that leverages backpropagation, automatic differentiation and GPU-accelerated computation.
We show how normalizing flows can be used to increase the flexibility of the variational posterior approximations.
- Score: 9.442139459221785
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Variational inference methods have been shown to lead to significant
improvements in the computational efficiency of approximate Bayesian inference
in mixed multinomial logit models when compared to standard Markov-chain Monte
Carlo (MCMC) methods without compromising accuracy. However, despite their
demonstrated efficiency gains, existing methods still suffer from important
limitations that prevent them from scaling to very large datasets, while providing
the flexibility to allow for rich prior distributions and to capture complex
posterior distributions. In this paper, we propose an Amortized Variational
Inference approach that leverages stochastic backpropagation, automatic
differentiation and GPU-accelerated computation, for effectively scaling
Bayesian inference in Mixed Multinomial Logit models to very large datasets.
Moreover, we show how normalizing flows can be used to increase the flexibility
of the variational posterior approximations. Through an extensive simulation
study, we empirically show that the proposed approach is able to achieve
computational speedups of multiple orders of magnitude over traditional MSLE
and MCMC approaches for large datasets without compromising estimation
accuracy.
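As an illustration of the reparameterized (stochastic backpropagation) objective the abstract refers to, the sketch below estimates the ELBO by Monte Carlo. It is deliberately simplified and is not the paper's model: it uses a plain multinomial logit with one shared coefficient vector, a diagonal Gaussian variational posterior, and an isotropic Gaussian prior. It has no individual-specific parameters, no amortization network, and no normalizing flow. All names (`log_softmax`, `mnl_log_lik`, `elbo_estimate`, `prior_scale`) are hypothetical and chosen only for this example.

```python
import numpy as np


def log_softmax(u):
    """Numerically stable log-softmax over the last axis."""
    u = u - u.max(axis=-1, keepdims=True)
    return u - np.log(np.exp(u).sum(axis=-1, keepdims=True))


def mnl_log_lik(beta, X, y):
    """Log-likelihood of a plain multinomial logit (assumed toy model).

    X: (N, J, K) attributes of J alternatives for N choice situations.
    beta: (K,) taste coefficients shared by all decision makers.
    y: (N,) index of the chosen alternative.
    """
    util = X @ beta                      # (N, J) systematic utilities
    logp = log_softmax(util)             # (N, J) log choice probabilities
    return logp[np.arange(len(y)), y].sum()


def elbo_estimate(mu, log_sigma, X, y, n_samples=64, prior_scale=10.0, rng=None):
    """Monte Carlo ELBO with q(beta) = N(mu, diag(sigma^2)).

    The prior beta ~ N(0, prior_scale^2 I) is an assumption of this sketch.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    sigma = np.exp(log_sigma)
    eps = rng.standard_normal((n_samples, mu.size))
    betas = mu + sigma * eps             # reparameterization trick
    k = mu.size
    total = 0.0
    for b in betas:
        log_prior = (-0.5 * np.sum((b / prior_scale) ** 2)
                     - k * np.log(prior_scale * np.sqrt(2.0 * np.pi)))
        log_q = (-0.5 * np.sum(((b - mu) / sigma) ** 2)
                 - np.sum(log_sigma) - 0.5 * k * np.log(2.0 * np.pi))
        total += mnl_log_lik(b, X, y) + log_prior - log_q
    return total / n_samples
```

In an automatic-differentiation framework, the same expression would be differentiated with respect to `mu` and `log_sigma` and maximized with stochastic gradient ascent, optionally on mini-batches of the data; the NumPy version here only evaluates the objective.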
Related papers
- Computation-Aware Gaussian Processes: Model Selection And Linear-Time Inference [55.150117654242706]
We show that model selection for computation-aware GPs trained on 1.8 million data points can be done within a few hours on a single GPU.
As a result of this work, Gaussian processes can be trained on large-scale datasets without significantly compromising their ability to quantify uncertainty.
arXiv Detail & Related papers (2024-11-01T21:11:48Z)
- Gradient-free variational learning with conditional mixture networks [39.827869318925494]
Conditional mixture networks (CMNs) are suitable for fast, gradient-free inference and can solve complex classification tasks.
We validate this approach by training two-layer CMNs on standard benchmarks from the UCI repository.
Our method, CAVI-CMN, achieves competitive and often superior predictive accuracy compared to maximum likelihood estimation (MLE) with backpropagation.
arXiv Detail & Related papers (2024-08-29T10:43:55Z)
- Diffusion posterior sampling for simulation-based inference in tall data settings [53.17563688225137]
Simulation-based inference (SBI) is capable of approximating the posterior distribution that relates input parameters to a given observation.
In this work, we consider a tall data extension in which multiple observations are available to better infer the parameters of the model.
We compare our method to recently proposed competing approaches on various numerical experiments and demonstrate its superiority in terms of numerical stability and computational cost.
arXiv Detail & Related papers (2024-04-11T09:23:36Z)
- Online Variational Sequential Monte Carlo [49.97673761305336]
We build upon the variational sequential Monte Carlo (VSMC) method, which provides computationally efficient and accurate model parameter estimation and Bayesian latent-state inference.
Online VSMC is capable of performing efficiently, entirely on-the-fly, both parameter estimation and particle proposal adaptation.
arXiv Detail & Related papers (2023-12-19T21:45:38Z)
- Diffusion models for probabilistic programming [56.47577824219207]
Diffusion Model Variational Inference (DMVI) is a novel method for automated approximate inference in probabilistic programming languages (PPLs).
DMVI is easy to implement, allows hassle-free inference in PPLs without the drawbacks of, e.g., variational inference using normalizing flows, and does not make any constraints on the underlying neural network model.
arXiv Detail & Related papers (2023-11-01T12:17:05Z)
- Bayesian Pseudo-Coresets via Contrastive Divergence [5.479797073162603]
We introduce a novel approach for constructing pseudo-coresets by utilizing contrastive divergence.
It eliminates the need for approximations in the pseudo-coreset construction process.
We conduct extensive experiments on multiple datasets, demonstrating its superiority over existing BPC techniques.
arXiv Detail & Related papers (2023-03-20T17:13:50Z)
- Approximate Gibbs Sampler for Efficient Inference of Hierarchical Bayesian Models for Grouped Count Data [0.0]
This research develops an approximate Gibbs sampler (AGS) to efficiently learn the HBPRMs while maintaining the inference accuracy.
Numerical experiments using real and synthetic datasets with small and large counts demonstrate the superior performance of AGS.
arXiv Detail & Related papers (2022-11-28T21:00:55Z)
- $\beta$-Cores: Robust Large-Scale Bayesian Data Summarization in the
Presence of Outliers [14.918826474979587]
The quality of classic Bayesian inference depends critically on whether observations conform with the assumed data generating model.
We propose a variational inference method that, in a principled way, can simultaneously scale to large datasets.
We illustrate the applicability of our approach in diverse simulated and real datasets, and various statistical models.
arXiv Detail & Related papers (2020-08-31T13:47:12Z)
- Slice Sampling for General Completely Random Measures [74.24975039689893]
We present a novel Markov chain Monte Carlo algorithm for posterior inference that adaptively sets the truncation level using auxiliary slice variables.
The efficacy of the proposed algorithm is evaluated on several popular nonparametric models.
arXiv Detail & Related papers (2020-06-24T17:53:53Z)
- Stacking for Non-mixing Bayesian Computations: The Curse and Blessing of
Multimodal Posteriors [8.11978827493967]
We propose an approach using parallel runs of MCMC, variational, or mode-based inference to hit as many modes as possible.
We present theoretical consistency with an example where the stacked inference process approximates the true data.
We demonstrate practical implementation in several model families.
arXiv Detail & Related papers (2020-06-22T15:26:59Z)
- Efficient Ensemble Model Generation for Uncertainty Estimation with
Bayesian Approximation in Segmentation [74.06904875527556]
We propose a generic and efficient segmentation framework to construct ensemble segmentation models.
In the proposed method, ensemble models can be efficiently generated by using the layer selection method.
We also devise a new pixel-wise uncertainty loss, which improves the predictive performance.
arXiv Detail & Related papers (2020-05-21T16:08:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.