Scalable Inference for Bayesian Multinomial Logistic-Normal Dynamic Linear Models
- URL: http://arxiv.org/abs/2410.05548v1
- Date: Mon, 7 Oct 2024 23:20:14 GMT
- Title: Scalable Inference for Bayesian Multinomial Logistic-Normal Dynamic Linear Models
- Authors: Manan Saxena, Tinghua Chen, Justin D. Silverman,
- Abstract summary: This article develops an efficient and accurate approach to posterior state estimation, called $textitFenrir$.
Our experiments suggest that Fenrir can be three orders of magnitude more efficient than Stan.
Our methods are made available to the community as a user-friendly software library written in C++ with an R interface.
- Score: 0.5735035463793009
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Many scientific fields collect longitudinal count compositional data. Each observation is a multivariate count vector, where the total counts are arbitrary, and the information lies in the relative frequency of the counts. Multiple authors have proposed Bayesian Multinomial Logistic-Normal Dynamic Linear Models (MLN-DLMs) as a flexible approach to modeling these data. However, adoption of these methods has been limited by computational challenges. This article develops an efficient and accurate approach to posterior state estimation, called $\textit{Fenrir}$. Our approach relies on a novel algorithm for MAP estimation and an accurate approximation to a key posterior marginal of the model. As there are no equivalent methods against which we can compare, we also develop an optimized Stan implementation of MLN-DLMs. Our experiments suggest that Fenrir can be three orders of magnitude more efficient than Stan and can even be incorporated into larger sampling schemes for joint inference of model hyperparameters. Our methods are made available to the community as a user-friendly software library written in C++ with an R interface.
Related papers
- Minimally Supervised Learning using Topological Projections in
Self-Organizing Maps [55.31182147885694]
We introduce a semi-supervised learning approach based on topological projections in self-organizing maps (SOMs)
Our proposed method first trains SOMs on unlabeled data and then a minimal number of available labeled data points are assigned to key best matching units (BMU)
Our results indicate that the proposed minimally supervised model significantly outperforms traditional regression techniques.
arXiv Detail & Related papers (2024-01-12T22:51:48Z) - Amortizing intractable inference in large language models [56.92471123778389]
We use amortized Bayesian inference to sample from intractable posterior distributions.
We empirically demonstrate that this distribution-matching paradigm of LLM fine-tuning can serve as an effective alternative to maximum-likelihood training.
As an important application, we interpret chain-of-thought reasoning as a latent variable modeling problem.
arXiv Detail & Related papers (2023-10-06T16:36:08Z) - Probabilistic Unrolling: Scalable, Inverse-Free Maximum Likelihood
Estimation for Latent Gaussian Models [69.22568644711113]
We introduce probabilistic unrolling, a method that combines Monte Carlo sampling with iterative linear solvers to circumvent matrix inversions.
Our theoretical analyses reveal that unrolling and backpropagation through the iterations of the solver can accelerate gradient estimation for maximum likelihood estimation.
In experiments on simulated and real data, we demonstrate that probabilistic unrolling learns latent Gaussian models up to an order of magnitude faster than gradient EM, with minimal losses in model performance.
arXiv Detail & Related papers (2023-06-05T21:08:34Z) - Multidimensional Item Response Theory in the Style of Collaborative
Filtering [0.8057006406834467]
This paper presents a machine learning approach to multidimensional item response theory (MIRT)
Inspired by collaborative filtering, we define a general class of models that includes many MIRT models.
We discuss the use of penalized joint maximum likelihood (JML) to estimate individual models and cross-validation to select the best performing model.
arXiv Detail & Related papers (2023-01-03T00:56:27Z) - Learning from aggregated data with a maximum entropy model [73.63512438583375]
We show how a new model, similar to a logistic regression, may be learned from aggregated data only by approximating the unobserved feature distribution with a maximum entropy hypothesis.
We present empirical evidence on several public datasets that the model learned this way can achieve performances comparable to those of a logistic model trained with the full unaggregated data.
arXiv Detail & Related papers (2022-10-05T09:17:27Z) - Robust leave-one-out cross-validation for high-dimensional Bayesian
models [0.0]
Leave-one-out cross-validation (LOO-CV) is a popular method for estimating out-of-sample predictive accuracy.
Here we propose and analyze a novel mixture estimator to compute LOO-CV criteria.
Our method retains the simplicity and computational convenience of classical approaches, while guaranteeing finite variance of the resulting estimators.
arXiv Detail & Related papers (2022-09-19T17:14:52Z) - Sparse high-dimensional linear regression with a partitioned empirical
Bayes ECM algorithm [62.997667081978825]
We propose a computationally efficient and powerful Bayesian approach for sparse high-dimensional linear regression.
Minimal prior assumptions on the parameters are used through the use of plug-in empirical Bayes estimates.
The proposed approach is implemented in the R package probe.
arXiv Detail & Related papers (2022-09-16T19:15:50Z) - Multitarget Tracking with Transformers [21.81266872964314]
Multitarget Tracking (MTT) is a problem of tracking the states of an unknown number of objects using noisy measurements.
In this paper, we propose a high-performing deep-learning method for MTT based on the Transformer architecture.
arXiv Detail & Related papers (2021-04-01T19:14:55Z) - Fast covariance parameter estimation of spatial Gaussian process models
using neural networks [0.0]
We train NNs to take moderate size spatial fields or variograms as input and return the range and noise-to-signal covariance parameters.
Once trained, the NNs provide estimates with a similar accuracy compared to ML estimation and at a speedup by a factor of 100 or more.
This work can be easily extended to other, more complex, spatial problems and provides a proof-of-concept for this use of machine learning in computational statistics.
arXiv Detail & Related papers (2020-12-30T22:06:26Z) - Autoregressive Score Matching [113.4502004812927]
We propose autoregressive conditional score models (AR-CSM) where we parameterize the joint distribution in terms of the derivatives of univariable log-conditionals (scores)
For AR-CSM models, this divergence between data and model distributions can be computed and optimized efficiently, requiring no expensive sampling or adversarial training.
We show with extensive experimental results that it can be applied to density estimation on synthetic data, image generation, image denoising, and training latent variable models with implicit encoders.
arXiv Detail & Related papers (2020-10-24T07:01:24Z) - PIANO: A Fast Parallel Iterative Algorithm for Multinomial and Sparse
Multinomial Logistic Regression [0.0]
We show that PIANO can be easily extended to solve the Sparse Multinomial Logistic Regression problem.
We also prove that PIANO converges to a stationary point of the Multinomial and the Sparse Multinomial Logistic Regression problems.
arXiv Detail & Related papers (2020-02-21T05:15:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.