Hierarchical mixtures of Unigram models for short text clustering: the role of Beta-Liouville priors
- URL: http://arxiv.org/abs/2410.21862v2
- Date: Thu, 14 Nov 2024 09:17:31 GMT
- Title: Hierarchical mixtures of Unigram models for short text clustering: the role of Beta-Liouville priors
- Authors: Massimo Bilancia, Samuele Magro
- Abstract summary: This paper presents a variant of the Multinomial mixture model tailored for the unsupervised classification of short text data.
We explore an alternative prior--the Beta-Liouville distribution--which offers a more flexible correlation structure than the Dirichlet.
- Score: 1.03590082373586
- Abstract: This paper presents a variant of the Multinomial mixture model tailored for the unsupervised classification of short text data. Traditionally, the Multinomial probability vector in this hierarchical model is assigned a Dirichlet prior distribution. Here, however, we explore an alternative prior--the Beta-Liouville distribution--which offers a more flexible correlation structure than the Dirichlet. We examine the theoretical properties of the Beta-Liouville distribution, focusing on its conjugacy with the Multinomial likelihood. This property enables the derivation of update equations for a CAVI (Coordinate Ascent Variational Inference) variational algorithm, facilitating the approximate posterior estimation of model parameters. Additionally, we propose a stochastic variant of the CAVI algorithm that enhances scalability. The paper concludes with data examples that demonstrate effective strategies for setting the Beta-Liouville hyperparameters.
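To make the conjugacy claim concrete, here is a minimal Python sketch, not the paper's CAVI implementation: it uses the standard stick-style construction of the Beta-Liouville distribution (u ~ Dirichlet(alpha_vec), lam ~ Beta(a, b), theta = (lam * u, 1 - lam)) and the corresponding conjugate posterior update under Multinomial counts. The variable names and the toy word counts are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_beta_liouville(alpha_vec, a, b, rng):
    """Draw one point on the (D+1)-part simplex from a Beta-Liouville
    distribution via the standard construction:
    u ~ Dirichlet(alpha_vec), lam ~ Beta(a, b), theta = (lam * u, 1 - lam)."""
    u = rng.dirichlet(alpha_vec)
    lam = rng.beta(a, b)
    return np.append(lam * u, 1.0 - lam)

def bl_posterior(alpha_vec, a, b, counts):
    """Conjugate update of Beta-Liouville hyperparameters given Multinomial
    counts over D+1 categories (the last category plays the role of the
    Beta 'remainder'): alpha_d += n_d, a += sum of head counts, b += n_{D+1}."""
    counts = np.asarray(counts, dtype=float)
    n_head, n_tail = counts[:-1], counts[-1]
    return alpha_vec + n_head, a + n_head.sum(), b + n_tail

# Toy example: a vocabulary of 4 terms (D = 3 'head' terms + remainder)
alpha_vec = np.array([1.0, 1.0, 1.0])
a, b = 2.0, 2.0
counts = np.array([5, 0, 2, 3])  # word counts from one short document

alpha_post, a_post, b_post = bl_posterior(alpha_vec, a, b, counts)
print(alpha_post, a_post, b_post)  # [6. 1. 3.] 9.0 5.0
print(sample_beta_liouville(alpha_post, a_post, b_post, rng))
```

Because the posterior stays in the Beta-Liouville family with only the hyperparameters shifted by the observed counts, expectations needed by a coordinate ascent scheme remain available in closed form, which is what makes the CAVI update equations tractable.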
Related papers
- Scaling and renormalization in high-dimensional regression [72.59731158970894]
This paper presents a succinct derivation of the training and generalization performance of a variety of high-dimensional ridge regression models.
We provide an introduction and review of recent results on these topics, aimed at readers with backgrounds in physics and deep learning.
arXiv Detail & Related papers (2024-05-01T15:59:00Z)
- Variational autoencoder with weighted samples for high-dimensional non-parametric adaptive importance sampling [0.0]
We extend the existing framework to the case of weighted samples by introducing a new objective function.
In order to add flexibility to the model and to be able to learn multimodal distributions, we consider a learnable prior distribution.
We exploit the proposed procedure in existing adaptive importance sampling algorithms to draw points from a target distribution and to estimate a rare event probability in high dimension.
arXiv Detail & Related papers (2023-10-13T15:40:55Z)
- Structured Radial Basis Function Network: Modelling Diversity for Multiple Hypotheses Prediction [51.82628081279621]
Multi-modal regression is important when forecasting nonstationary processes or processes with a complex mixture of distributions.
A Structured Radial Basis Function Network is presented as an ensemble of multiple hypotheses predictors for regression problems.
It is proved that the structured model can efficiently interpolate the underlying tessellation and approximate the multiple-hypotheses target distribution.
arXiv Detail & Related papers (2023-09-02T01:27:53Z)
- Binary classification based Monte Carlo simulation [0.0]
The bridge between simulation and classification enables us to propose pdf-free versions of pdf-ratio-based simulation algorithms (a sketch of the underlying classifier-to-ratio identity appears after this list).
From a probabilistic modeling perspective, our procedure involves a structured energy-based model that can easily be trained and is compatible with classical samplers.
arXiv Detail & Related papers (2023-07-29T17:53:31Z)
- Variational Classification [51.2541371924591]
We derive a variational objective to train the model, analogous to the evidence lower bound (ELBO) used to train variational auto-encoders.
Treating inputs to the softmax layer as samples of a latent variable, our abstracted perspective reveals a potential inconsistency.
We induce a chosen latent distribution, instead of the one implicitly assumed by a standard softmax layer.
arXiv Detail & Related papers (2023-05-17T17:47:19Z)
- Distributional Gradient Boosting Machines [77.34726150561087]
Our framework is based on XGBoost and LightGBM.
We show that our framework achieves state-of-the-art forecast accuracy.
arXiv Detail & Related papers (2022-04-02T06:32:19Z)
- Probabilistic Embeddings with Laplacian Graph Priors [0.0]
We show that the model unifies several previously proposed embedding methods under one umbrella.
We empirically show that our model matches the performance of previous models as special cases.
We provide code as an implementation enabling flexible estimation in different settings.
arXiv Detail & Related papers (2022-03-25T13:33:51Z)
- On Robust Probabilistic Principal Component Analysis using Multivariate $t$-Distributions [0.30458514384586394]
We present two sets of equivalent relationships between the high-level multivariate $t$-PPCA framework and the hierarchical model used for implementation.
We also propose a novel Monte Carlo expectation-maximization algorithm to implement one general type of such models.
arXiv Detail & Related papers (2020-10-21T06:49:20Z)
- Fast Maximum Likelihood Estimation and Supervised Classification for the Beta-Liouville Multinomial [0.0]
We show that the Beta-Liouville multinomial is comparable in efficiency to the Dirichlet multinomial for Newton-Raphson maximum likelihood estimation.
We also demonstrate that the Beta-Liouville multinomial outperforms the multinomial and Dirichlet multinomial on two out of four gold standard datasets.
arXiv Detail & Related papers (2020-06-12T20:30:12Z)
- Decision-Making with Auto-Encoding Variational Bayes [71.44735417472043]
We show that a posterior approximation distinct from the variational distribution should be used for making decisions.
Motivated by these theoretical results, we propose learning several approximate proposals for the best model.
In addition to toy examples, we present a full-fledged case study of single-cell RNA sequencing.
arXiv Detail & Related papers (2020-02-17T19:23:36Z)
- CatBoostLSS -- An extension of CatBoost to probabilistic forecasting [91.3755431537592]
We propose a new framework that predicts the entire conditional distribution of a univariate response variable.
CatBoostLSS models all moments of a parametric distribution instead of the conditional mean only.
We present both a simulation study and real-world examples that demonstrate the benefits of our approach.
arXiv Detail & Related papers (2020-01-04T15:42:44Z)
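The classifier-to-ratio identity referenced in the "Binary classification based Monte Carlo simulation" entry above is standard: with balanced samples from p (label 1) and q (label 0), a well-calibrated probabilistic classifier c(x) satisfies p(x)/q(x) = c(x)/(1 - c(x)). The sketch below illustrates that generic identity with scikit-learn's logistic regression on a toy Gaussian pair; it is not the paper's specific algorithm, and all names are illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Two 1-D densities: p = N(1, 1) (label 1) and q = N(0, 1) (label 0)
xp = rng.normal(1.0, 1.0, size=(5000, 1))
xq = rng.normal(0.0, 1.0, size=(5000, 1))
X = np.vstack([xp, xq])
y = np.concatenate([np.ones(5000), np.zeros(5000)])

clf = LogisticRegression().fit(X, y)

def ratio_hat(x):
    """Classifier-based estimate of p(x)/q(x) via c(x) / (1 - c(x))."""
    c = clf.predict_proba(np.atleast_2d(x).T)[:, 1]
    return c / (1.0 - c)

# For two unit-variance Gaussians the true log-ratio is linear in x,
# so the estimate should be close to exp(x - 0.5).
x_test = np.array([-1.0, 0.0, 1.0])
print(ratio_hat(x_test))
print(np.exp(x_test - 0.5))
```

Once the ratio is estimated this way, pdf-ratio-based samplers (e.g., acceptance-rejection or importance reweighting) can be run without evaluating either density explicitly, which is the "pdf-free" point made in that entry.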
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.