Distributed MCMC inference for Bayesian Non-Parametric Latent Block Model
- URL: http://arxiv.org/abs/2402.01050v1
- Date: Thu, 1 Feb 2024 22:43:55 GMT
- Title: Distributed MCMC inference for Bayesian Non-Parametric Latent Block Model
- Authors: Reda Khoufache, Anisse Belhadj, Hanene Azzag, Mustapha Lebbah
- Abstract summary: We introduce a novel Distributed Markov Chain Monte Carlo (MCMC) inference method for the Bayesian Non-Parametric Latent Block Model (DisNPLBM).
Our non-parametric co-clustering algorithm divides observations and features into partitions using latent multivariate Gaussian block distributions.
Experimental results demonstrate the impact of DisNPLBM on cluster labeling accuracy and execution times.
- Score: 0.24578723416255754
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In this paper, we introduce a novel Distributed Markov Chain Monte Carlo
(MCMC) inference method for the Bayesian Non-Parametric Latent Block Model
(DisNPLBM), employing the Master/Worker architecture. Our non-parametric
co-clustering algorithm divides observations and features into partitions using
latent multivariate Gaussian block distributions. The workload on rows is
evenly distributed among workers, who exclusively communicate with the master
and not among themselves. Experimental results demonstrate the impact of DisNPLBM
on cluster labeling accuracy and execution times. Moreover, we present a real use
case applying our approach to co-cluster gene expression data. The source code is
publicly available at https://github.com/redakhoufache/Distributed-NPLBM.
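To make the Master/Worker pattern described above concrete, here is a minimal, deliberately simplified sketch (illustrative Python with hypothetical names, not the released implementation at the link above): workers hold disjoint row shards, assign their rows locally, and send only per-block sufficient statistics to the master, which updates the block parameters. Cluster counts are fixed and column assignments are not resampled here, whereas DisNPLBM is non-parametric and samples both.

```python
import numpy as np

# Hypothetical, simplified sketch of the Master/Worker communication pattern:
# workers hold disjoint row blocks, assign row clusters locally, and send only
# sufficient statistics (per-block sums and counts) to the master.
# DisNPLBM itself is non-parametric (DP priors) and samples from posteriors;
# here we use fixed K, L, hard assignments, and frozen column labels purely
# for illustration.

K, L = 3, 2          # number of row and column clusters (fixed here, not in DisNPLBM)
rng = np.random.default_rng(0)


def worker_step(X_rows, col_labels, block_means):
    """Assign local rows to row clusters and return sufficient statistics."""
    n, _ = X_rows.shape
    row_labels = np.empty(n, dtype=int)
    sums = np.zeros((K, L))
    counts = np.zeros((K, L))
    # Per-column-cluster sums of each row, so rows are compared to K x L block means.
    col_sums = np.vstack([X_rows[:, col_labels == l].sum(axis=1) for l in range(L)]).T
    col_counts = np.array([(col_labels == l).sum() for l in range(L)])
    row_profiles = col_sums / np.maximum(col_counts, 1)
    for i in range(n):
        # Squared-error "likelihood"; DisNPLBM would use Gaussian block posteriors.
        cost = ((row_profiles[i] - block_means) ** 2).sum(axis=1)
        row_labels[i] = int(np.argmin(cost))
        sums[row_labels[i]] += col_sums[i]
        counts[row_labels[i]] += col_counts
    return row_labels, sums, counts


def master_step(all_stats):
    """Aggregate worker statistics and update the K x L block means."""
    total_sums = sum(s for _, s, _ in all_stats)
    total_counts = sum(c for _, _, c in all_stats)
    return total_sums / np.maximum(total_counts, 1)


# Toy data split across 4 workers.
X = rng.normal(size=(200, 10))
shards = np.array_split(X, 4)
col_labels = rng.integers(0, L, size=X.shape[1])
block_means = rng.normal(size=(K, L))

for _ in range(10):                      # a few synchronous sweeps
    stats = [worker_step(shard, col_labels, block_means) for shard in shards]
    block_means = master_step(stats)     # only statistics cross the wire

print("estimated block means:\n", block_means)
```

The point of the sketch is the communication pattern: raw rows never leave a worker; only K x L sums and counts are sent to the master.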
Related papers
- Fast Semisupervised Unmixing Using Nonconvex Optimization [80.11512905623417]
We introduce a novel convex model for semi-supervised/library-based unmixing.
We demonstrate the efficacy of alternating optimization methods for sparse unsupervised unmixing.
arXiv Detail & Related papers (2024-01-23T10:07:41Z)
- Distributed Collapsed Gibbs Sampler for Dirichlet Process Mixture Models in Federated Learning [0.22499166814992444]
This paper proposes a new distributed Markov Chain Monte Carlo (MCMC) inference method for DPMMs (DisCGS) using sufficient statistics.
Our approach uses the collapsed Gibbs sampler and is specifically designed to work on distributed data across independent and heterogeneous machines.
For instance, with a dataset of 100K data points, the centralized algorithm requires approximately 12 hours to complete 100 iterations while our approach achieves the same number of iterations in just 3 minutes.
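The key ingredient mentioned above, exchanging sufficient statistics instead of raw data, can be illustrated with a small hedged example (toy Python, not the DisCGS code): each worker summarises its points for a cluster by (count, sum, sum of squares), and the master recovers the pooled mean and variance from these summaries alone.

```python
import numpy as np

# Each worker summarises the points it currently assigns to a cluster by
# (count, sum, sum of squares); the master can then recover the pooled
# mean/variance without ever seeing the raw, distributed data points.

def local_stats(x):
    return len(x), x.sum(), (x ** 2).sum()

def merge(stats):
    n = sum(s[0] for s in stats)
    sx = sum(s[1] for s in stats)
    sxx = sum(s[2] for s in stats)
    mean = sx / n
    var = sxx / n - mean ** 2
    return n, mean, var

rng = np.random.default_rng(1)
shards = [rng.normal(2.0, 1.0, size=50) for _ in range(4)]   # data on 4 workers
print(merge([local_stats(s) for s in shards]))                # pooled n, mean, var
```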
arXiv Detail & Related papers (2023-12-18T13:16:18Z)
- Finite Mixtures of Multivariate Poisson-Log Normal Factor Analyzers for Clustering Count Data [0.8499685241219366]
A class of eight parsimonious mixture models based on the mixtures of factor analyzers model is introduced.
The proposed models are explored in the context of clustering discrete data arising from RNA sequencing studies.
arXiv Detail & Related papers (2023-11-13T21:23:15Z)
- Model-based clustering using non-parametric Hidden Markov Models [5.314335654467143]
We study the Bayes risk of clustering when using HMMs and propose associated clustering procedures.
Results are shown to remain valid in the online setting where observations are clustered sequentially.
arXiv Detail & Related papers (2023-09-21T16:31:04Z)
- Leveraging Instance Features for Label Aggregation in Programmatic Weak Supervision [75.1860418333995]
Programmatic Weak Supervision (PWS) has emerged as a widespread paradigm to synthesize training labels efficiently.
The core component of PWS is the label model, which infers true labels by aggregating the outputs of multiple noisy supervision sources abstracted as labeling functions (LFs).
Existing statistical label models typically rely only on the outputs of the LFs, ignoring the instance features when modeling the underlying generative process.
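As a point of reference for what a label model does, here is a deliberately naive toy aggregator (unweighted majority vote over LF outputs, ignoring instance features, which is precisely the limitation the paper addresses; the names are illustrative, not the paper's code):

```python
import numpy as np

# Toy label model: LF outputs are in {-1, +1, 0}, where 0 means "abstain".
# The true label is estimated by a simple (unweighted) majority vote per instance.
# Statistical label models replace this with a learned generative model, and
# the paper above additionally conditions on instance features.

def majority_vote(lf_outputs):
    """lf_outputs: (n_instances, n_lfs) array of votes in {-1, 0, +1}."""
    scores = lf_outputs.sum(axis=1)
    return np.where(scores >= 0, 1, -1)   # ties break towards +1

votes = np.array([[1, 1, 0],
                  [-1, 0, -1],
                  [1, -1, -1]])
print(majority_vote(votes))   # -> [ 1 -1 -1]
```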
arXiv Detail & Related papers (2022-10-06T07:28:53Z)
- Wrapped Distributions on homogeneous Riemannian manifolds [58.720142291102135]
Control over distributions' properties, such as parameters, symmetry and modality, yields a family of flexible distributions.
We empirically validate our approach by utilizing our proposed distributions within a variational autoencoder and a latent space network model.
arXiv Detail & Related papers (2022-04-20T21:25:21Z)
- Optimal Clustering with Bandit Feedback [57.672609011609886]
This paper considers the problem of online clustering with bandit feedback.
It includes a novel stopping rule for sequential testing that circumvents the need to solve any NP-hard weighted clustering problem as its subroutines.
We show through extensive simulations on synthetic and real-world datasets that BOC's performance matches the lower bound asymptotically, and significantly outperforms a non-adaptive baseline algorithm.
arXiv Detail & Related papers (2022-02-09T06:05:05Z)
- DG-LMC: A Turn-key and Scalable Synchronous Distributed MCMC Algorithm [21.128416842467132]
We derive a user-friendly centralised distributed MCMC algorithm with provable scaling in high-dimensional settings.
We illustrate the relevance of the proposed methodology on both synthetic and real data experiments.
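To make the "centralised distributed MCMC" setting concrete, the sketch below shows a generic synchronous master/worker Langevin update on a toy Gaussian model; it illustrates the communication pattern only and is not the DG-LMC algorithm itself.

```python
import numpy as np

# Generic synchronous master/worker Langevin step (an illustration of the
# centralised distributed MCMC setting, not the DG-LMC algorithm itself):
# each worker returns the gradient of its shard's log-likelihood at theta,
# the master adds the log-prior gradient and performs one ULA update.

rng = np.random.default_rng(2)
shards = [rng.normal(3.0, 1.0, size=250) for _ in range(4)]   # data on 4 workers

def local_grad(theta, x, sigma2=1.0):
    # gradient of sum_i log N(x_i | theta, sigma2) with respect to theta
    return (x - theta).sum() / sigma2

def langevin_step(theta, step=1e-4, prior_var=100.0):
    grad = sum(local_grad(theta, x) for x in shards) - theta / prior_var
    return theta + 0.5 * step * grad + np.sqrt(step) * rng.normal()

theta, samples = 0.0, []
for _ in range(2000):
    theta = langevin_step(theta)
    samples.append(theta)
print("posterior mean estimate:", np.mean(samples[500:]))   # close to 3.0
```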
arXiv Detail & Related papers (2021-06-11T10:37:14Z)
- Kernel learning approaches for summarising and combining posterior similarity matrices [68.8204255655161]
We build upon the notion of the posterior similarity matrix (PSM) in order to suggest new approaches for summarising the output of MCMC algorithms for Bayesian clustering models.
A key contribution of our work is the observation that PSMs are positive semi-definite, and hence can be used to define probabilistically-motivated kernel matrices.
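Computing a PSM from stored MCMC cluster assignments is simple; a short illustrative sketch (not the authors' code) is given below, together with a numerical check that the resulting matrix is positive semi-definite.

```python
import numpy as np

# Build a posterior similarity matrix (PSM) from stored MCMC cluster labels:
# PSM[i, j] = fraction of posterior samples in which items i and j share a
# cluster. As an average of 0/1 co-clustering matrices (each PSD), the PSM is
# positive semi-definite and can be used directly as a kernel matrix.

def posterior_similarity(labels):
    """labels: (n_samples, n_items) array of cluster assignments per MCMC draw."""
    labels = np.asarray(labels)
    return (labels[:, :, None] == labels[:, None, :]).mean(axis=0)

draws = np.array([[0, 0, 1, 1],
                  [0, 0, 0, 1],
                  [1, 1, 0, 0]])
psm = posterior_similarity(draws)
print(psm)
print("min eigenvalue:", np.linalg.eigvalsh(psm).min())   # >= 0 up to round-off
```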
arXiv Detail & Related papers (2020-09-27T14:16:14Z)
- Generative Semantic Hashing Enhanced via Boltzmann Machines [61.688380278649056]
Existing generative-hashing methods mostly assume a factorized form for the posterior distribution.
We propose to employ the distribution of a Boltzmann machine as the variational posterior.
We show that by effectively modeling correlations among different bits within a hash code, our model can achieve significant performance gains.
arXiv Detail & Related papers (2020-06-16T01:23:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.