A Class of Dependent Random Distributions Based on Atom Skipping
- URL: http://arxiv.org/abs/2304.14954v2
- Date: Sat, 30 Dec 2023 16:12:00 GMT
- Title: A Class of Dependent Random Distributions Based on Atom Skipping
- Authors: Dehua Bi and Yuan Ji
- Abstract summary: We propose the Plaid Atoms Model (PAM), a novel Bayesian nonparametric model for grouped data.
PAM produces a dependent clustering pattern with overlapping and non-overlapping clusters across groups.
- Score: 2.3258287344692676
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose the Plaid Atoms Model (PAM), a novel Bayesian nonparametric model
for grouped data. Founded on an idea of 'atom skipping', PAM is part of a
well-established category of models that generate dependent random
distributions and clusters across multiple groups. Atom skipping refers to
stochastically assigning zero weights to atoms in an infinite mixture. Deploying
atom skipping across groups, PAM produces a dependent clustering pattern with
overlapping and non-overlapping clusters across groups. As a result,
interpretable posterior inference is possible such as reporting the posterior
probability of a cluster being exclusive to a single group or shared among a
subset of groups. We discuss the theoretical properties of the proposed and
related models. Minor extensions of the proposed model for multivariate or
count data are presented. Simulation studies and applications using real-world
datasets illustrate the performance of the new models with comparison to
existing models.
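
The atom-skipping idea in the abstract can be illustrated with a small simulation. The sketch below is not the authors' PAM construction: it assumes a truncated stick-breaking prior in which each stick fraction is set to zero with probability `1 - keep_prob`, and all names (`skipped_stick_breaking`, `simulate_groups`, `atom_sharing`, `keep_prob`) are hypothetical. It only shows how zero weights on shared atoms can yield clusters that are exclusive to one group or shared by a subset of groups.

```python
import random


def skipped_stick_breaking(num_atoms, alpha, keep_prob, rng):
    """Truncated stick-breaking with zero-augmented stick fractions.

    Assumption for illustration: an atom is skipped (weight exactly 0)
    with probability 1 - keep_prob; otherwise its stick fraction is
    drawn from Beta(1, alpha). Weights are renormalized to sum to 1.
    """
    weights = []
    remaining = 1.0
    for _ in range(num_atoms):
        if rng.random() < keep_prob:
            fraction = rng.betavariate(1.0, alpha)
        else:
            fraction = 0.0  # atom skipped in this group
        weights.append(remaining * fraction)
        remaining *= 1.0 - fraction
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all atoms were skipped; raise num_atoms or keep_prob")
    return [w / total for w in weights]


def simulate_groups(num_groups=3, num_atoms=20, alpha=1.0, keep_prob=0.6, seed=0):
    """Draw shared atom locations and one skipped weight vector per group."""
    rng = random.Random(seed)
    atoms = [rng.gauss(0.0, 3.0) for _ in range(num_atoms)]  # common atoms
    group_weights = [
        skipped_stick_breaking(num_atoms, alpha, keep_prob, rng)
        for _ in range(num_groups)
    ]
    return atoms, group_weights


def atom_sharing(group_weights):
    """For each atom, return the set of groups that give it positive weight."""
    num_atoms = len(group_weights[0])
    return [
        {j for j, w in enumerate(group_weights) if w[k] > 0.0}
        for k in range(num_atoms)
    ]


if __name__ == "__main__":
    atoms, group_weights = simulate_groups()
    for k, groups in enumerate(atom_sharing(group_weights)):
        print(f"atom {k}: location {atoms[k]:.2f}, used by groups {sorted(groups)}")
```

In this sketch, an atom whose set contains a single group index plays the role of a cluster exclusive to that group, while a larger set marks a cluster shared among a subset of groups. Posterior probabilities of such events, as described in the abstract, would require fitting the model to data, which this prior simulation does not attempt.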
Related papers
- Finite Mixtures of Multivariate Poisson-Log Normal Factor Analyzers for Clustering Count Data [0.8499685241219366]
A class of eight parsimonious mixture models based on the mixtures of factor analyzers model is introduced.
The proposed models are explored in the context of clustering discrete data arising from RNA sequencing studies.
arXiv Detail & Related papers (2023-11-13T21:23:15Z)
- Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model [79.46465138631592]
We devise an efficient algorithm that recovers clusters using the observed labels.
We present Instance-Adaptive Clustering (IAC), the first algorithm whose performance matches these lower bounds both in expectation and with high probability.
arXiv Detail & Related papers (2023-06-18T08:46:06Z)
- Learning from aggregated data with a maximum entropy model [73.63512438583375]
We show how a new model, similar to a logistic regression, may be learned from aggregated data only by approximating the unobserved feature distribution with a maximum entropy hypothesis.
We present empirical evidence on several public datasets that the model learned this way can achieve performances comparable to those of a logistic model trained with the full unaggregated data.
arXiv Detail & Related papers (2022-10-05T09:17:27Z)
- clusterBMA: Bayesian model averaging for clustering [1.2021605201770345]
We introduce clusterBMA, a method that enables weighted model averaging across results from unsupervised clustering algorithms.
We use clustering internal validation criteria to develop an approximation of the posterior model probability, used for weighting the results from each model.
In addition to outperforming other ensemble clustering methods on simulated data, clusterBMA offers unique features including probabilistic allocation to averaged clusters.
arXiv Detail & Related papers (2022-09-09T04:55:20Z)
- Personalized Federated Learning via Convex Clustering [72.15857783681658]
We propose a family of algorithms for personalized federated learning with locally convex user costs.
The proposed framework is based on a generalization of convex clustering in which the differences between different users' models are penalized.
arXiv Detail & Related papers (2022-02-01T19:25:31Z)
- On the Generative Utility of Cyclic Conditionals [103.1624347008042]
We study whether and how we can model a joint distribution $p(x,z)$ using two conditional models $p(x|z)$ and $q(z|x)$ that form a cycle.
We propose the CyGen framework for cyclic-conditional generative modeling, including methods to enforce compatibility and use the determined distribution to fit and generate data.
arXiv Detail & Related papers (2021-06-30T10:23:45Z)
- Robust Finite Mixture Regression for Heterogeneous Targets [70.19798470463378]
We propose an FMR model that finds sample clusters and jointly models multiple incomplete mixed-type targets simultaneously.
We provide non-asymptotic oracle performance bounds for our model under a high-dimensional learning framework.
The results show that our model can achieve state-of-the-art performance.
arXiv Detail & Related papers (2020-10-12T03:27:07Z)
- Kernel learning approaches for summarising and combining posterior similarity matrices [68.8204255655161]
We build upon the notion of the posterior similarity matrix (PSM) in order to suggest new approaches for summarising the output of MCMC algorithms for Bayesian clustering models.
A key contribution of our work is the observation that PSMs are positive semi-definite, and hence can be used to define probabilistically-motivated kernel matrices.
arXiv Detail & Related papers (2020-09-27T14:16:14Z)
- Blocked Clusterwise Regression [0.0]
We generalize previous approaches to discrete unobserved heterogeneity by allowing each unit to have multiple latent variables.
We contribute to the theory of clustering with an over-specified number of clusters and derive new convergence rates for this setting.
arXiv Detail & Related papers (2020-01-29T23:29:31Z)
- A new model for natural groupings in high-dimensional data [0.4604003661048266]
Clustering aims to divide a set of points into groups.
Recent experiments have uncovered several high-dimensional datasets that form different binary groupings.
This paper describes a probability model for the data that could explain this phenomenon.
arXiv Detail & Related papers (2019-09-14T02:38:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.