Graphical Dirichlet Process for Clustering Non-Exchangeable Grouped Data
- URL: http://arxiv.org/abs/2302.09111v2
- Date: Mon, 31 Jul 2023 21:38:58 GMT
- Title: Graphical Dirichlet Process for Clustering Non-Exchangeable Grouped Data
- Authors: Arhit Chakrabarti, Yang Ni, Ellen Ruth A. Morris, Michael L. Salinas,
Robert S. Chapkin, Bani K. Mallick
- Abstract summary: We consider the problem of clustering grouped data with possibly non-exchangeable groups whose dependencies can be characterized by a known directed acyclic graph.
We propose a Bayesian nonparametric approach, termed graphical Dirichlet process, that jointly models the dependent group-specific random measures.
We develop an efficient posterior inference algorithm and illustrate our model with simulations and a real grouped single-cell dataset.
- Score: 4.436632973105494
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider the problem of clustering grouped data with possibly
non-exchangeable groups whose dependencies can be characterized by a known
directed acyclic graph. To allow the sharing of clusters among the
non-exchangeable groups, we propose a Bayesian nonparametric approach, termed
graphical Dirichlet process, that jointly models the dependent group-specific
random measures by assuming each random measure to be distributed as a
Dirichlet process whose concentration parameter and base probability measure
depend on those of its parent groups. The resulting joint stochastic process
respects the Markov property of the directed acyclic graph that links the
groups. We characterize the graphical Dirichlet process using a novel
hypergraph representation as well as the stick-breaking representation, the
restaurant-type representation, and the representation as a limit of a finite
mixture model. We develop an efficient posterior inference algorithm and
illustrate our model with simulations and a real grouped single-cell dataset.
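The abstract mentions a stick-breaking representation. As a rough illustration only, the sketch below draws truncated stick-breaking weights for a single, ordinary Dirichlet process and then assigns observations to clusters. It is not the paper's graphical construction, in which each group's concentration parameter and base measure depend on its parent groups. The function name `stick_breaking_weights`, the truncation level, the standard normal base measure, and all parameter values are assumptions chosen for the example.

```python
import random


def stick_breaking_weights(alpha, num_atoms, rng):
    """Truncated stick-breaking weights for a DP with concentration alpha.

    Each step breaks off a Beta(1, alpha) fraction of the remaining stick.
    The last fraction is set to 1 so the truncated weights sum to one.
    """
    weights = []
    remaining = 1.0
    for k in range(num_atoms):
        frac = 1.0 if k == num_atoms - 1 else rng.betavariate(1.0, alpha)
        weights.append(frac * remaining)
        remaining *= 1.0 - frac
    return weights


# Hypothetical settings: alpha = 2, truncation at 50 atoms, N(0, 1) base measure.
rng = random.Random(0)
weights = stick_breaking_weights(alpha=2.0, num_atoms=50, rng=rng)
atoms = [rng.gauss(0.0, 1.0) for _ in weights]

# Assign 200 observations to atoms; observations sharing an atom form a cluster.
labels = rng.choices(range(len(weights)), weights=weights, k=200)
num_clusters = len(set(labels))
```

Smaller values of `alpha` concentrate the mass on fewer atoms, so fewer distinct clusters tend to appear among the sampled labels.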
Related papers
- Robust Inference Methods for Latent Group Panel Models under Possible Group Non-Separation [0.0]
This paper presents robust inference methods for general linear hypotheses in linear panel data models with latent group structure in the coefficients. We employ a selective conditional inference approach, deriving the conditional distribution of the coefficient estimates given the group structure estimated from the data. We demonstrate the effectiveness of our approach through Monte Carlo simulations and apply the methods to two datasets on (i) the relationship between income and democracy, and (ii) the cyclicality of firm-level R&D investment.
arXiv Detail & Related papers (2025-11-23T17:41:30Z)
- HeNCler: Node Clustering in Heterophilous Graphs through Learned Asymmetric Similarity [55.27586970082595]
HeNCler is a novel approach for Heterophilous Node Clustering.
We show that HeNCler significantly enhances performance in node clustering tasks within heterophilous graph contexts.
arXiv Detail & Related papers (2024-05-27T11:04:05Z)
- Interpretable Multi-View Clustering Based on Anchor Graph Tensor Factorization [64.00146569922028]
Multi-view clustering methods based on anchor graph factorization lack adequate cluster interpretability for the decomposed matrix.
We address this limitation by using non-negative tensor factorization to decompose an anchor graph tensor that combines anchor graphs from multiple views.
arXiv Detail & Related papers (2024-04-01T03:23:55Z)
- Creating generalizable downstream graph models with random projections [22.690120515637854]
We investigate graph representation learning approaches that enable models to generalize across graphs.
We show that using random projections to estimate multiple powers of the transition matrix allows us to build a set of isomorphism-invariant features.
The resulting features can be used to recover enough information about the local neighborhood of a node to enable inference with relevance competitive to other approaches.
arXiv Detail & Related papers (2023-02-17T14:27:00Z)
- Unified Multi-View Orthonormal Non-Negative Graph Based Clustering Framework [74.25493157757943]
We formulate a novel clustering model, which exploits the non-negative feature property and incorporates the multi-view information into a unified joint learning framework.
We also explore, for the first time, the multi-model non-negative graph-based approach to clustering data based on deep features.
arXiv Detail & Related papers (2022-11-03T08:18:27Z)
- Orthogonalization of data via Gromov-Wasserstein type feedback for clustering and visualization [5.44192123671277]
We propose an adaptive approach for clustering and visualization of data by an orthogonalization process.
We prove that the method converges globally to a unique fixpoint for certain parameter values.
We confirm that the method produces biologically meaningful clustering results consistent with human expert classification.
arXiv Detail & Related papers (2022-07-25T15:52:11Z)
- ACTIVE: Augmentation-Free Graph Contrastive Learning for Partial Multi-View Clustering [52.491074276133325]
We propose an augmentation-free graph contrastive learning framework to solve the problem of partial multi-view clustering.
The proposed approach elevates instance-level contrastive learning and missing data inference to the cluster-level, effectively mitigating the impact of individual missing data on clustering.
arXiv Detail & Related papers (2022-03-01T02:32:25Z)
- Personalized Federated Learning via Convex Clustering [72.15857783681658]
We propose a family of algorithms for personalized federated learning with locally convex user costs.
The proposed framework is based on a generalization of convex clustering in which the differences between different users' models are penalized.
arXiv Detail & Related papers (2022-02-01T19:25:31Z)
- Scaling Graph Clustering with Distributed Sketches [1.1011268090482575]
We present a method inspired by spectral clustering where we instead use matrix sketches derived from random dimension-reducing projections.
We show that our method produces embeddings that yield performant clustering results given a fully-dynamic block model stream.
We also discuss the effects of block model parameters upon the required dimensionality of the subsequent embeddings, and show how random projections could significantly improve the performance of graph clustering in distributed memory.
arXiv Detail & Related papers (2020-07-24T17:38:04Z)
- Conjoined Dirichlet Process [63.89763375457853]
We develop a novel, non-parametric probabilistic biclustering method based on Dirichlet processes to identify biclusters with strong co-occurrence in both rows and columns.
We apply our method to two different applications, text mining and gene expression analysis, and demonstrate that our method improves bicluster extraction in many settings compared to existing approaches.
arXiv Detail & Related papers (2020-02-08T19:41:23Z)
- Blocked Clusterwise Regression [0.0]
We generalize previous approaches to discrete unobserved heterogeneity by allowing each unit to have multiple latent variables.
We contribute to the theory of clustering with an over-specified number of clusters and derive new convergence rates for this setting.
arXiv Detail & Related papers (2020-01-29T23:29:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.