Dependent Cluster Mapping (DCMAP): Optimal clustering of directed
acyclic graphs for statistical inference
- URL: http://arxiv.org/abs/2308.03970v3
- Date: Wed, 7 Feb 2024 23:18:38 GMT
- Title: Dependent Cluster Mapping (DCMAP): Optimal clustering of directed
acyclic graphs for statistical inference
- Authors: Paul Pao-Yen Wu, Fabrizio Ruggeri, Kerrie Mengersen
- Abstract summary: A Directed Acyclic Graph (DAG) can be partitioned or mapped into clusters to support inference.
We propose a novel algorithm called DCMAP for optimal cluster mapping with dependent clusters.
For a 25 and 50-node DBN, the search space size was $9.91times 109$ and $1.51times1021$ possible cluster mappings, and the first optimal solution was found at 934 $(text95% CI 926,971)$.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: A Directed Acyclic Graph (DAG) can be partitioned or mapped into clusters to
support and make inference more computationally efficient in Bayesian Network
(BN), Markov process and other models. However, optimal partitioning with an
arbitrary cost function is challenging, especially in statistical inference as
the local cluster cost is dependent on both nodes within a cluster, and the
mapping of clusters connected via parent and/or child nodes, which we call
dependent clusters. We propose a novel algorithm called DCMAP for optimal
cluster mapping with dependent clusters. Given an arbitrarily defined, positive
cost function based on the DAG, we show that DCMAP converges to find all
optimal clusters, and returns near-optimal solutions along the way.
Empirically, we find that the algorithm is time-efficient for a Dynamic BN
(DBN) model of a seagrass complex system using a computation cost function. For
a 25 and 50-node DBN, the search space size was $9.91\times 10^9$ and
$1.51\times10^{21}$ possible cluster mappings, and the first optimal solution
was found at iteration 934 $(\text{95\% CI } 926,971)$, and 2256 $(2150,2271)$
with a cost that was 4\% and 0.2\% of the naive heuristic cost, respectively.
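
The "dependent clusters" setting above can be illustrated with a minimal sketch: a brute-force search over all cluster mappings of a tiny DAG under a toy positive cost in which a cluster's cost depends both on its own nodes and on how neighbouring (parent/child) nodes are mapped. This is NOT the DCMAP algorithm; the DAG, the cost function, and all names here are hypothetical, and brute force is only feasible because the graph is tiny.

```python
from itertools import product

# Hypothetical 4-node DAG; a real DBN would be far larger.
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
nodes = sorted({u for e in edges for u in e})

def cost(mapping):
    """Toy positive cost: the size of each cluster squared, plus a penalty
    for every edge crossing a cluster boundary -- so a cluster's local cost
    depends on the mapping of its parent/child nodes (dependent clusters)."""
    clusters = {}
    for node, label in mapping.items():
        clusters.setdefault(label, []).append(node)
    within = sum(len(members) ** 2 for members in clusters.values())
    crossing = sum(1 for u, v in edges if mapping[u] != mapping[v])
    return within + 2 * crossing

# Exhaustively enumerate every assignment of nodes to cluster labels.
best = min(
    (dict(zip(nodes, labels)) for labels in product(range(len(nodes)), repeat=len(nodes))),
    key=cost,
)
print(cost(best))  # minimum cost over all 4^4 = 256 mappings
```

Even on four nodes the search space has 256 mappings; the abstract's figures ($9.91\times 10^9$ at 25 nodes) show why an algorithm like DCMAP, rather than enumeration, is needed.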
Related papers
- TANGO: Clustering with Typicality-Aware Nonlocal Mode-Seeking and Graph-Cut Optimization [2.4783546111391215]
Mode-seeking, density-based clustering methods usually rely on local density estimation to mine structural information.
We propose a new algorithm (TANGO) to establish local dependencies by exploiting a global-view typicality of points.
It achieves final clustering by employing graph-cut on sub-clusters, thus avoiding the challenging selection of cluster centers.
arXiv Detail & Related papers (2024-08-19T15:26:25Z) - Cluster-based Graph Collaborative Filtering [55.929052969825825]
Graph Convolution Networks (GCNs) have succeeded in learning user and item representations for recommendation systems.
Most existing GCN-based methods overlook the multiple interests of users while performing high-order graph convolution.
We propose a novel GCN-based recommendation model, termed Cluster-based Graph Collaborative Filtering (ClusterGCF)
arXiv Detail & Related papers (2024-04-16T07:05:16Z) - MeanCut: A Greedy-Optimized Graph Clustering via Path-based Similarity
and Degree Descent Criterion [0.6906005491572401]
Spectral clustering is popular and attractive due to its remarkable performance, easy implementation, and strong adaptability.
We propose MeanCut as the objective function and greedily optimize it in degree descending order for a nondestructive graph partition.
The validity of our algorithm is demonstrated by testing on real-world benchmarks and an application to face recognition.
arXiv Detail & Related papers (2023-12-07T06:19:39Z) - Reinforcement Graph Clustering with Unknown Cluster Number [91.4861135742095]
We propose a new deep graph clustering method termed Reinforcement Graph Clustering.
In our proposed method, cluster number determination and unsupervised representation learning are unified in a single framework.
In order to conduct feedback actions, the clustering-oriented reward function is proposed to enhance the cohesion of the same clusters and separate the different clusters.
arXiv Detail & Related papers (2023-08-13T18:12:28Z) - Revisiting Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model [69.15976031704687]
We propose IAC (Instance-Adaptive Clustering), the first algorithm whose performance matches the instance-specific lower bounds both in expectation and with high probability.
IAC maintains an overall computational complexity of $\mathcal{O}(n\, \text{polylog}(n))$, making it scalable and practical for large-scale problems.
arXiv Detail & Related papers (2023-06-18T08:46:06Z) - Dink-Net: Neural Clustering on Large Graphs [59.10189693120368]
A deep graph clustering method (Dink-Net) is proposed with the idea of dilation and shrink.
By discriminating whether nodes have been corrupted by augmentations, representations are learned in a self-supervised manner.
The clustering distribution is optimized by minimizing the proposed cluster dilation loss and cluster shrink loss.
Compared to the runner-up, Dink-Net achieves a 9.62% NMI improvement on the ogbn-papers100M dataset with 111 million nodes and 1.6 billion edges.
arXiv Detail & Related papers (2023-05-28T15:33:24Z) - Local Sample-weighted Multiple Kernel Clustering with Consensus
Discriminative Graph [73.68184322526338]
Multiple kernel clustering (MKC) is committed to achieving optimal information fusion from a set of base kernels.
This paper proposes a novel local sample-weighted multiple kernel clustering model.
Experimental results demonstrate that our LSWMKC possesses better local manifold representation and outperforms existing kernel or graph-based clustering algorithms.
arXiv Detail & Related papers (2022-07-05T05:00:38Z) - Exact and Approximate Hierarchical Clustering Using A* [51.187990314731344]
We introduce a new approach based on A* search for clustering.
We overcome the prohibitively large search space by combining A* with a novel trellis data structure.
We empirically demonstrate that our method achieves substantially higher quality results than baselines for a particle physics use case and other clustering benchmarks.
arXiv Detail & Related papers (2021-04-14T18:15:27Z) - Interpretable Clustering on Dynamic Graphs with Recurrent Graph Neural
Networks [24.017988997693262]
We study the problem of clustering nodes in a dynamic graph, where the connections between nodes and nodes' cluster memberships may change over time.
We first propose a simple decay-based clustering algorithm that clusters nodes based on weighted connections between them, where the weight decreases at a fixed rate over time.
We characterize the optimal decay rate for each cluster and propose a clustering method that achieves almost exact recovery of the true clusters.
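
The fixed-rate decay idea described above can be sketched minimally: each observed interaction adds weight to an edge, and all edge weights decay by a fixed factor between time steps, so stale connections fade while repeated ones stay strong. This is an illustrative sketch under assumed names and an assumed decay rate, not the paper's exact method.

```python
# Assumed fixed decay factor per time step (hypothetical value).
DECAY = 0.5

def update_weights(weights, interactions):
    """Decay all existing edge weights by a fixed rate, then add unit
    weight for each interaction observed in the current time step."""
    decayed = {edge: w * DECAY for edge, w in weights.items()}
    for edge in interactions:
        decayed[edge] = decayed.get(edge, 0.0) + 1.0
    return decayed

# Two time steps: ("a","b") is seen in both, ("b","c") only in the first.
w = update_weights({}, [("a", "b"), ("b", "c")])
w = update_weights(w, [("a", "b")])
print(w)  # ("a","b"): 1.0 * 0.5 + 1.0 = 1.5; ("b","c") decays to 0.5
```

A clustering step would then group nodes by these decayed weights; the paper's contribution is characterizing the decay rate per cluster that yields almost exact recovery.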
arXiv Detail & Related papers (2020-12-16T04:31:19Z) - Improving k-Means Clustering Performance with Disentangled Internal
Representations [0.0]
We propose a simpler approach of optimizing the entanglement of the learned latent code representation of an autoencoder.
Using our proposed approach, the test clustering accuracy was 96.2% on the MNIST dataset, 85.6% on the Fashion-MNIST dataset, and 79.2% on the EMNIST Balanced dataset, outperforming our baseline models.
arXiv Detail & Related papers (2020-06-05T11:32:34Z) - Clustering with Fast, Automated and Reproducible assessment applied to
longitudinal neural tracking [3.817161834189992]
C-FAR is a novel method for Fast, Automated and Reproducible assessment of multiple hierarchical clustering algorithms simultaneously.
Our algorithm takes any number of hierarchical clustering trees as input, then strategically queries pairs for human feedback, and outputs an optimal clustering among those nominated by these trees.
Our flagship application is the cluster aggregation step in spike-sorting, the task of assigning waveforms (spikes) in recordings to neurons.
arXiv Detail & Related papers (2020-03-19T01:33:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.