Consistency between ordering and clustering methods for graphs
- URL: http://arxiv.org/abs/2208.12933v2
- Date: Fri, 7 Apr 2023 16:02:19 GMT
- Title: Consistency between ordering and clustering methods for graphs
- Authors: Tatsuro Kawamoto, Masaki Ochi, Teruyoshi Kobayashi
- Abstract summary: We investigate methodological relationships between several clustering and ordering methods.
We propose a measure called the label continuity error, which generically quantifies the degree of consistency between a sequence and a partition.
Based on synthetic and real-world datasets, we evaluate the extent to which an ordering method identifies a module structure and a clustering method identifies a banded structure.
- Score: 0.8594140167290096
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A relational dataset is often analyzed by optimally assigning a label to each element through clustering or ordering. While similar characterizations of a dataset would be achieved by both clustering and ordering methods, the former has been studied much more actively than the latter, particularly for data represented as graphs. This study fills this gap by investigating methodological relationships between several clustering and ordering methods, focusing on spectral techniques. Furthermore, we evaluate the resulting performance of the clustering and ordering methods. To this end, we propose a measure called the label continuity error, which generically quantifies the degree of consistency between a sequence and a partition for a set of elements. Based on synthetic and real-world datasets, we evaluate the extent to which an ordering method identifies a module structure and a clustering method identifies a banded structure.
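As an informal illustration of how a single spectral quantity can yield both an ordering and a clustering, the sketch below computes the Fiedler vector of a small planted two-block graph, sorts the nodes by it (an ordering), splits them by its sign (a two-way clustering), and counts how often the cluster label changes along the ordering. This is a minimal sketch under illustrative assumptions: the block-model parameters and the adjacent-label-change count are not taken from the paper, and the count is only a crude stand-in for the label continuity error, which the paper defines more generally.

```python
# Illustrative sketch only (not the paper's implementation): a single spectral
# quantity, the Fiedler vector of the graph Laplacian, is used both to order
# the nodes and to bipartition them, and a crude adjacent-label-change count
# stands in for a sequence-vs-partition consistency check.
import numpy as np

rng = np.random.default_rng(seed=0)

# Planted two-block graph (illustrative parameters): dense within blocks,
# sparse between them.
n = 40
block = np.repeat([0, 1], n // 2)
p_in, p_out = 0.5, 0.05
prob = np.where(block[:, None] == block[None, :], p_in, p_out)
upper = np.triu((rng.random((n, n)) < prob).astype(float), k=1)
A = upper + upper.T  # symmetric adjacency matrix, no self-loops

# Unnormalized graph Laplacian and its Fiedler vector (second-smallest eigenpair).
L = np.diag(A.sum(axis=1)) - A
_, eigvecs = np.linalg.eigh(L)       # eigenvalues returned in ascending order
fiedler = eigvecs[:, 1]

ordering = np.argsort(fiedler)       # spectral ordering of the nodes
labels = (fiedler >= 0).astype(int)  # spectral bipartition via the sign split

# Toy consistency proxy: how many times does the cluster label change as we
# walk along the ordering? For two non-empty clusters, a perfectly
# "continuous" labeling changes exactly once.
changes = int(np.sum(labels[ordering][1:] != labels[ordering][:-1]))
print(f"label changes along the ordering: {changes} (minimum possible: 1)")
```

With a clean two-block structure the ordering places each block contiguously, so the count is typically 1; on noisier or overlapping structures it grows, which is the kind of disagreement between an ordering and a partition that a consistency measure is meant to expose.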
Related papers
- Generalized Category Discovery with Clustering Assignment Consistency [56.92546133591019]
Generalized category discovery (GCD) is a recently proposed open-world task.
We propose a co-training-based framework that encourages clustering consistency.
Our method achieves state-of-the-art performance on three generic benchmarks and three fine-grained visual recognition datasets.
arXiv Detail & Related papers (2023-10-30T00:32:47Z)
- Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model [79.46465138631592]
We devise an efficient algorithm that recovers clusters using the observed labels.
We present Instance-Adaptive Clustering (IAC), the first algorithm whose performance matches these lower bounds both in expectation and with high probability.
arXiv Detail & Related papers (2023-06-18T08:46:06Z)
- Algorithm-Agnostic Interpretations for Clustering [0.0]
We propose algorithm-agnostic interpretation methods to explain clustering outcomes in reduced dimensions.
The permutation feature importance for clustering represents a general framework based on shuffling feature values.
All methods can be used with any clustering algorithm able to reassign instances through soft or hard labels.
arXiv Detail & Related papers (2022-09-21T18:08:40Z)
- Data Clustering as an Emergent Consensus of Autonomous Agents [0.0]
We present a data segmentation method based on a first-order density-induced consensus protocol.
We provide a mathematically rigorous analysis of the consensus model leading to the stopping criteria of the data segmentation.
arXiv Detail & Related papers (2022-04-22T09:11:35Z)
- Natural Hierarchical Cluster Analysis by Nearest Neighbors with Near-Linear Time Complexity [0.0]
We propose a nearest-neighbor-based clustering algorithm that results in a naturally defined hierarchy of clusters.
In contrast to agglomerative and divisive hierarchical clustering algorithms, our approach does not depend on the iterative operation of the algorithm.
arXiv Detail & Related papers (2022-03-15T16:03:42Z)
- Clustering Ensemble Meets Low-rank Tensor Approximation [50.21581880045667]
This paper explores the problem of clustering ensemble, which aims to combine multiple base clusterings to achieve better performance than any individual one.
We propose a novel low-rank tensor approximation-based method to solve the problem from a global perspective.
Experimental results over 7 benchmark data sets show that the proposed model achieves a breakthrough in clustering performance compared with 12 state-of-the-art methods.
arXiv Detail & Related papers (2020-12-16T13:01:37Z)
- Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed.
We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z)
- Structured Graph Learning for Clustering and Semi-supervised Classification [74.35376212789132]
We propose a graph learning framework to preserve both the local and global structure of data.
Our method uses the self-expressiveness of samples to capture the global structure and an adaptive-neighbor approach to respect the local structure.
Our model is equivalent to a combination of kernel k-means and k-means methods under certain conditions.
arXiv Detail & Related papers (2020-08-31T08:41:20Z)
- Order preserving hierarchical agglomerative clustering [0.0]
We study the problem of order-preserving hierarchical clustering of ordered data.
The clustering is similarity-based and uses standard linkage functions, such as single and complete linkage.
An optimal hierarchical clustering is defined as the partial dendrogram corresponding to the ultrametric closest to the original dissimilarity measure.
arXiv Detail & Related papers (2020-04-26T21:58:53Z)
- Conjoined Dirichlet Process [63.89763375457853]
We develop a novel, non-parametric probabilistic biclustering method based on Dirichlet processes to identify biclusters with strong co-occurrence in both rows and columns.
We apply our method to two different applications, text mining and gene expression analysis, and demonstrate that our method improves bicluster extraction in many settings compared to existing approaches.
arXiv Detail & Related papers (2020-02-08T19:41:23Z)