Clustering Based on Graph of Density Topology
- URL: http://arxiv.org/abs/2009.11612v1
- Date: Thu, 24 Sep 2020 11:40:24 GMT
- Title: Clustering Based on Graph of Density Topology
- Authors: Zhangyang Gao, Haitao Lin, Stan. Z Li
- Abstract summary: We propose a novel clustering algorithm based on what we call graph of density topology (GDT)
GDT achieves the SOTA performance by far on almost all the popular datasets, and has a low time complexity of O(nlogn)
- Score: 41.57592288113871
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data clustering with uneven distribution in high level noise is challenging.
Currently, HDBSCAN is considered as the SOTA algorithm for this problem. In
this paper, we propose a novel clustering algorithm based on what we call graph
of density topology (GDT). GDT jointly considers the local and global
structures of data samples: firstly forming local clusters based on a density
growing process with a strategy for properly noise handling as well as cluster
boundary detection; and then estimating a GDT from relationship between local
clusters in terms of a connectivity measure, givingglobal topological graph.
The connectivity, measuring similarity between neighboring local clusters, is
based on local clusters rather than individual points, ensuring its robustness
to even very large noise. Evaluation results on both toy and real-world
datasets show that GDT achieves the SOTA performance by far on almost all the
popular datasets, and has a low time complexity of O(nlogn). The code is
available at https://github.com/gaozhangyang/DGC.git.
Related papers
- A Unified Framework for Gradient-based Clustering of Distributed Data [51.904327888475606]
We develop a family of distributed clustering algorithms that work over networks of users.
DGC-$mathcalF_rho$ is specialized to popular clustering losses like $K$-means and Huber loss.
We show that consensus fixed points of DGC-$mathcalF_rho$ are equivalent to fixed points of gradient clustering over the full data.
arXiv Detail & Related papers (2024-02-02T10:44:42Z) - DenMune: Density peak based clustering using mutual nearest neighbors [0.0]
Many clustering algorithms fail when clusters are of arbitrary shapes, of varying densities, or the data classes are unbalanced and close to each other.
A novel clustering algorithm, DenMune is presented to meet this challenge.
It is based on identifying dense regions using mutual nearest neighborhoods of size K, where K is the only parameter required from the user.
arXiv Detail & Related papers (2023-09-23T16:18:00Z) - Reinforcement Graph Clustering with Unknown Cluster Number [91.4861135742095]
We propose a new deep graph clustering method termed Reinforcement Graph Clustering.
In our proposed method, cluster number determination and unsupervised representation learning are unified into a uniform framework.
In order to conduct feedback actions, the clustering-oriented reward function is proposed to enhance the cohesion of the same clusters and separate the different clusters.
arXiv Detail & Related papers (2023-08-13T18:12:28Z) - Dink-Net: Neural Clustering on Large Graphs [59.10189693120368]
A deep graph clustering method (Dink-Net) is proposed with the idea of dilation and shrink.
By discriminating nodes, whether being corrupted by augmentations, representations are learned in a self-supervised manner.
The clustering distribution is optimized by minimizing the proposed cluster dilation loss and cluster shrink loss.
Compared to the runner-up, Dink-Net 9.62% achieves NMI improvement on the ogbn-papers100M dataset with 111 million nodes and 1.6 billion edges.
arXiv Detail & Related papers (2023-05-28T15:33:24Z) - Graph Representation Learning via Contrasting Cluster Assignments [57.87743170674533]
We propose a novel unsupervised graph representation model by contrasting cluster assignments, called as GRCCA.
It is motivated to make good use of local and global information synthetically through combining clustering algorithms and contrastive learning.
GRCCA has strong competitiveness in most tasks.
arXiv Detail & Related papers (2021-12-15T07:28:58Z) - Self-supervised Contrastive Attributed Graph Clustering [110.52694943592974]
We propose a novel attributed graph clustering network, namely Self-supervised Contrastive Attributed Graph Clustering (SCAGC)
In SCAGC, by leveraging inaccurate clustering labels, a self-supervised contrastive loss, are designed for node representation learning.
For the OOS nodes, SCAGC can directly calculate their clustering labels.
arXiv Detail & Related papers (2021-10-15T03:25:28Z) - Git: Clustering Based on Graph of Intensity Topology [25.679620842010422]
We propose a novel clustering algorithm, namely GIT (Clustering Based on textbfGraph of textbfIntensity textbfTopology)
With fast local cluster detection, robust topo-graph construction and accurate edge-cutting, GIT shows attractive ARISE performance.
arXiv Detail & Related papers (2021-10-04T09:29:43Z) - Swarm Intelligence for Self-Organized Clustering [6.85316573653194]
A swarm system called Databionic swarm (DBS) is introduced which is able to adapt itself to structures of high-dimensional data.
By exploiting the interrelations of swarm intelligence, self-organization and emergence, DBS serves as an alternative approach to the optimization of a global objective function in the task of clustering.
arXiv Detail & Related papers (2021-06-10T06:21:48Z) - Incorporating User's Preference into Attributed Graph Clustering [14.082520165369885]
We propose two quality measures for a local cluster: Graph Unimodality (GU) and Attribute Unimodality (AU)
The local cluster detected by LOCLU concentrates on the region of interest, provides efficient information flow in the graph and exhibits a unimodal data distribution in the subspace of the designated attributes.
arXiv Detail & Related papers (2020-03-24T19:07:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.