Git: Clustering Based on Graph of Intensity Topology
- URL: http://arxiv.org/abs/2110.01274v1
- Date: Mon, 4 Oct 2021 09:29:43 GMT
- Title: Git: Clustering Based on Graph of Intensity Topology
- Authors: Zhangyang Gao, Haitao Lin, Cheng Tan, Lirong Wu, Stan Z. Li
- Abstract summary: We propose a novel clustering algorithm, namely GIT (Clustering Based on Graph of Intensity Topology).
With fast local cluster detection, robust topo-graph construction and accurate edge-cutting, GIT shows attractive ARISE performance.
- Score: 25.679620842010422
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accuracy, Robustness to noise and scale,
Interpretability, Speed, and Ease of use (ARISE) are
crucial requirements of a good clustering algorithm. However, achieving these
goals simultaneously is challenging, and most advanced approaches only focus on
parts of them. Towards an overall consideration of these aspects, we propose a
novel clustering algorithm, namely GIT (Clustering Based on Graph of
Intensity Topology). GIT considers both local and global data
structures: firstly forming local clusters based on intensity peaks of samples,
and then estimating the global topological graph (topo-graph) between these
local clusters. We use the Wasserstein Distance between the predicted and prior
class proportions to automatically cut noisy edges in the topo-graph and merge
connected local clusters as final clusters. Then, we compare GIT with seven
competing algorithms on five synthetic datasets and nine real-world datasets.
With fast local cluster detection, robust topo-graph construction and accurate
edge-cutting, GIT shows attractive ARISE performance and significantly exceeds
other non-convex clustering methods. For example, GIT outperforms its
counterparts by about $10\%$ (F1-score) on MNIST and FashionMNIST. Code is
available at https://github.com/gaozhangyang/GIT.
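The edge-cutting criterion described above can be illustrated with a minimal sketch. This is not GIT's implementation (the function name and the uniform prior are assumptions for illustration); it only shows how a 1-D Wasserstein distance between predicted and prior class proportions can flag an unbalanced merge.

```python
# Hedged sketch of the abstract's edge-cutting idea: compare the class
# proportions implied by a candidate merge against a prior, and treat a
# large Wasserstein distance as a sign of a noisy topo-graph edge.
# `proportion_distance` and the uniform prior are illustrative assumptions.
import numpy as np
from scipy.stats import wasserstein_distance

def proportion_distance(labels, prior):
    """1-D Wasserstein distance between observed cluster proportions and a prior."""
    classes = np.arange(len(prior))
    counts = np.bincount(labels, minlength=len(prior)).astype(float)
    observed = counts / counts.sum()
    return wasserstein_distance(classes, classes, observed, prior)

# Toy example: a uniform prior over 3 final clusters.
prior = np.array([1/3, 1/3, 1/3])
balanced = np.array([0, 0, 1, 1, 2, 2])   # matches the prior
skewed = np.array([0, 0, 0, 0, 0, 1])     # one cluster dominates
print(proportion_distance(balanced, prior))  # 0.0
print(proportion_distance(skewed, prior))    # strictly positive
```

An edge whose merge pushes this distance above a threshold would be cut, leaving the connected local clusters as final clusters.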
Related papers
- A Unified Framework for Gradient-based Clustering of Distributed Data [51.904327888475606]
We develop a family of distributed clustering algorithms that work over networks of users.
DGC-$\mathcal{F}_\rho$ is specialized to popular clustering losses like $K$-means and Huber loss.
We show that consensus fixed points of DGC-$\mathcal{F}_\rho$ are equivalent to fixed points of gradient clustering over the full data.
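The gradient view of clustering mentioned above can be sketched in a few lines. This is a plain centralized gradient step on the $K$-means loss, not the paper's distributed DGC algorithm; the function name is an assumption.

```python
# Illustrative centralized sketch: one gradient-descent step on the K-means
# loss 0.5 * sum_i ||x_i - c_{a(i)}||^2, the kind of update a distributed
# gradient-clustering framework generalizes across a network of users.
import numpy as np

def kmeans_grad_step(X, centers, lr=0.1):
    """Assign points to nearest center, then move centers down the loss gradient."""
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)  # squared distances
    assign = d.argmin(1)
    grad = np.zeros_like(centers)
    for k in range(len(centers)):
        pts = X[assign == k]
        if len(pts):
            # d/dc_k of 0.5 * sum ||x - c_k||^2 over assigned points
            grad[k] = len(pts) * centers[k] - pts.sum(0)
    return centers - lr * grad, assign
```

Repeating this step drives each center toward the mean of its assigned points, which is the fixed point the full-data gradient clustering converges to.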
arXiv Detail & Related papers (2024-02-02T10:44:42Z)
- Learning Uniform Clusters on Hypersphere for Deep Graph-level Clustering [25.350054742471816]
We propose a novel deep graph-level clustering method called Uniform Deep Graph Clustering (UDGC).
UDGC assigns instances evenly to different clusters and then scatters those clusters on the unit hypersphere, leading to a more uniform cluster-level distribution and less cluster collapse.
Our empirical study on eight well-known datasets demonstrates that UDGC significantly outperforms the state-of-the-art models.
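The "scatter clusters on the unit hypersphere" idea above can be made concrete with a small sketch. This is not UDGC's loss (the function name is an assumption); it only shows how L2-normalizing cluster centers and measuring their mean pairwise cosine similarity quantifies uniformity versus collapse.

```python
# Hedged sketch of cluster spread on the unit hypersphere: lower mean pairwise
# cosine similarity between normalized centers means a more uniform spread;
# a value near 1 indicates cluster collapse. Name is illustrative, not UDGC's API.
import numpy as np

def spread_on_sphere(centers):
    """Mean pairwise cosine similarity of L2-normalized centers (lower = more uniform)."""
    c = centers / np.linalg.norm(centers, axis=1, keepdims=True)
    sim = c @ c.T
    k = len(c)
    return (sim.sum() - k) / (k * (k - 1))  # exclude the self-similarity diagonal

print(spread_on_sphere(np.eye(3)))                         # orthogonal centers: 0.0
print(spread_on_sphere(np.array([[2.0, 0.0], [1.0, 0.0]])))  # collapsed centers: 1.0
```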
arXiv Detail & Related papers (2023-11-23T12:08:20Z)
- Reinforcement Graph Clustering with Unknown Cluster Number [91.4861135742095]
We propose a new deep graph clustering method termed Reinforcement Graph Clustering.
In our proposed method, cluster number determination and unsupervised representation learning are unified into a single framework.
To provide feedback, a clustering-oriented reward function is proposed to enhance cohesion within clusters and separation between different clusters.
arXiv Detail & Related papers (2023-08-13T18:12:28Z)
- Dink-Net: Neural Clustering on Large Graphs [59.10189693120368]
A deep graph clustering method (Dink-Net) is proposed with the idea of dilation and shrink.
By discriminating whether nodes have been corrupted by augmentations, representations are learned in a self-supervised manner.
The clustering distribution is optimized by minimizing the proposed cluster dilation loss and cluster shrink loss.
Compared to the runner-up, Dink-Net achieves a 9.62% NMI improvement on the ogbn-papers100M dataset with 111 million nodes and 1.6 billion edges.
arXiv Detail & Related papers (2023-05-28T15:33:24Z)
- GLCC: A General Framework for Graph-level Clustering [5.069852282550117]
This paper studies the problem of graph-level clustering, which is a novel yet challenging task.
We propose a general graph-level clustering framework named Graph-Level Contrastive Clustering (GLCC).
Experiments on a range of well-known datasets demonstrate the superiority of our proposed GLCC over competitive baselines.
arXiv Detail & Related papers (2022-10-21T11:08:10Z)
- Graph Representation Learning via Contrasting Cluster Assignments [57.87743170674533]
We propose a novel unsupervised graph representation model by contrasting cluster assignments, called GRCCA.
It aims to make good use of local and global information by combining clustering algorithms with contrastive learning.
GRCCA has strong competitiveness in most tasks.
arXiv Detail & Related papers (2021-12-15T07:28:58Z)
- Self-supervised Contrastive Attributed Graph Clustering [110.52694943592974]
We propose a novel attributed graph clustering network, namely Self-supervised Contrastive Attributed Graph Clustering (SCAGC).
In SCAGC, by leveraging inaccurate clustering labels, a self-supervised contrastive loss is designed for node representation learning.
For out-of-sample (OOS) nodes, SCAGC can directly calculate their clustering labels.
arXiv Detail & Related papers (2021-10-15T03:25:28Z)
- Clustering Based on Graph of Density Topology [41.57592288113871]
We propose a novel clustering algorithm based on what we call the graph of density topology (GDT).
GDT achieves SOTA performance by far on almost all popular datasets and has a low time complexity of $O(n \log n)$.
arXiv Detail & Related papers (2020-09-24T11:40:24Z)
- Structured Graph Learning for Clustering and Semi-supervised Classification [74.35376212789132]
We propose a graph learning framework to preserve both the local and global structure of data.
Our method uses the self-expressiveness of samples to capture the global structure and adaptive neighbor approach to respect the local structure.
Our model is equivalent to a combination of kernel k-means and k-means methods under certain conditions.
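The self-expressiveness idea above admits a short sketch. This is a rough ridge-regularized variant (with the diagonal zeroed afterwards as a simplification), not the paper's exact model; the function name is an assumption.

```python
# Hedged sketch of sample self-expressiveness for global structure: represent
# each sample as a combination of the other samples, X ~= Z X, and read
# global (cluster) structure off the coefficient matrix Z.
import numpy as np

def self_expressive_coeffs(X, lam=0.1):
    """Z = argmin ||X - Z X||_F^2 + lam ||Z||_F^2, with the diagonal zeroed."""
    G = X @ X.T                                    # Gram matrix of samples
    Z = np.linalg.solve(G + lam * np.eye(len(X)), G)
    np.fill_diagonal(Z, 0.0)                       # a sample should not explain itself
    return Z
```

For well-separated clusters, the coefficients within a cluster are large while cross-cluster coefficients vanish, which is the global structure such a model is meant to preserve.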
arXiv Detail & Related papers (2020-08-31T08:41:20Z)
- Incorporating User's Preference into Attributed Graph Clustering [14.082520165369885]
We propose two quality measures for a local cluster: Graph Unimodality (GU) and Attribute Unimodality (AU).
The local cluster detected by LOCLU concentrates on the region of interest, provides efficient information flow in the graph and exhibits a unimodal data distribution in the subspace of the designated attributes.
arXiv Detail & Related papers (2020-03-24T19:07:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.