An adaptive granularity clustering method based on hyper-ball
- URL: http://arxiv.org/abs/2205.14592v1
- Date: Sun, 29 May 2022 07:44:09 GMT
- Title: An adaptive granularity clustering method based on hyper-ball
- Authors: Shu-yin Xia, Jiang Xie, Guo-yin Wang
- Abstract summary: Our method is based on the idea that data with a similar distribution form a hyper-ball and that adjacent hyper-balls form a cluster.
Following the cognitive law of "large scale first", the method identifies clusters regardless of their shape in a simple, non-parametric way.
- Score: 11.35322380857363
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The purpose of cluster analysis is to classify elements according to their
similarity. Its applications range from astronomy to bioinformatics and pattern
recognition. Our method is based on the idea that data with a similar distribution
form a hyper-ball and that adjacent hyper-balls form a cluster. Following the
cognitive law of "large scale first", the method identifies clusters regardless of
their shape in a simple, non-parametric way.
Experimental results on several datasets demonstrate the effectiveness of the
algorithm.
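The abstract describes the pipeline only at the level of intuition. As a rough, hedged sketch of how such a hyper-ball pipeline could be organised (not the authors' algorithm: the 2-means splitting rule, the mean-radius quality test, and the centre-distance adjacency test below are assumptions introduced purely for illustration), one might grow balls recursively and then merge adjacent ones:

```python
import numpy as np
from sklearn.cluster import KMeans


def split_into_balls(X, quality=0.95, min_size=4):
    """Recursively split the index set of X into hyper-balls via 2-means.

    The quality test (children must shrink the mean radius by a factor of
    `quality`) is an assumption made for this sketch, not a rule taken from
    the paper's abstract.
    """
    balls = []                               # each ball: (center, radius, indices)
    stack = [np.arange(len(X))]
    while stack:
        idx = stack.pop()
        pts = X[idx]
        center = pts.mean(axis=0)
        radius = np.linalg.norm(pts - center, axis=1).mean()
        if len(idx) <= min_size:
            balls.append((center, radius, idx))
            continue
        split = KMeans(n_clusters=2, n_init=5).fit_predict(pts)
        children = [idx[split == k] for k in (0, 1)]
        if not all(len(c) for c in children):
            balls.append((center, radius, idx))
            continue
        child_radius = np.mean([np.linalg.norm(X[c] - X[c].mean(axis=0), axis=1).mean()
                                for c in children])
        if child_radius < quality * radius:  # keep splitting while balls get tighter
            stack.extend(children)
        else:
            balls.append((center, radius, idx))
    return balls


def merge_adjacent_balls(balls):
    """Union-find merge: balls whose centres are closer than the sum of their
    radii are treated as adjacent and end up in the same cluster."""
    parent = list(range(len(balls)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(balls)):
        for j in range(i + 1, len(balls)):
            (ci, ri, _), (cj, rj, _) = balls[i], balls[j]
            if np.linalg.norm(ci - cj) <= ri + rj:
                parent[find(i)] = find(j)
    return [find(i) for i in range(len(balls))]


def hyperball_cluster(X):
    """Split into hyper-balls, merge adjacent balls, label the points."""
    X = np.asarray(X, dtype=float)
    balls = split_into_balls(X)
    roots = merge_adjacent_balls(balls)
    labels = np.empty(len(X), dtype=int)
    for ball_id, (_, _, idx) in enumerate(balls):
        labels[idx] = roots[ball_id]         # points inherit their ball's cluster id
    return labels
```

Calling `hyperball_cluster(X)` on an (n, d) array returns one label per point with no preset cluster count and no shape assumption, which is the property the abstract emphasises; consult the paper itself for the authors' actual splitting and merging criteria.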
Related papers
- Clustering Based on Density Propagation and Subcluster Merging [92.15924057172195]
We propose a density-based node clustering approach that automatically determines the number of clusters and can be applied in both data space and graph space.
Unlike traditional density-based clustering methods, which necessitate calculating the distance between any two nodes, our proposed technique determines density through a propagation process.
arXiv Detail & Related papers (2024-11-04T04:09:36Z)
- GBCT: An Efficient and Adaptive Granular-Ball Clustering Algorithm for Complex Data [49.56145012222276]
We propose granular-ball clustering (GBCT), a new clustering algorithm built on granular-ball computing.
GBCT forms clusters according to the relationships between granular-balls rather than between individual points.
Because granular-balls can fit various complex data, GBCT performs much better than traditional clustering methods on non-spherical data sets.
arXiv Detail & Related papers (2024-10-17T07:32:05Z)
- DECWA: Density-Based Clustering using Wasserstein Distance [1.4132765964347058]
We propose a new clustering algorithm based on spatial density and a probabilistic approach.
We show that our approach outperforms other state-of-the-art density-based clustering methods on a wide variety of datasets.
arXiv Detail & Related papers (2023-10-25T11:10:08Z)
- Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model [79.46465138631592]
We devise an efficient algorithm that recovers clusters using the observed labels.
We present Instance-Adaptive Clustering (IAC), the first algorithm whose performance matches instance-specific lower bounds both in expectation and with high probability.
arXiv Detail & Related papers (2023-06-18T08:46:06Z)
- Clustering Plotted Data by Image Segmentation [12.443102864446223]
Clustering algorithms are among the main analytical methods for detecting patterns in unlabeled data.
In this paper, we present a wholly different way of clustering points in 2-dimensional space, inspired by how humans cluster data.
Our approach, Visual Clustering, has several advantages over traditional clustering algorithms.
arXiv Detail & Related papers (2021-10-06T06:19:30Z)
- Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed.
We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z)
- Spectral Clustering with Smooth Tiny Clusters [14.483043753721256]
We propose a novel clustering algorithm, which considers the smoothness of data for the first time.
Our key idea is to cluster tiny clusters, whose centers constitute smooth graphs.
Although this paper focuses solely on multi-scale situations, the idea of data smoothness can be extended to any clustering algorithm.
arXiv Detail & Related papers (2020-09-10T05:21:20Z)
- LSD-C: Linearly Separable Deep Clusters [145.89790963544314]
We present LSD-C, a novel method to identify clusters in an unlabeled dataset.
Our method draws on recent semi-supervised learning practice, combining the clustering algorithm with self-supervised pretraining and strong data augmentation.
We show that our approach significantly outperforms competitors on popular public image benchmarks including CIFAR 10/100, STL 10 and MNIST, as well as the document classification dataset Reuters 10K.
arXiv Detail & Related papers (2020-06-17T17:58:10Z)
- Clustering by Constructing Hyper-Planes [0.0]
We present a clustering algorithm that finds hyper-planes to separate data points.
It relies on the marginal space between points to determine the centers and the number of clusters.
Because the algorithm is based on linear structures, it can approximate the distribution of datasets accurately and flexibly.
arXiv Detail & Related papers (2020-04-25T08:52:21Z)
- Conjoined Dirichlet Process [63.89763375457853]
We develop a novel, non-parametric probabilistic biclustering method based on Dirichlet processes to identify biclusters with strong co-occurrence in both rows and columns.
We apply our method to two different applications, text mining and gene expression analysis, and demonstrate that our method improves bicluster extraction in many settings compared to existing approaches.
arXiv Detail & Related papers (2020-02-08T19:41:23Z)
- Coarse-Grain Cluster Analysis of Tensors with Application to Climate Biome Identification [0.27998963147546146]
We use the discrete wavelet transform to analyze the effects of coarse-graining on clustering tensor data.
We are particularly interested in understanding how scale affects clustering of the Earth's climate system.
arXiv Detail & Related papers (2020-01-22T00:28:58Z)