VDPC: Variational Density Peak Clustering Algorithm
- URL: http://arxiv.org/abs/2201.00641v1
- Date: Wed, 29 Dec 2021 12:50:09 GMT
- Title: VDPC: Variational Density Peak Clustering Algorithm
- Authors: Yizhang Wang, Di Wang, You Zhou, Xiaofeng Zhang, Chai Quek
- Abstract summary: We propose a variational density peak clustering (VDPC) algorithm to identify clusters with variational density.
VDPC outperforms two classical algorithms (i.e., DPC and DBSCAN) and four state-of-the-art extended DPC algorithms.
- Score: 16.20037014662979
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The widely applied density peak clustering (DPC) algorithm makes an intuitive
cluster formation assumption that cluster centers are often surrounded by data
points with lower local density and far away from other data points with higher
local density. However, this assumption has one limitation: clusters with lower
density are hard to identify because they are easily merged into other
clusters with higher density. As a result,
DPC may not be able to identify clusters with variational density. To address
this issue, we propose a variational density peak clustering (VDPC) algorithm,
which is designed to systematically and autonomously perform the clustering
task on datasets with various types of density distributions. Specifically, we
first propose a novel method to identify the representatives among all data
points and construct initial clusters based on the identified representatives
for further analysis of the clusters' property. Furthermore, we divide all data
points into different levels according to their local density and propose a
unified clustering framework by combining the advantages of both DPC and
DBSCAN. Thus, all the identified initial clusters spreading across different
density levels are systematically processed to form the final clusters. To
evaluate the effectiveness of the proposed VDPC algorithm, we conduct extensive
experiments using 20 datasets including eight synthetic, six real-world and six
image datasets. The experimental results show that VDPC outperforms two
classical algorithms (i.e., DPC and DBSCAN) and four state-of-the-art extended
DPC algorithms.
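The DPC assumption described in the abstract can be illustrated with a small sketch. The code below is not the VDPC method; it only computes the two quantities used by classical DPC, the local density (here called `rho`) and the distance to the nearest point of higher density (here called `delta`). The cutoff `d_c`, the function names, the toy points, and the choice of ranking candidates by `rho * delta` are assumptions made for illustration.

```python
import math


def dpc_quantities(points, d_c):
    """Return (rho, delta) per point, using a cutoff-kernel local density."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # rho[i]: number of other points closer than the cutoff distance d_c
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]
    delta = []
    for i in range(n):
        # distances to all points with strictly higher local density
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        # a point with no denser point gets its largest distance by convention
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta


def pick_centers(rho, delta, k):
    """Indices of the k points with the largest rho * delta (a common heuristic)."""
    order = sorted(range(len(rho)), key=lambda i: rho[i] * delta[i], reverse=True)
    return sorted(order[:k])


# Toy data (hypothetical): a tight group of five points and a loose group of three.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (-0.1, 0.0), (0.0, -0.1),
          (5.0, 5.0), (5.5, 5.0), (5.0, 5.5)]
rho, delta = dpc_quantities(points, d_c=0.15)
centers = pick_centers(rho, delta, k=2)
print(rho)
print(centers)
```

With this cutoff, every point in the loose group has a local density of zero, so ranking by `rho * delta` selects both candidate centers from the tight group. This mirrors the limitation the abstract describes: a lower-density cluster can go undetected and be absorbed by a higher-density one.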
Related papers
- Clustering Based on Density Propagation and Subcluster Merging [92.15924057172195]
We propose a density-based node clustering approach that automatically determines the number of clusters and can be applied in both data space and graph space.
Unlike traditional density-based clustering methods, which necessitate calculating the distance between any two nodes, our proposed technique determines density through a propagation process.
arXiv Detail & Related papers (2024-11-04T04:09:36Z)
- SHADE: Deep Density-based Clustering [13.629470968274]
SHADE is the first deep clustering algorithm that incorporates density-connectivity into its loss function.
It supports high-dimensional and large data sets with the expressive power of a deep autoencoder.
It outperforms existing methods in clustering quality, especially on data that contain non-Gaussian clusters.
arXiv Detail & Related papers (2024-10-08T18:03:35Z)
- DECWA: Density-Based Clustering using Wasserstein Distance [1.4132765964347058]
We propose a new clustering algorithm based on spatial density and a probabilistic approach.
We show that our approach outperforms other state-of-the-art density-based clustering methods on a wide variety of datasets.
arXiv Detail & Related papers (2023-10-25T11:10:08Z)
- GFDC: A Granule Fusion Density-Based Clustering with Evidential Reasoning [22.526274021556755]
Density-based clustering algorithms are widely applied because they can detect clusters with arbitrary shapes.
This paper proposes a granule fusion density-based clustering with evidential reasoning (GFDC).
Both local and global densities of samples are measured by a sparse degree metric first.
Then information granules are generated in high-density and low-density regions, assisting in processing clusters with significant density differences.
arXiv Detail & Related papers (2023-05-20T06:27:31Z)
- Differentially Private Federated Clustering over Non-IID Data [59.611244450530315]
The federated clustering (FedC) problem aims to accurately partition unlabeled data samples distributed over massive clients into a finite number of clusters under the orchestration of a server.
We propose a novel FedC algorithm using a differential privacy convergence technique, referred to as DP-Fed, in which partial participation and multiple clients are also considered.
Various attributes of the proposed DP-Fed are obtained through theoretical analyses of privacy protection, especially for the case of non-identically and independently distributed (non-i.i.d.) data.
arXiv Detail & Related papers (2023-01-03T05:38:43Z)
- An Improved Probability Propagation Algorithm for Density Peak Clustering Based on Natural Nearest Neighborhood [0.0]
Clustering by fast search and find of density peaks (DPC) has been proven to be a promising clustering approach.
This paper presents an improved probability propagation algorithm for density peak clustering based on the natural nearest neighborhood (DPC-PPNNN).
In experiments on several datasets, DPC-PPNNN is shown to outperform DPC, K-means and DBSCAN.
arXiv Detail & Related papers (2022-07-04T03:36:57Z)
- DeepCluE: Enhanced Image Clustering via Multi-layer Ensembles in Deep Neural Networks [53.88811980967342]
This paper presents a Deep Clustering via Ensembles (DeepCluE) approach.
It bridges the gap between deep clustering and ensemble clustering by harnessing the power of multiple layers in deep neural networks.
Experimental results on six image datasets confirm the advantages of DeepCluE over the state-of-the-art deep clustering approaches.
arXiv Detail & Related papers (2022-06-01T09:51:38Z)
- Density-Based Clustering with Kernel Diffusion [59.4179549482505]
A naive density corresponding to the indicator function of a unit $d$-dimensional Euclidean ball is commonly used in density-based clustering algorithms.
We propose a new kernel diffusion density function, which is adaptive to data of varying local distributional characteristics and smoothness.
arXiv Detail & Related papers (2021-10-11T09:00:33Z)
- Determinantal consensus clustering [77.34726150561087]
We propose the use of determinantal point processes (DPPs) for the random restart of clustering algorithms.
DPPs favor diversity of the center points within subsets.
We show through simulations that, contrary to DPPs, uniform random sampling of center points fails both to ensure diversity and to obtain good coverage of all data facets.
arXiv Detail & Related papers (2021-02-07T23:48:24Z)
- Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed.
We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z)