An Improved Probability Propagation Algorithm for Density Peak
Clustering Based on Natural Nearest Neighborhood
- URL: http://arxiv.org/abs/2207.01178v1
- Date: Mon, 4 Jul 2022 03:36:57 GMT
- Title: An Improved Probability Propagation Algorithm for Density Peak
Clustering Based on Natural Nearest Neighborhood
- Authors: Wendi Zuo, Xinmin Hou
- Abstract summary: Clustering by fast search and find of density peaks (DPC) has been proven to be a promising clustering approach.
This paper presents an improved probability propagation algorithm for density peak clustering based on the natural nearest neighborhood (DPC-PPNNN)
In experiments on several datasets, DPC-PPNNN is shown to outperform DPC, K-means and DBSCAN.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Clustering by fast search and find of density peaks (DPC) (Science, 2014) has
been proven to be a promising clustering approach that efficiently discovers
the centers of clusters by finding the density peaks. The accuracy of DPC
depends on the cutoff distance ($d_c$), the cluster number ($k$) and the
selection of the centers of clusters. Moreover, the final allocation strategy
is sensitive and has poor fault tolerance. The shortcomings above make the
algorithm sensitive to parameters and only applicable for some specific
datasets. To overcome the limitations of DPC, this paper presents an improved
probability propagation algorithm for density peak clustering based on the
natural nearest neighborhood (DPC-PPNNN). By introducing the idea of natural
nearest neighborhood and probability propagation, DPC-PPNNN realizes the
nonparametric clustering process and makes the algorithm applicable for more
complex datasets. In experiments on several datasets, DPC-PPNNN is shown to
outperform DPC, K-means and DBSCAN.
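As background for the abstract above, the two decision-graph quantities at the heart of the original DPC (the local density rho_i within the cutoff d_c, and delta_i, the distance to the nearest higher-density point) can be sketched as follows. This is a minimal illustration of plain DPC, not of the DPC-PPNNN extension; function and variable names are ours.

```python
import numpy as np

def dpc_decision_quantities(X, d_c):
    """Compute DPC's local density rho and separation delta for each point.

    rho_i: number of points within cutoff distance d_c of point i.
    delta_i: distance from i to the nearest point with higher density
             (for the global density peak: distance to the farthest point).
    Cluster centers are points with both large rho and large delta.
    """
    # Pairwise Euclidean distances.
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))

    # Local density: neighbors strictly within the cutoff, excluding self.
    rho = (D < d_c).sum(axis=1) - 1

    n = len(X)
    delta = np.empty(n)
    for i in range(n):
        higher = np.where(rho > rho[i])[0]
        if len(higher) == 0:          # i is a global density peak
            delta[i] = D[i].max()
        else:
            delta[i] = D[i, higher].min()
    return rho, delta
```

Points scoring high on both quantities (often ranked by the product rho * delta) are the center candidates; the abstract's point is that picking d_c, the number of centers k, and the centers themselves by hand is what makes DPC parameter-sensitive.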
Related papers
- Adaptive $k$-nearest neighbor classifier based on the local estimation of the shape operator [49.87315310656657]
We introduce a new adaptive $k$-nearest neighbours ($kK$-NN) algorithm that explores the local curvature at a sample to adaptively define the neighborhood size.
Results on many real-world datasets indicate that the new $kK$-NN algorithm yields superior balanced accuracy compared to the established $k$-NN method.
arXiv Detail & Related papers (2024-09-08T13:08:45Z)
- Scalable Density-based Clustering with Random Projections [9.028773906859541]
We present sDBSCAN, a scalable density-based clustering algorithm in high dimensions with cosine distance.
Empirically, sDBSCAN is significantly faster and provides higher accuracy than many other clustering algorithms on real-world million-point data sets.
arXiv Detail & Related papers (2024-02-24T01:45:51Z)
- DenMune: Density peak based clustering using mutual nearest neighbors [0.0]
Many clustering algorithms fail when clusters have arbitrary shapes or varying densities, or when the data classes are unbalanced and close to each other.
A novel clustering algorithm, DenMune is presented to meet this challenge.
It is based on identifying dense regions using mutual nearest neighborhoods of size K, where K is the only parameter required from the user.
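The mutual-neighborhood relation that DenMune builds on can be sketched directly: j is a mutual neighbor of i only if each point is among the other's K nearest neighbors. A minimal illustration (our own naming, not the paper's code):

```python
import numpy as np

def mutual_knn_pairs(X, k):
    """Mutual K-nearest-neighbor relation used by DenMune-style methods.

    mutual[i, j] is True iff j is among i's K nearest neighbors AND
    i is among j's K nearest neighbors. Points with many mutual
    neighbors lie in dense regions; points with few sit on borders.
    """
    D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1))
    np.fill_diagonal(D, np.inf)          # exclude self from neighbor lists
    knn = np.argsort(D, axis=1)[:, :k]   # indices of each point's K nearest
    is_knn = np.zeros(D.shape, dtype=bool)
    rows = np.repeat(np.arange(len(X)), k)
    is_knn[rows, knn.ravel()] = True
    mutual = is_knn & is_knn.T           # symmetric mutual-neighbor mask
    return mutual
```

The asymmetry of plain k-NN is what the mutual condition removes: an outlier may list a cluster point as its neighbor, but the cluster point rarely lists the outlier back, so the outlier ends up with few or no mutual neighbors.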
arXiv Detail & Related papers (2023-09-23T16:18:00Z)
- Rethinking k-means from manifold learning perspective [122.38667613245151]
We present a new clustering algorithm which directly detects clusters of data without mean estimation.
Specifically, we construct the distance matrix between data points using a Butterworth filter.
To well exploit the complementary information embedded in different views, we leverage the tensor Schatten p-norm regularization.
arXiv Detail & Related papers (2023-05-12T03:01:41Z)
- A density peaks clustering algorithm with sparse search and K-d tree [16.141611031128427]
A density peaks clustering algorithm with sparse search and a K-d tree is developed to reduce the high computational cost of the original DPC.
Experiments are carried out on datasets with different distribution characteristics, comparing with five other typical clustering algorithms.
arXiv Detail & Related papers (2022-03-02T09:29:40Z)
- Escaping Saddle Points with Bias-Variance Reduced Local Perturbed SGD for Communication Efficient Nonconvex Distributed Learning [58.79085525115987]
Local methods are one of the promising approaches to reduce communication time.
We show that the communication complexity is better than that of non-local methods when the heterogeneity of the local datasets is smaller than the smoothness of the local loss.
arXiv Detail & Related papers (2022-02-12T15:12:17Z)
- VDPC: Variational Density Peak Clustering Algorithm [16.20037014662979]
We propose a variational density peak clustering (VDPC) algorithm to identify clusters with variational density.
VDPC outperforms two classical algorithms (i.e., DPC and DBSCAN) and four state-of-the-art extended DPC algorithms.
arXiv Detail & Related papers (2021-12-29T12:50:09Z)
- Density-Based Clustering with Kernel Diffusion [59.4179549482505]
A naive density corresponding to the indicator function of a unit $d$-dimensional Euclidean ball is commonly used in density-based clustering algorithms.
We propose a new kernel diffusion density function, which is adaptive to data of varying local distributional characteristics and smoothness.
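The contrast described above can be made concrete: the naive ball-indicator density is a hard count that jumps when a neighbor crosses the ball boundary, while a kernel density varies smoothly. The Gaussian kernel below is only an illustrative stand-in, not the paper's adaptive kernel diffusion density:

```python
import numpy as np

def ball_density(X, r):
    """Naive density: count of other points inside a radius-r Euclidean ball
    (the indicator-function density the abstract criticizes)."""
    D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1))
    return (D <= r).sum(axis=1) - 1      # exclude the point itself

def gaussian_kernel_density(X, h):
    """Smooth alternative (illustrative stand-in only): each point
    contributes exp(-d^2 / (2 h^2)), so the density changes continuously
    instead of jumping at the ball boundary."""
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    K = np.exp(-D2 / (2 * h ** 2))
    return K.sum(axis=1) - 1.0           # subtract the self-contribution
```

The paper's kernel goes further by adapting to local distributional characteristics, which a fixed bandwidth h does not do.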
arXiv Detail & Related papers (2021-10-11T09:00:33Z)
- Fast Density Estimation for Density-based Clustering Methods [3.8972699157287702]
Density-based clustering algorithms are widely used for discovering clusters in pattern recognition and machine learning.
The performance of density-based algorithms is dominated by finding neighbors and calculating the density of each point, which is time-consuming.
This paper proposes a density-based clustering framework using fast principal component analysis, which can be applied to density-based methods to prune unnecessary distance calculations.
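The pruning idea rests on a simple geometric fact: distances in an orthonormal projection (such as onto leading principal components) never exceed the full Euclidean distance, so a pair whose projected distance already exceeds the radius can be skipped. A minimal sketch under that assumption (our own construction, not the paper's implementation):

```python
import numpy as np

def neighbors_with_pca_pruning(X, eps, n_components=1):
    """Range search with a PCA-based lower bound (illustrative sketch).

    Because the principal directions are orthonormal, the distance between
    two points in the projection is a lower bound on their full Euclidean
    distance; pairs whose projected distance exceeds eps are pruned
    without computing the full distance.
    """
    Xc = X - X.mean(axis=0)
    # Principal directions via SVD of the centered data matrix.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Xc @ Vt[:n_components].T          # projected coordinates
    n = len(X)
    pairs = []
    for i in range(n):
        for j in range(i + 1, n):
            # Lower bound on ||X[i] - X[j]|| from the projection.
            if np.linalg.norm(P[i] - P[j]) > eps:
                continue                   # pruned: cannot be within eps
            if np.linalg.norm(X[i] - X[j]) <= eps:
                pairs.append((i, j))
    return pairs
```

The pruning is exact (no within-eps pair is ever discarded); the speedup comes from replacing most full d-dimensional distance computations with cheap low-dimensional ones.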
arXiv Detail & Related papers (2021-09-23T13:59:42Z)
- Determinantal consensus clustering [77.34726150561087]
We propose the use of determinantal point processes or DPP for the random restart of clustering algorithms.
DPPs favor diversity of the center points within subsets.
We show through simulations that, contrary to DPP, uniform random selection of center points fails both to ensure diversity and to obtain a good coverage of all data facets.
arXiv Detail & Related papers (2021-02-07T23:48:24Z)
- Improving Generative Adversarial Networks with Local Coordinate Coding [150.24880482480455]
Generative adversarial networks (GANs) have shown remarkable success in generating realistic data from some predefined prior distribution.
In practice, semantic information might be represented by some latent distribution learned from data.
We propose an LCCGAN model with local coordinate coding (LCC) to improve the performance of generating data.
arXiv Detail & Related papers (2020-07-28T09:17:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.