Related papers: Fast Density Estimation for Density-based Clustering Methods

Fast Density Estimation for Density-based Clustering Methods

URL: http://arxiv.org/abs/2109.11383v1
Date: Thu, 23 Sep 2021 13:59:42 GMT
Title: Fast Density Estimation for Density-based Clustering Methods
Authors: Difei Cheng, Ruihang Xu, Bo Zhang
Abstract summary: Density-based clustering algorithms are widely used for discovering clusters in pattern recognition and machine learning. The robustness of density-based algorithms is heavily dominated by finding neighbors and calculating the density of each point which is time-consuming. This paper proposes a density-based clustering framework by using the fast principal component analysis, which can be applied to density based methods to prune unnecessary distance calculations.
Score: 3.8972699157287702
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Density-based clustering algorithms are widely used for discovering clusters in pattern recognition and machine learning since they can deal with non-hyperspherical clusters and are robustness to handle outliers. However, the runtime of density-based algorithms is heavily dominated by finding neighbors and calculating the density of each point which is time-consuming. To address this issue, this paper proposes a density-based clustering framework by using the fast principal component analysis, which can be applied to density based methods to prune unnecessary distance calculations when finding neighbors and estimating densities. By applying this clustering framework to the Density Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm, an improved DBSCAN (called IDBSCAN) is obtained, which preserves the advantage of DBSCAN and meanwhile, greatly reduces the computation of redundant distances. Experiments on five benchmark datasets demonstrate that the proposed IDBSCAN algorithm improves the computational efficiency significantly.

Related papers

Almost Linear Time Consistent Mode Estimation and Quick Shift Clustering [0.0]
We propose a method for density-based clustering in high-dimensional spaces that combines Locality-Sensitive Hashing (LSH) with the Quick Shift algorithm. The proposed approach achieves almost linear time complexity while preserving the consistency of density-based clustering.
arXiv Detail & Related papers (2025-03-11T02:51:31Z)
Scalable Density-based Clustering with Random Projections [9.028773906859541]
We present sDBSCAN, a scalable density-based clustering algorithm in high dimensions with cosine distance. Empirically, sDBSCAN is significantly faster and provides higher accuracy than many other clustering algorithms on real-world million-point data sets.
arXiv Detail & Related papers (2024-02-24T01:45:51Z)
Rethinking k-means from manifold learning perspective [122.38667613245151]
We present a new clustering algorithm which directly detects clusters of data without mean estimation. Specifically, we construct distance matrix between data points by Butterworth filter. To well exploit the complementary information embedded in different views, we leverage the tensor Schatten p-norm regularization.
arXiv Detail & Related papers (2023-05-12T03:01:41Z)
An Improved Probability Propagation Algorithm for Density Peak Clustering Based on Natural Nearest Neighborhood [0.0]
Clustering by fast search and find of density peaks (DPC) has been proven to be a promising clustering approach. This paper presents an improved probability propagation algorithm for density peak clustering based on the natural nearest neighborhood (DPC-PPNNN) In experiments on several datasets, DPC-PPNNN is shown to outperform DPC, K-means and DBSCAN.
arXiv Detail & Related papers (2022-07-04T03:36:57Z)
A density peaks clustering algorithm with sparse search and K-d tree [16.141611031128427]
Density peaks clustering algorithm with sparse search and K-d tree is developed to solve this problem. Experiments are carried out on datasets with different distribution characteristics, by comparing with other five typical clustering algorithms.
arXiv Detail & Related papers (2022-03-02T09:29:40Z)
Density-Based Clustering with Kernel Diffusion [59.4179549482505]
A naive density corresponding to the indicator function of a unit $d$-dimensional Euclidean ball is commonly used in density-based clustering algorithms. We propose a new kernel diffusion density function, which is adaptive to data of varying local distributional characteristics and smoothness.
arXiv Detail & Related papers (2021-10-11T09:00:33Z)
Determinantal consensus clustering [77.34726150561087]
We propose the use of determinantal point processes or DPP for the random restart of clustering algorithms. DPPs favor diversity of the center points within subsets. We show through simulations that, contrary to DPP, this technique fails both to ensure diversity, and to obtain a good coverage of all data facets.
arXiv Detail & Related papers (2021-02-07T23:48:24Z)
(k, l)-Medians Clustering of Trajectories Using Continuous Dynamic Time Warping [57.316437798033974]
In this work we consider the problem of center-based clustering of trajectories. We propose the usage of a continuous version of DTW as distance measure, which we call continuous dynamic time warping (CDTW) We show a practical way to compute a center from a set of trajectories and subsequently iteratively improve it.
arXiv Detail & Related papers (2020-12-01T13:17:27Z)
Differentially Private Clustering: Tight Approximation Ratios [57.89473217052714]
We give efficient differentially private algorithms for basic clustering problems. Our results imply an improved algorithm for the Sample and Aggregate privacy framework. One of the tools used in our 1-Cluster algorithm can be employed to get a faster quantum algorithm for ClosestPair in a moderate number of dimensions.
arXiv Detail & Related papers (2020-08-18T16:22:06Z)
Non-Local Spatial Propagation Network for Depth Completion [82.60915972250706]
We propose a robust and efficient end-to-end non-local spatial propagation network for depth completion. The proposed network takes RGB and sparse depth images as inputs and estimates non-local neighbors and their affinities of each pixel. We show that the proposed algorithm is superior to conventional algorithms in terms of depth completion accuracy and robustness to the mixed-depth problem.
arXiv Detail & Related papers (2020-07-20T12:26:51Z)
SDCOR: Scalable Density-based Clustering for Local Outlier Detection in Massive-Scale Datasets [0.0]
This paper presents a batch-wise density-based clustering approach for local outlier detection in massive-scale datasets. Evaluations on real-life and synthetic datasets demonstrate that the proposed method has a low linear time complexity.
arXiv Detail & Related papers (2020-06-13T11:07:37Z)

This list is automatically generated from the titles and abstracts of the papers in this site.