Related papers: Performance evaluation results of evolutionary clustering algorithm star for clustering heterogeneous datasets

URL: http://arxiv.org/abs/2105.02810v1
Date: Fri, 30 Apr 2021 08:17:19 GMT
Title: Performance evaluation results of evolutionary clustering algorithm star for clustering heterogeneous datasets
Authors: Bryar A. Hassan, Tarik A. Rashid, Seyedali Mirjalili
Abstract summary: This article presents the data used to evaluate the performance of evolutionary clustering algorithm star (ECA*). Two experimental methods are employed to examine the performance of ECA* against five traditional and modern clustering algorithms.
Score: 15.154538450706474
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: This article presents the data used to evaluate the performance of evolutionary clustering algorithm star (ECA*) compared to five traditional and modern clustering algorithms. Two experimental methods are employed to examine the performance of ECA* against genetic algorithm for clustering++ (GENCLUST++), learning vector quantisation (LVQ), expectation maximisation (EM), K-means++ (KM++) and K-means (KM). These algorithms are applied to 32 heterogeneous and multi-featured datasets to determine which one performs well on the three tests. First, the paper examines the efficiency of ECA* against its counterpart algorithms using clustering evaluation measures. These validation criteria are objective function and cluster quality measures. Second, it suggests a performance rating framework to measure the performance sensitivity of these algorithms to various dataset features (cluster dimensionality, number of clusters, cluster overlap, cluster shape and cluster structure). The contributions of these experiments are twofold: (i) ECA* exceeds its counterpart algorithms in its ability to find the right cluster number; (ii) ECA* is less sensitive to dataset features than its competing techniques. Nonetheless, the results of the experiments demonstrate some limitations of ECA*: (i) ECA* is not fully applied based on the premise that no prior knowledge exists; (ii) adapting and utilising ECA* in several real applications has not been achieved yet.

Related papers

K*-Means: A Parameter-free Clustering Algorithm [55.20132267309382]
k*-means is a novel clustering algorithm that eliminates the need to set k or any other parameters. It uses the minimum description length principle to automatically determine the optimal number of clusters, k*, by splitting and merging clusters. We prove that k*-means is guaranteed to converge and demonstrate experimentally that it significantly outperforms existing methods in scenarios where k is unknown.
arXiv Detail & Related papers (2025-05-17T08:41:07Z)
Estimating the Optimal Number of Clusters in Categorical Data Clustering by Silhouette Coefficient [0.5939858158928473]
This paper proposes an algorithm named k-SCC to estimate the optimal k in categorical data clustering. Comparative experiments were conducted on both synthetic and real datasets to compare the performance of k-SCC.
arXiv Detail & Related papers (2025-01-26T14:29:11Z)
Self-Supervised Graph Embedding Clustering [70.36328717683297]
The K-means one-step dimensionality reduction clustering method has made some progress in addressing the curse of dimensionality in clustering tasks. We propose a unified framework that integrates manifold learning with K-means, resulting in the self-supervised graph embedding framework.
arXiv Detail & Related papers (2024-09-24T08:59:51Z)
From A-to-Z Review of Clustering Validation Indices [4.08908337437878]
We review and evaluate the performance of internal and external clustering validation indices on the most common clustering algorithms. We suggest a classification framework for examining the functionality of both internal and external clustering validation measures.
arXiv Detail & Related papers (2024-07-18T13:52:02Z)
A3S: A General Active Clustering Method with Pairwise Constraints [66.74627463101837]
A3S features strategic active clustering adjustment on the initial cluster result, which is obtained by an adaptive clustering algorithm. In extensive experiments across diverse real-world datasets, A3S achieves desired results with significantly fewer human queries.
arXiv Detail & Related papers (2024-07-14T13:37:03Z)
GCC: Generative Calibration Clustering [55.44944397168619]
We propose a novel Generative Calibration Clustering (GCC) method to incorporate feature learning and augmentation into the clustering procedure. First, we develop a discriminative feature alignment mechanism to discover the intrinsic relationship across real and generated samples. Second, we design a self-supervised metric learning to generate more reliable cluster assignments.
arXiv Detail & Related papers (2024-04-14T01:51:11Z)
Fuzzy K-Means Clustering without Cluster Centroids [21.256564324236333]
Fuzzy K-Means clustering is a critical technique in unsupervised data analysis. This paper proposes a novel Fuzzy K-Means clustering algorithm that entirely eliminates the reliance on cluster centroids.
arXiv Detail & Related papers (2024-04-07T12:25:03Z)
Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model [79.46465138631592]
We devise an efficient algorithm that recovers clusters using the observed labels. We present Instance-Adaptive Clustering (IAC), the first algorithm whose performance matches these lower bounds both in expectation and with high probability.
arXiv Detail & Related papers (2023-06-18T08:46:06Z)
Rethinking k-means from manifold learning perspective [122.38667613245151]
We present a new clustering algorithm which directly detects clusters of data without mean estimation. Specifically, we construct a distance matrix between data points using a Butterworth filter. To better exploit the complementary information embedded in different views, we leverage the tensor Schatten p-norm regularization.
arXiv Detail & Related papers (2023-05-12T03:01:41Z)
A Novel Cluster Detection of COVID-19 Patients and Medical Disease Conditions Using Improved Evolutionary Clustering Algorithm Star [0.9990687944474739]
We improve the current evolutionary clustering algorithm star (ECA*), called iECA*, in three ways. Experiments were conducted to examine the performance of iECA* against state-of-the-art algorithms.
arXiv Detail & Related papers (2021-09-20T12:47:09Z)
HAWKS: Evolving Challenging Benchmark Sets for Cluster Analysis [2.5329716878122404]
Comprehensive benchmarking of clustering algorithms is difficult. There is no consensus regarding the best practice for rigorous benchmarking. We demonstrate the important role evolutionary algorithms play to support flexible generation of such benchmarks.
arXiv Detail & Related papers (2021-02-13T15:01:34Z)
A Multi-disciplinary Ensemble Algorithm for Clustering Heterogeneous Datasets [0.76146285961466]
We propose a new evolutionary clustering algorithm (ECAStar) based on social class ranking and meta-heuristic algorithms. ECAStar is integrated with recombinational evolutionary operators, Levy flight optimisation, and some statistical techniques. Experiments are conducted to evaluate the ECAStar against five conventional approaches.
arXiv Detail & Related papers (2021-01-01T07:20:50Z)
Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed. We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z)
Clustering Binary Data by Application of Combinatorial Optimization Heuristics [52.77024349608834]
We study clustering methods for binary data, first defining aggregation criteria that measure the compactness of clusters. Five new and original methods are introduced, using neighborhoods and population behavior optimization metaheuristics. From a set of 16 data tables generated by a quasi-Monte Carlo experiment, a comparison is performed for one of the aggregations using L1 dissimilarity, with hierarchical clustering, and a version of k-means: partitioning around medoids or PAM.
arXiv Detail & Related papers (2020-01-06T23:33:31Z)

This list is automatically generated from the titles and abstracts of the papers in this site.