Related papers: Hybrid Fuzzy-Crisp Clustering Algorithm: Theory and Experiments

Hybrid Fuzzy-Crisp Clustering Algorithm: Theory and Experiments

URL: http://arxiv.org/abs/2303.14366v1
Date: Sat, 25 Mar 2023 05:27:26 GMT
Title: Hybrid Fuzzy-Crisp Clustering Algorithm: Theory and Experiments
Authors: Akira R. Kinjo and Daphne Teck Ching Lai
Abstract summary: We propose a hybrid fuzzy-crisp clustering algorithm based on a target function combining linear and quadratic terms of the membership function. In this algorithm, the membership of a data point to a cluster is automatically set to exactly zero if the data point is sufficiently'' far from the cluster center. The proposed algorithm is demonstrated to outperform the conventional methods on imbalanced datasets and can be competitive on more balanced datasets.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: With the membership function being strictly positive, the conventional fuzzy c-means clustering method sometimes causes imbalanced influence when clusters of vastly different sizes exist. That is, an outstandingly large cluster drags to its center all the other clusters, however far they are separated. To solve this problem, we propose a hybrid fuzzy-crisp clustering algorithm based on a target function combining linear and quadratic terms of the membership function. In this algorithm, the membership of a data point to a cluster is automatically set to exactly zero if the data point is ``sufficiently'' far from the cluster center. In this paper, we present a new algorithm for hybrid fuzzy-crisp clustering along with its geometric interpretation. The algorithm is tested on twenty simulated data generated and five real-world datasets from the UCI repository and compared with conventional fuzzy and crisp clustering methods. The proposed algorithm is demonstrated to outperform the conventional methods on imbalanced datasets and can be competitive on more balanced datasets.

Related papers

K*-Means: A Parameter-free Clustering Algorithm [55.20132267309382]
k*-means is a novel clustering algorithm that eliminates the need to set k or any other parameters.<n>It uses the minimum description length principle to automatically determine the optimal number of clusters, k*, by splitting and merging clusters.<n>We prove that k*-means is guaranteed to converge and demonstrate experimentally that it significantly outperforms existing methods in scenarios where k is unknown.
arXiv Detail & Related papers (2025-05-17T08:41:07Z)
GBCT: An Efficient and Adaptive Granular-Ball Clustering Algorithm for Complex Data [49.56145012222276]
We propose a new clustering algorithm called granular-ball clustering (GBCT) via granular-ball computing. GBCT forms clusters according to the relationship between granular-balls, instead of the traditional point relationship. As granular-balls can fit various complex data, GBCT performs much better in non-spherical data sets than other traditional clustering methods.
arXiv Detail & Related papers (2024-10-17T07:32:05Z)
Self-Supervised Graph Embedding Clustering [70.36328717683297]
K-means one-step dimensionality reduction clustering method has made some progress in addressing the curse of dimensionality in clustering tasks. We propose a unified framework that integrates manifold learning with K-means, resulting in the self-supervised graph embedding framework.
arXiv Detail & Related papers (2024-09-24T08:59:51Z)
Fuzzy K-Means Clustering without Cluster Centroids [21.256564324236333]
Fuzzy K-Means clustering is a critical technique in unsupervised data analysis. This paper proposes a novel Fuzzy textitK-Means clustering algorithm that entirely eliminates the reliance on cluster centroids.
arXiv Detail & Related papers (2024-04-07T12:25:03Z)
Linear time Evidence Accumulation Clustering with KMeans [0.0]
This work describes a trick which mimic the behavior of average linkage clustering. We found a way of computing efficiently the density of a partitioning, reducing the cost from a quadratic to linear complexity. The k-means results are comparable to the best state of the art in terms of NMI while keeping the computational cost low.
arXiv Detail & Related papers (2023-11-15T14:12:59Z)
Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model [79.46465138631592]
We devise an efficient algorithm that recovers clusters using the observed labels. We present Instance-Adaptive Clustering (IAC), the first algorithm whose performance matches these lower bounds both in expectation and with high probability.
arXiv Detail & Related papers (2023-06-18T08:46:06Z)
Rethinking k-means from manifold learning perspective [122.38667613245151]
We present a new clustering algorithm which directly detects clusters of data without mean estimation. Specifically, we construct distance matrix between data points by Butterworth filter. To well exploit the complementary information embedded in different views, we leverage the tensor Schatten p-norm regularization.
arXiv Detail & Related papers (2023-05-12T03:01:41Z)
Hybridization of K-means with improved firefly algorithm for automatic clustering in high dimension [0.0]
We have implemented the Silhouette and Elbow methods with PCA to find an optimal number of clusters. In the Firefly algorithm, the entire population is automatically subdivided into sub-populations that decrease the convergence rate speed and trapping to local minima. Our study proposed an enhanced firefly, i.e., a hybridized K-means with an ODFA model for automatic clustering.
arXiv Detail & Related papers (2023-02-09T18:43:10Z)
ck-means, a novel unsupervised learning method that combines fuzzy and crispy clustering methods to extract intersecting data [1.827510863075184]
This paper proposes a method to cluster data that share the same intersections between two features or more. The main idea of this novel method is to generate fuzzy clusters of data using a Fuzzy C-Means (FCM) algorithm. The algorithm is also able to find the optimal number of clusters for the FCM and the k-means algorithm, according to the consistency of the clusters given by the Silhouette Index (SI)
arXiv Detail & Related papers (2022-06-17T19:29:50Z)
Determinantal consensus clustering [77.34726150561087]
We propose the use of determinantal point processes or DPP for the random restart of clustering algorithms. DPPs favor diversity of the center points within subsets. We show through simulations that, contrary to DPP, this technique fails both to ensure diversity, and to obtain a good coverage of all data facets.
arXiv Detail & Related papers (2021-02-07T23:48:24Z)
Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed. We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z)
LSD-C: Linearly Separable Deep Clusters [145.89790963544314]
We present LSD-C, a novel method to identify clusters in an unlabeled dataset. Our method draws inspiration from recent semi-supervised learning practice and proposes to combine our clustering algorithm with self-supervised pretraining and strong data augmentation. We show that our approach significantly outperforms competitors on popular public image benchmarks including CIFAR 10/100, STL 10 and MNIST, as well as the document classification dataset Reuters 10K.
arXiv Detail & Related papers (2020-06-17T17:58:10Z)
Probabilistic Partitive Partitioning (PPP) [0.0]
Clustering algorithms, in general, face two common problems. They converge to different settings with different initial conditions. The number of clusters has to be arbitrarily decided beforehand.
arXiv Detail & Related papers (2020-03-09T19:18:35Z)

This list is automatically generated from the titles and abstracts of the papers in this site.