Fuzzy K-Means Clustering without Cluster Centroids
- URL: http://arxiv.org/abs/2404.04940v2
- Date: Thu, 07 Nov 2024 08:59:23 GMT
- Title: Fuzzy K-Means Clustering without Cluster Centroids
- Authors: Yichen Bao, Han Lu, Quanxue Gao
- Abstract summary: Fuzzy K-Means clustering is a critical technique in unsupervised data analysis.
This paper proposes a novel Fuzzy K-Means clustering algorithm that entirely eliminates the reliance on cluster centroids.
- Score: 21.256564324236333
- License:
- Abstract: Fuzzy K-Means clustering is a critical technique in unsupervised data analysis. Unlike traditional hard clustering algorithms such as K-Means, it allows data points to belong to multiple clusters with varying degrees of membership, determined through iterative optimization to establish optimal cluster centers and memberships, thereby achieving fuzzy partitioning of data. However, the performance of popular Fuzzy K-Means algorithms is sensitive to the selection of initial cluster centroids and is also affected by noise when updating mean cluster centroids. To address these challenges, this paper proposes a novel Fuzzy K-Means clustering algorithm that entirely eliminates the reliance on cluster centroids, obtaining membership metrics solely through distance matrix computation. This innovation enhances flexibility in distance measurement between sample points, thus improving the algorithm's performance and robustness. The paper also establishes theoretical connections between the proposed model and popular Fuzzy K-Means clustering techniques. Experimental results on several real datasets demonstrate the effectiveness of the algorithm.
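For orientation, the membership step of the conventional, centroid-based Fuzzy K-Means that the abstract contrasts against can be sketched as follows. This is not the paper's centroid-free algorithm, whose update rule the abstract does not give. It is the standard update, in which each membership is proportional to the squared point-to-center distance raised to the power -1/(m-1). The function names, the fuzzifier default `m=2.0`, the `eps` guard, and the toy data are assumptions made for illustration.

```python
import numpy as np

def pairwise_sq_dist(a, b):
    """Squared Euclidean distances between rows of a (n, d) and rows of b (c, d)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.einsum("ijk,ijk->ij", diff, diff)

def fuzzy_memberships(sq_dist, m=2.0, eps=1e-12):
    """Standard Fuzzy K-Means membership update (illustrative helper).

    sq_dist: (n_samples, n_clusters) squared distances to the cluster centers.
    m:       fuzzifier, must be > 1; larger values give softer memberships.
    eps:     floor that avoids division by zero when a point sits on a center.
    """
    sq_dist = np.maximum(np.asarray(sq_dist, dtype=float), eps)
    inv = sq_dist ** (-1.0 / (m - 1.0))
    # Normalize so that each row (one sample) sums to 1.
    return inv / inv.sum(axis=1, keepdims=True)

# Toy data (assumed): two well-separated groups and hand-picked centers.
points = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]])
centers = np.array([[0.0, 0.5], [5.0, 5.5]])
U = fuzzy_memberships(pairwise_sq_dist(points, centers))
```

Each row of `U` holds one sample's degrees of membership across the clusters. Because the distances are measured to explicit centers, this conventional update depends on centroid initialization, which is the sensitivity the abstract describes.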
Related papers
- Estimating the Optimal Number of Clusters in Categorical Data Clustering by Silhouette Coefficient [0.5939858158928473]
This paper proposes an algorithm named k-SCC to estimate the optimal k in categorical data clustering.
Comparative experiments on both synthetic and real datasets evaluate the performance of k-SCC.
arXiv Detail & Related papers (2025-01-26T14:29:11Z)
- Self-Supervised Graph Embedding Clustering [70.36328717683297]
The K-means one-step dimensionality reduction clustering method has made some progress in addressing the curse of dimensionality in clustering tasks.
We propose a unified framework that integrates manifold learning with K-means, resulting in the self-supervised graph embedding framework.
arXiv Detail & Related papers (2024-09-24T08:59:51Z)
- Dirichlet Process-based Robust Clustering using the Median-of-Means Estimator [16.774378814288806]
We propose an efficient and automatic clustering technique by integrating the strengths of model-based and centroid-based methodologies.
Our method mitigates the effect of noise on the quality of clustering while, at the same time, estimating the number of clusters.
arXiv Detail & Related papers (2023-11-26T19:01:15Z)
- Rethinking k-means from manifold learning perspective [122.38667613245151]
We present a new clustering algorithm which directly detects clusters of data without mean estimation.
Specifically, we construct a distance matrix between data points using a Butterworth filter.
To better exploit the complementary information embedded in different views, we leverage the tensor Schatten p-norm regularization.
arXiv Detail & Related papers (2023-05-12T03:01:41Z)
- Hybrid Fuzzy-Crisp Clustering Algorithm: Theory and Experiments [0.0]
We propose a hybrid fuzzy-crisp clustering algorithm based on a target function combining linear and quadratic terms of the membership function.
In this algorithm, the membership of a data point to a cluster is automatically set to exactly zero if the data point is "sufficiently" far from the cluster center.
The proposed algorithm is demonstrated to outperform the conventional methods on imbalanced datasets and can be competitive on more balanced datasets.
arXiv Detail & Related papers (2023-03-25T05:27:26Z)
- An enhanced method of initial cluster center selection for K-means algorithm [0.0]
We propose a novel approach to improve initial cluster selection for the K-means algorithm.
The Convex Hull algorithm facilitates the computing of the first two centroids and the remaining ones are selected according to the distance from previously selected centers.
We obtained clustering errors of only 7.33%, 7.90%, and 0% on the Iris, Letter, and Ruspini datasets, respectively.
arXiv Detail & Related papers (2022-10-18T00:58:50Z)
- Careful Seeding for k-Medois Clustering with Incremental k-Means++ Initialization [17.4921582710817]
K-medoids clustering is a popular variant of k-means clustering and widely used in pattern recognition and machine learning.
An improved k-medoids clustering algorithm, called INCKM algorithm, was recently proposed to overcome this drawback.
We propose a novel k-medoids clustering algorithm, called the incremental k-means++ (INCKPP) algorithm, which initializes in a novel incremental manner.
arXiv Detail & Related papers (2022-07-06T02:25:35Z)
- Gradient Based Clustering [72.15857783681658]
We propose a general approach for distance-based clustering, using the gradient of the cost function that measures clustering quality.
The approach is an iterative two-step procedure (alternating between cluster assignment and cluster center updates) and is applicable to a wide range of functions.
arXiv Detail & Related papers (2022-02-01T19:31:15Z)
- Robust Trimmed k-means [70.88503833248159]
We propose Robust Trimmed k-means (RTKM) that simultaneously identifies outliers and clusters points.
We show RTKM performs competitively with other methods on single membership data with outliers and multi-membership data without outliers.
arXiv Detail & Related papers (2021-08-16T15:49:40Z)
- Determinantal consensus clustering [77.34726150561087]
We propose the use of determinantal point processes or DPP for the random restart of clustering algorithms.
DPPs favor diversity of the center points within subsets.
We show through simulations that, contrary to DPP, this technique fails both to ensure diversity and to obtain a good coverage of all data facets.
arXiv Detail & Related papers (2021-02-07T23:48:24Z)
- Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed.
We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.