Related papers: K-GBS3FCM -- KNN Graph-Based Safe Semi-Supervised Fuzzy C-Means

URL: http://arxiv.org/abs/2411.14728v1
Date: Fri, 22 Nov 2024 04:48:58 GMT
Title: K-GBS3FCM -- KNN Graph-Based Safe Semi-Supervised Fuzzy C-Means
Authors: Gabriel Santos, Rita Julia, Marcelo Nascimento
Abstract summary: This paper introduces the KNN graph-based safety-aware semi-supervised fuzzy c-means algorithm (K-GBS3FCM). It dynamically assesses neighborhood relationships between labeled and unlabeled data using the K-Nearest Neighbors (KNN) algorithm. A mechanism is proposed that adjusts the influence of labeled data on unlabeled ones through regularization parameters and the average safety degree.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Clustering data using prior domain knowledge, starting from a partially labeled set, has recently been widely investigated. Often referred to as semi-supervised clustering, this approach leverages labeled data to enhance clustering accuracy. To maximize algorithm performance, it is crucial to ensure the safety of this prior knowledge. Methods addressing this concern are termed safe semi-supervised clustering (S3C) algorithms. This paper introduces the KNN graph-based safety-aware semi-supervised fuzzy c-means algorithm (K-GBS3FCM), which dynamically assesses neighborhood relationships between labeled and unlabeled data using the K-Nearest Neighbors (KNN) algorithm. This approach aims to optimize the use of labeled data while minimizing the adverse effects of incorrect labels. Additionally, a mechanism is proposed that adjusts the influence of labeled data on unlabeled ones through regularization parameters and the average safety degree. Experimental results on multiple benchmark datasets demonstrate that the graph-based approach effectively leverages prior knowledge to enhance clustering accuracy. The proposed method was significantly superior in 64% of the 56 test configurations, obtaining higher levels of clustering accuracy when compared to other semi-supervised and traditional unsupervised methods. This research highlights the potential of integrating graph-based approaches, such as KNN, with established techniques to develop advanced clustering algorithms, offering significant applications in fields that rely on both labeled and unlabeled data for more effective clustering.
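
Illustrative sketch (not taken from the paper): the abstract does not give the K-GBS3FCM objective or its safety-degree formula, so the Python below shows only the two generic ingredients it names. `fcm_memberships` is the standard fuzzy c-means membership update. `knn_label_agreement` is a hypothetical stand-in for a safety degree: the fraction of a labeled point's k nearest labeled neighbors that share its label. The function names, the toy data, and the `-1` convention for unlabeled points are all assumptions made for illustration.

```python
import numpy as np

def fcm_memberships(X, centers, m=2.0):
    """Standard fuzzy c-means membership update for fuzzifier m > 1."""
    # Pairwise Euclidean distances, shape (n_samples, n_clusters).
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    d = np.maximum(d, 1e-12)  # guard against a point sitting on a center
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)  # each row sums to 1

def knn_label_agreement(X, labels, k=3):
    """Hypothetical safety proxy, NOT the paper's definition.

    For each labeled point, returns the fraction of its k nearest labeled
    neighbors that carry the same label. Unlabeled points (label -1) get NaN.
    """
    labeled = np.flatnonzero(labels >= 0)
    safety = np.full(len(X), np.nan)
    for i in labeled:
        others = labeled[labeled != i]
        if others.size == 0:
            continue
        dist = np.linalg.norm(X[others] - X[i], axis=1)
        nearest = others[np.argsort(dist)[:k]]
        safety[i] = np.mean(labels[nearest] == labels[i])
    return safety

# Toy data: two well-separated groups; the point at [1, 1] is deliberately
# given the wrong label (1), and the last point is unlabeled (-1).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
              [5, 5], [5, 6], [6, 5], [6, 6]], dtype=float)
labels = np.array([0, 0, 0, 1, 1, 1, 1, -1])

safety = knn_label_agreement(X, labels, k=3)
centers = np.array([[0.5, 0.5], [5.5, 5.5]])  # assumed fixed centers
U = fcm_memberships(X, centers)
hard = U.argmax(axis=1)  # crisp assignment derived from fuzzy memberships
```

In this toy setup the mislabeled point receives the lowest agreement score, which is the kind of signal a safety-aware method could use to reduce that label's influence. How K-GBS3FCM actually combines such a score with regularization parameters is specified in the paper itself, not here.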

Related papers

K*-Means: A Parameter-free Clustering Algorithm [55.20132267309382]
k*-means is a novel clustering algorithm that eliminates the need to set k or any other parameters. It uses the minimum description length principle to automatically determine the optimal number of clusters, k*, by splitting and merging clusters. We prove that k*-means is guaranteed to converge and demonstrate experimentally that it significantly outperforms existing methods in scenarios where k is unknown.
arXiv Detail & Related papers (2025-05-17T08:41:07Z)
Self-Supervised Graph Embedding Clustering [70.36328717683297]
The K-means one-step dimensionality reduction clustering method has made some progress in addressing the curse of dimensionality in clustering tasks. We propose a unified framework that integrates manifold learning with K-means, resulting in the self-supervised graph embedding framework.
arXiv Detail & Related papers (2024-09-24T08:59:51Z)
Outlier Detection with Cluster Catch Digraphs [1.6440434996206625]
This paper introduces a novel family of outlier detection algorithms based on Cluster Catch Digraphs (CCDs). We propose the Uniformity-Based CCD with Mutual Catch Graph (U-MCCD), the Uniformity- and Neighbor-Based CCD with Mutual Catch Graph (UN-MCCD), and their shape-adaptive variants (SU-MCCD and SUN-MCCD). Our results indicate that these novel algorithms offer substantial advancements in the accuracy and adaptability of outlier detection.
arXiv Detail & Related papers (2024-09-17T23:08:53Z)
Fuzzy K-Means Clustering without Cluster Centroids [21.256564324236333]
Fuzzy K-Means clustering is a critical technique in unsupervised data analysis. This paper proposes a novel Fuzzy K-Means clustering algorithm that entirely eliminates the reliance on cluster centroids.
arXiv Detail & Related papers (2024-04-07T12:25:03Z)
DBGSA: A Novel Data Adaptive Bregman Clustering Algorithm [2.0232038310495435]
We present a clustering algorithm that is highly sensitive to the initial selection and robustness of datasets. Extensive experiments are conducted on four simulated datasets and six real datasets. Results demonstrate that our algorithm improves the accuracy of various algorithms by an average of 63.8%.
arXiv Detail & Related papers (2023-07-25T16:37:09Z)
Rethinking k-means from manifold learning perspective [122.38667613245151]
We present a new clustering algorithm which directly detects clusters of data without mean estimation. Specifically, we construct a distance matrix between data points using a Butterworth filter. To better exploit the complementary information embedded in different views, we leverage the tensor Schatten p-norm regularization.
arXiv Detail & Related papers (2023-05-12T03:01:41Z)
Influence of Swarm Intelligence in Data Clustering Mechanisms [0.0]
Nature-inspired, swarm-based algorithms are used for data clustering to cope with larger datasets that have missing or inconsistent data. This paper reviews the performance of these new approaches and compares which is best for particular problem situations.
arXiv Detail & Related papers (2023-05-07T08:40:50Z)
Rethinking Clustering-Based Pseudo-Labeling for Unsupervised Meta-Learning [146.11600461034746]
CACTUs, a method for unsupervised meta-learning, is a clustering-based approach with pseudo-labeling. This approach is model-agnostic and can be combined with supervised algorithms to learn from unlabeled data. We prove that the core reason for this is the lack of a clustering-friendly property in the embedding space.
arXiv Detail & Related papers (2022-09-27T19:04:36Z)
Riemannian classification of EEG signals with missing values [67.90148548467762]
This paper proposes two strategies to handle missing data for the classification of electroencephalograms. The first approach estimates the covariance from imputed data with the $k$-nearest neighbors algorithm; the second relies on the observed data by leveraging the observed-data likelihood within an expectation-maximization algorithm. As the results show, the proposed strategies perform better than classification based on observed data alone and maintain high accuracy even when the missing data ratio increases.
arXiv Detail & Related papers (2021-10-19T14:24:50Z)
Predictive K-means with local models [0.028675177318965035]
Predictive clustering seeks to obtain the best of the two worlds. We present two new algorithms using this technique and show on a variety of data sets that they are competitive for prediction performance.
arXiv Detail & Related papers (2020-12-16T10:49:36Z)
Bayesian Optimization with Machine Learning Algorithms Towards Anomaly Detection [66.05992706105224]
In this paper, an effective anomaly detection framework is proposed utilizing the Bayesian Optimization technique. The performance of the considered algorithms is evaluated using the ISCX 2012 dataset. Experimental results show the effectiveness of the proposed framework in terms of accuracy, precision, low false-alarm rate, and recall.
arXiv Detail & Related papers (2020-08-05T19:29:35Z)
Semi-Supervised Learning with Meta-Gradient [123.26748223837802]
We propose a simple yet effective meta-learning algorithm in semi-supervised learning. We find that the proposed algorithm performs favorably against state-of-the-art methods.
arXiv Detail & Related papers (2020-07-08T08:48:56Z)

This list is automatically generated from the titles and abstracts of the papers on this site.