SMLSOM: The shrinking maximum likelihood self-organizing map
- URL: http://arxiv.org/abs/2104.13971v1
- Date: Wed, 28 Apr 2021 18:50:36 GMT
- Title: SMLSOM: The shrinking maximum likelihood self-organizing map
- Authors: Ryosuke Motegi and Yoichi Seki
- Abstract summary: This paper proposes a greedy algorithm that automatically selects a suitable number of clusters based on a probability distribution model framework.
Compared with existing methods, our proposed method is computationally efficient and can accurately select the number of clusters.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Determining the number of clusters in a dataset is a fundamental issue in
data clustering. Many methods have been proposed to solve the problem of
selecting the number of clusters, considering it to be a problem with regard to
model selection. This paper proposes a greedy algorithm that automatically
selects a suitable number of clusters based on a probability distribution model
framework. The algorithm includes two components. First, a generalization of
Kohonen's self-organizing map (SOM), which has nodes linked to a probability
distribution model, and which enables the algorithm to search for the winner
based on the likelihood of each node, is introduced. Second, the proposed
method uses a graph structure and a neighbor defined by the length of the
shortest path between nodes, in contrast to Kohonen's SOM in which the nodes
are fixed in the Euclidean space. This implementation makes it possible to
update its graph structure by cutting links to weakly connected nodes to avoid
unnecessary node deletion. The weakness of a node connection is measured using
the Kullback--Leibler divergence and the redundancy of a node is measured by
the minimum description length (MDL). This updating step makes it easy to
determine the suitable number of clusters. Compared with existing methods, our
proposed method is computationally efficient and can accurately select the
number of clusters and perform clustering.
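The two components described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the node models, the update rule, or the MDL criterion, so the sketch assumes diagonal-covariance Gaussian nodes, and all function and variable names (`find_winner`, `graph_distances`, `kl_diag_gaussians`, `nodes`, `adjacency`) are hypothetical. It shows a likelihood-based winner search, a neighborhood defined by shortest-path length on a graph, and a Kullback-Leibler divergence between two linked nodes as one possible measure of link weakness.

```python
import math
from collections import deque


def gaussian_loglik(x, mean, var):
    """Log-density of x under a diagonal-covariance Gaussian (assumed node model)."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def find_winner(x, nodes):
    """Index of the node whose model assigns x the highest likelihood."""
    return max(range(len(nodes)), key=lambda k: gaussian_loglik(x, *nodes[k]))


def graph_distances(adjacency, source):
    """Shortest-path length in hops from source to each reachable node (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def kl_diag_gaussians(mean_p, var_p, mean_q, var_q):
    """KL(p || q) between two diagonal-covariance Gaussians."""
    return sum(
        0.5 * (math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0)
        for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q)
    )


# Three hypothetical nodes, each a (mean, variance) pair, linked as a chain 0 - 1 - 2.
nodes = [
    ([0.0, 0.0], [1.0, 1.0]),
    ([3.0, 3.0], [1.0, 1.0]),
    ([6.0, 6.0], [1.0, 1.0]),
]
adjacency = {0: [1], 1: [0, 2], 2: [1]}

winner = find_winner([2.8, 3.1], nodes)            # node with the highest likelihood
hops = graph_distances(adjacency, winner)          # graph neighborhood of the winner
link_kl = kl_diag_gaussians(*nodes[0], *nodes[1])  # divergence across the link 0 - 1
```

Because the neighborhood is computed on the graph rather than on a fixed grid, removing an entry from `adjacency` changes which nodes count as neighbors without moving any node, which is the property the abstract relies on when it cuts links.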
Related papers
- Clustering Based on Density Propagation and Subcluster Merging [92.15924057172195]
We propose a density-based node clustering approach that automatically determines the number of clusters and can be applied in both data space and graph space.
Unlike traditional density-based clustering methods, which necessitate calculating the distance between any two nodes, our proposed technique determines density through a propagation process.
arXiv Detail & Related papers (2024-11-04T04:09:36Z) - MeanCut: A Greedy-Optimized Graph Clustering via Path-based Similarity
and Degree Descent Criterion [0.6906005491572401]
Spectral clustering is popular and attractive due to its remarkable performance, easy implementation, and strong adaptability.
We propose MeanCut as the objective function and greedily optimize it in degree descending order for a nondestructive graph partition.
The validity of our algorithm is demonstrated by testing on real-world benchmarks and a face recognition application.
arXiv Detail & Related papers (2023-12-07T06:19:39Z) - Reinforcement Graph Clustering with Unknown Cluster Number [91.4861135742095]
We propose a new deep graph clustering method termed Reinforcement Graph Clustering.
In our proposed method, cluster number determination and unsupervised representation learning are unified into a uniform framework.
In order to conduct feedback actions, the clustering-oriented reward function is proposed to enhance the cohesion of the same clusters and separate the different clusters.
arXiv Detail & Related papers (2023-08-13T18:12:28Z) - Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model [79.46465138631592]
We devise an efficient algorithm that recovers clusters using the observed labels.
We present Instance-Adaptive Clustering (IAC), the first algorithm whose performance matches these lower bounds both in expectation and with high probability.
arXiv Detail & Related papers (2023-06-18T08:46:06Z) - Dink-Net: Neural Clustering on Large Graphs [59.10189693120368]
A deep graph clustering method (Dink-Net) is proposed with the idea of dilation and shrink.
By discriminating whether nodes have been corrupted by augmentations, representations are learned in a self-supervised manner.
The clustering distribution is optimized by minimizing the proposed cluster dilation loss and cluster shrink loss.
Compared to the runner-up, Dink-Net achieves a 9.62% NMI improvement on the ogbn-papers100M dataset with 111 million nodes and 1.6 billion edges.
arXiv Detail & Related papers (2023-05-28T15:33:24Z) - Hybridization of K-means with improved firefly algorithm for automatic
clustering in high dimension [0.0]
We have implemented the Silhouette and Elbow methods with PCA to find an optimal number of clusters.
In the Firefly algorithm, the entire population is automatically subdivided into sub-populations that reduce the convergence speed and the trapping in local minima.
Our study proposed an enhanced firefly, i.e., a hybridized K-means with an ODFA model for automatic clustering.
arXiv Detail & Related papers (2023-02-09T18:43:10Z) - ck-means, a novel unsupervised learning method that combines fuzzy and
crispy clustering methods to extract intersecting data [1.827510863075184]
This paper proposes a method to cluster data that share the same intersections between two features or more.
The main idea of this novel method is to generate fuzzy clusters of data using a Fuzzy C-Means (FCM) algorithm.
The algorithm is also able to find the optimal number of clusters for the FCM and the k-means algorithm, according to the consistency of the clusters given by the Silhouette Index (SI).
arXiv Detail & Related papers (2022-06-17T19:29:50Z) - Recovering Unbalanced Communities in the Stochastic Block Model With
Application to Clustering with a Faulty Oracle [9.578056676899203]
The stochastic block model (SBM) is a fundamental model for studying graph clustering or community detection in networks.
We provide a simple SVD-based algorithm for recovering the communities in the SBM with communities of varying sizes.
arXiv Detail & Related papers (2022-02-17T08:51:19Z) - Local Graph Clustering with Network Lasso [90.66817876491052]
We study the statistical and computational properties of a network Lasso method for local graph clustering.
The clusters delivered by nLasso can be characterized elegantly via network flows between cluster boundary and seed nodes.
arXiv Detail & Related papers (2020-04-25T17:52:05Z) - Probabilistic Partitive Partitioning (PPP) [0.0]
Clustering algorithms, in general, face two common problems.
They converge to different settings with different initial conditions.
The number of clusters has to be arbitrarily decided beforehand.
arXiv Detail & Related papers (2020-03-09T19:18:35Z) - Optimal Clustering from Noisy Binary Feedback [75.17453757892152]
We study the problem of clustering a set of items from binary user feedback.
We devise an algorithm with a minimal cluster recovery error rate.
For adaptive selection, we develop an algorithm inspired by the derivation of the information-theoretical error lower bounds.
arXiv Detail & Related papers (2019-10-14T09:18:26Z)