Neural Mixture Models with Expectation-Maximization for End-to-end Deep Clustering
- URL: http://arxiv.org/abs/2107.02453v1
- Date: Tue, 6 Jul 2021 08:00:58 GMT
- Title: Neural Mixture Models with Expectation-Maximization for End-to-end Deep Clustering
- Authors: Dumindu Tissera, Kasun Vithanage, Rukshan Wijesinghe, Alex Xavier,
Sanath Jayasena, Subha Fernando, Ranga Rodrigo
- Abstract summary: In this paper, we realize mixture model-based clustering with a neural network.
We train the network end-to-end via batch-wise EM iterations where the forward pass acts as the E-step and the backward pass acts as the M-step.
Our trained networks outperform single-stage deep clustering methods that still depend on k-means.
- Score: 0.8543753708890495
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Any clustering algorithm must simultaneously learn to model the clusters and
allocate data to those clusters in the absence of labels. Mixture model-based
methods model clusters with pre-defined statistical distributions and allocate
data to those clusters based on the cluster likelihoods. They iteratively
refine those distribution parameters and member assignments following the
Expectation-Maximization (EM) algorithm. However, the cluster representability
of such hand-designed distributions that employ a limited number of parameters
is not adequate for most real-world clustering tasks. In this paper, we realize
mixture model-based clustering with a neural network where the final layer
neurons, with the aid of an additional transformation, approximate cluster
distribution outputs. The network parameters pose as the parameters of those
distributions. The result is an elegant, much-generalized representation of
clusters than a restricted mixture of hand-designed distributions. We train the
network end-to-end via batch-wise EM iterations where the forward pass acts as
the E-step and the backward pass acts as the M-step. In image clustering, the
mixture-based EM objective can be used as the clustering objective along with
existing representation learning methods. In particular, we show that when
mixture-EM optimization is fused with consistency optimization, it improves
clustering performance over consistency optimization alone. Our trained networks
outperform single-stage deep clustering methods that still depend on k-means,
with unsupervised classification accuracies of 63.8% on STL10, 58% on CIFAR10,
25.9% on CIFAR100, and 98.9% on MNIST.
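A rough, hypothetical sketch of the batch-wise neural-EM idea described above, written in PyTorch: a softmax over the final-layer activations stands in for the cluster-distribution outputs, the forward pass computes responsibilities (E-step), and the backward pass updates the network parameters by minimizing the responsibility-weighted negative log-likelihood (M-step), optionally adding a consistency term between augmented views. The encoder architecture, the softmax transformation, the loss form, and the consistency penalty are illustrative assumptions, not the authors' exact formulation.

```python
# Illustrative sketch only: architecture and objectives are assumptions,
# not the paper's exact method.
import torch
import torch.nn as nn
import torch.nn.functional as F

class NeuralMixture(nn.Module):
    def __init__(self, in_dim=784, hidden=256, n_clusters=10):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_clusters),  # one final-layer neuron per cluster
        )
        # log mixing weights, learned alongside the network parameters
        self.log_pi = nn.Parameter(torch.zeros(n_clusters))

    def forward(self, x):
        # Softmax plays the role of the "additional transformation" that turns
        # final-layer activations into per-cluster log-likelihood surrogates.
        return F.log_softmax(self.encoder(x), dim=1)  # (batch, n_clusters)

def em_step(model, optimizer, x, x_aug=None, consistency_weight=1.0):
    """One batch-wise EM iteration: forward pass = E-step, backward pass = M-step."""
    log_lik = model(x)                                  # surrogate log p(x | cluster k)
    log_joint = log_lik + model.log_pi.log_softmax(0)   # add log mixing weights

    # E-step: responsibilities gamma(k | x), held fixed (detached) during the M-step.
    gamma = log_joint.softmax(dim=1).detach()

    # M-step: maximize the expected complete-data log-likelihood w.r.t. the
    # network parameters via one gradient step.
    loss = -(gamma * log_joint).sum(dim=1).mean()

    if x_aug is not None:
        # Optional consistency term between two augmented views (assumed form).
        p = log_joint.softmax(dim=1)
        q = (model(x_aug) + model.log_pi.log_softmax(0)).softmax(dim=1)
        loss = loss + consistency_weight * F.mse_loss(p, q)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In practice one would loop em_step over mini-batches with a standard optimizer such as torch.optim.Adam; the detach on the responsibilities is what separates the E-step targets from the M-step gradient update.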
Related papers
- Self-Supervised Graph Embedding Clustering [70.36328717683297]
The K-means one-step dimensionality reduction clustering method has made some progress in addressing the curse of dimensionality in clustering tasks.
We propose a unified framework that integrates manifold learning with K-means, resulting in the self-supervised graph embedding framework.
arXiv Detail & Related papers (2024-09-24T08:59:51Z)
- Image Clustering Algorithm Based on Self-Supervised Pretrained Models and Latent Feature Distribution Optimization [4.39139858370436]
This paper introduces an image clustering algorithm based on self-supervised pretrained models and latent feature distribution optimization.
Our approach outperforms the latest clustering algorithms and achieves state-of-the-art clustering results.
arXiv Detail & Related papers (2024-08-04T04:08:21Z)
- Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model [79.46465138631592]
We devise an efficient algorithm that recovers clusters using the observed labels.
We present Instance-Adaptive Clustering (IAC), the first algorithm whose performance matches these lower bounds both in expectation and with high probability.
arXiv Detail & Related papers (2023-06-18T08:46:06Z)
- A One-shot Framework for Distributed Clustered Learning in Heterogeneous Environments [54.172993875654015]
The paper proposes a family of communication-efficient methods for distributed learning in heterogeneous environments.
A one-shot approach, based on local computations at the users and a clustering-based aggregation step at the server, is shown to provide strong learning guarantees.
For strongly convex problems it is shown that, as long as the number of data points per user is above a threshold, the proposed approach achieves order-optimal mean-squared error rates in terms of the sample size.
arXiv Detail & Related papers (2022-09-22T09:04:10Z)
- clusterBMA: Bayesian model averaging for clustering [1.2021605201770345]
We introduce clusterBMA, a method that enables weighted model averaging across results from unsupervised clustering algorithms.
We use internal clustering validation criteria to develop an approximation of the posterior model probability, which is used to weight the results from each model.
In addition to outperforming other ensemble clustering methods on simulated data, clusterBMA offers unique features including probabilistic allocation to averaged clusters.
arXiv Detail & Related papers (2022-09-09T04:55:20Z)
- Learning Statistical Representation with Joint Deep Embedded Clustering [2.1267423178232407]
StatDEC is an unsupervised framework for joint statistical representation learning and clustering.
Our experiments show that using these representations, one can considerably improve results on imbalanced image clustering across a variety of image datasets.
arXiv Detail & Related papers (2021-09-11T09:26:52Z)
- Efficient Large-Scale Face Clustering Using an Online Mixture of Gaussians [1.3101369903953806]
We present an online Gaussian mixture-based clustering method (OGMC) for large-scale online face clustering.
Using feature vectors (f-vectors) extracted from the incoming faces, OGMC generates clusters that may be connected to others depending on their proximity and robustness.
Experimental results show that the proposed approach outperforms state-of-the-art clustering methods on large-scale face clustering benchmarks.
arXiv Detail & Related papers (2021-03-31T17:59:38Z)
- Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed.
We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z)
- Contrastive Clustering [57.71729650297379]
We propose Contrastive Clustering (CC), which explicitly performs instance- and cluster-level contrastive learning.
In particular, CC achieves an NMI of 0.705 (0.431) on the CIFAR-10 (CIFAR-100) dataset, which is an up to 19% (39%) performance improvement compared with the best baseline.
arXiv Detail & Related papers (2020-09-21T08:54:40Z)
- LSD-C: Linearly Separable Deep Clusters [145.89790963544314]
We present LSD-C, a novel method to identify clusters in an unlabeled dataset.
Our method draws inspiration from recent semi-supervised learning practice and combines our clustering algorithm with self-supervised pretraining and strong data augmentation.
We show that our approach significantly outperforms competitors on popular public image benchmarks including CIFAR 10/100, STL 10 and MNIST, as well as the document classification dataset Reuters 10K.
arXiv Detail & Related papers (2020-06-17T17:58:10Z)
- Improving k-Means Clustering Performance with Disentangled Internal Representations [0.0]
We propose a simpler approach of optimizing the entanglement of the learned latent code representation of an autoencoder.
Using our proposed approach, the test clustering accuracy was 96.2% on the MNIST dataset, 85.6% on the Fashion-MNIST dataset, and 79.2% on the EMNIST Balanced dataset, outperforming our baseline models.
arXiv Detail & Related papers (2020-06-05T11:32:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.