Neural Mixture Models with Expectation-Maximization for End-to-end Deep Clustering
- URL: http://arxiv.org/abs/2107.02453v1
- Date: Tue, 6 Jul 2021 08:00:58 GMT
- Title: Neural Mixture Models with Expectation-Maximization for End-to-end Deep Clustering
- Authors: Dumindu Tissera, Kasun Vithanage, Rukshan Wijesinghe, Alex Xavier,
Sanath Jayasena, Subha Fernando, Ranga Rodrigo
- Abstract summary: In this paper, we realize mixture model-based clustering with a neural network.
We train the network end-to-end via batch-wise EM iterations where the forward pass acts as the E-step and the backward pass acts as the M-step.
Our trained networks outperform single-stage deep clustering methods that still depend on k-means.
- Score: 0.8543753708890495
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Any clustering algorithm must simultaneously learn to model the clusters and
allocate data to those clusters in the absence of labels. Mixture model-based
methods model clusters with pre-defined statistical distributions and allocate
data to those clusters based on the cluster likelihoods. They iteratively
refine those distribution parameters and member assignments following the
Expectation-Maximization (EM) algorithm. However, the cluster representability
of such hand-designed distributions that employ a limited number of parameters
is not adequate for most real-world clustering tasks. In this paper, we realize
mixture model-based clustering with a neural network where the final layer
neurons, with the aid of an additional transformation, approximate cluster
distribution outputs. The network parameters pose as the parameters of those
distributions. The result is an elegant, much-generalized representation of
clusters than a restricted mixture of hand-designed distributions. We train the
network end-to-end via batch-wise EM iterations where the forward pass acts as
the E-step and the backward pass acts as the M-step. In image clustering, the
mixture-based EM objective can be used as the clustering objective along with
existing representation learning methods. In particular, we show that when
mixture-EM optimization is fused with consistency optimization, it improves
clustering performance over consistency optimization alone. Our trained networks
outperform single-stage deep clustering methods that still depend on k-means,
with unsupervised classification accuracies of 63.8% on STL10, 58% on CIFAR10,
25.9% on CIFAR100, and 98.9% on MNIST.
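A rough, hypothetical sketch of the batch-wise neural-EM idea described above, written in PyTorch: a softmax over the final-layer activations stands in for the cluster-distribution outputs, the forward pass computes responsibilities (E-step), and the backward pass updates the network parameters by minimizing the responsibility-weighted negative log-likelihood (M-step), optionally adding a consistency term between augmented views. The encoder architecture, the softmax transformation, the loss form, and the consistency penalty are illustrative assumptions, not the authors' exact formulation.

```python
# Illustrative sketch only: architecture and objectives are assumptions,
# not the paper's exact method.
import torch
import torch.nn as nn
import torch.nn.functional as F

class NeuralMixture(nn.Module):
    def __init__(self, in_dim=784, hidden=256, n_clusters=10):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_clusters),  # one final-layer neuron per cluster
        )
        # log mixing weights, learned alongside the network parameters
        self.log_pi = nn.Parameter(torch.zeros(n_clusters))

    def forward(self, x):
        # Softmax plays the role of the "additional transformation" that turns
        # final-layer activations into per-cluster log-likelihood surrogates.
        return F.log_softmax(self.encoder(x), dim=1)  # (batch, n_clusters)

def em_step(model, optimizer, x, x_aug=None, consistency_weight=1.0):
    """One batch-wise EM iteration: forward pass = E-step, backward pass = M-step."""
    log_lik = model(x)                                  # surrogate log p(x | cluster k)
    log_joint = log_lik + model.log_pi.log_softmax(0)   # add log mixing weights

    # E-step: responsibilities gamma(k | x), held fixed (detached) during the M-step.
    gamma = log_joint.softmax(dim=1).detach()

    # M-step: maximize the expected complete-data log-likelihood w.r.t. the
    # network parameters via one gradient step.
    loss = -(gamma * log_joint).sum(dim=1).mean()

    if x_aug is not None:
        # Optional consistency term between two augmented views (assumed form).
        p = log_joint.softmax(dim=1)
        q = (model(x_aug) + model.log_pi.log_softmax(0)).softmax(dim=1)
        loss = loss + consistency_weight * F.mse_loss(p, q)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In practice one would loop em_step over mini-batches with a standard optimizer such as torch.optim.Adam; the detach on the responsibilities is what separates the E-step targets from the M-step gradient update.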
Related papers
- Self-Supervised Graph Embedding Clustering [70.36328717683297]
The K-means one-step dimensionality reduction clustering method has made some progress in addressing the curse of dimensionality in clustering tasks.
We propose a unified framework that integrates manifold learning with K-means, resulting in the self-supervised graph embedding framework.
arXiv Detail & Related papers (2024-09-24T08:59:51Z)
- Image Clustering Algorithm Based on Self-Supervised Pretrained Models and Latent Feature Distribution Optimization [4.39139858370436]
This paper introduces an image clustering algorithm based on self-supervised pretrained models and latent feature distribution optimization.
Our approach outperforms the latest clustering algorithms and achieves state-of-the-art clustering results.
arXiv Detail & Related papers (2024-08-04T04:08:21Z)
- Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model [79.46465138631592]
We devise an efficient algorithm that recovers clusters using the observed labels.
We present Instance-Adaptive Clustering (IAC), the first algorithm whose performance matches these lower bounds both in expectation and with high probability.
arXiv Detail & Related papers (2023-06-18T08:46:06Z)
- A One-shot Framework for Distributed Clustered Learning in Heterogeneous Environments [54.172993875654015]
The paper proposes a family of communication-efficient methods for distributed learning in heterogeneous environments.
A one-shot approach, based on local computations at the users and a clustering-based aggregation step at the server, is shown to provide strong learning guarantees.
For strongly convex problems it is shown that, as long as the number of data points per user is above a threshold, the proposed approach achieves order-optimal mean-squared error rates in terms of the sample size.
arXiv Detail & Related papers (2022-09-22T09:04:10Z)
- clusterBMA: Bayesian model averaging for clustering [1.2021605201770345]
We introduce clusterBMA, a method that enables weighted model averaging across results from unsupervised clustering algorithms.
We use internal clustering validation criteria to develop an approximation of the posterior model probability, which is used to weight the results from each model.
In addition to outperforming other ensemble clustering methods on simulated data, clusterBMA offers unique features including probabilistic allocation to averaged clusters.
arXiv Detail & Related papers (2022-09-09T04:55:20Z)
- Learning Statistical Representation with Joint Deep Embedded Clustering [2.1267423178232407]
StatDEC is an unsupervised framework for joint statistical representation learning and clustering.
Our experiments show that using these representations, one can considerably improve results on imbalanced image clustering across a variety of image datasets.
arXiv Detail & Related papers (2021-09-11T09:26:52Z)
- Efficient Large-Scale Face Clustering Using an Online Mixture of Gaussians [1.3101369903953806]
We present an online Gaussian mixture-based clustering method (OGMC) for large-scale online face clustering.
Using feature vectors (f-vectors) extracted from the incoming faces, OGMC generates clusters that may be connected to others depending on their proximity and robustness.
Experimental results show that the proposed approach outperforms state-of-the-art clustering methods on large-scale face clustering benchmarks.
arXiv Detail & Related papers (2021-03-31T17:59:38Z)
- Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed.
We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z)
- Contrastive Clustering [57.71729650297379]
We propose Contrastive Clustering (CC), which explicitly performs instance- and cluster-level contrastive learning.
In particular, CC achieves an NMI of 0.705 (0.431) on the CIFAR-10 (CIFAR-100) dataset, which is an up to 19% (39%) performance improvement compared with the best baseline.
arXiv Detail & Related papers (2020-09-21T08:54:40Z)
- LSD-C: Linearly Separable Deep Clusters [145.89790963544314]
We present LSD-C, a novel method to identify clusters in an unlabeled dataset.
Our method draws inspiration from recent semi-supervised learning practice and combines our clustering algorithm with self-supervised pretraining and strong data augmentation.
We show that our approach significantly outperforms competitors on popular public image benchmarks including CIFAR 10/100, STL 10 and MNIST, as well as the document classification dataset Reuters 10K.
arXiv Detail & Related papers (2020-06-17T17:58:10Z)
- Improving k-Means Clustering Performance with Disentangled Internal Representations [0.0]
We propose a simpler approach of optimizing the entanglement of the learned latent code representation of an autoencoder.
Using our proposed approach, the test clustering accuracy was 96.2% on the MNIST dataset, 85.6% on the Fashion-MNIST dataset, and 79.2% on the EMNIST Balanced dataset, outperforming our baseline models.
arXiv Detail & Related papers (2020-06-05T11:32:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.