Mixture Model Auto-Encoders: Deep Clustering through Dictionary Learning
- URL: http://arxiv.org/abs/2110.04683v1
- Date: Sun, 10 Oct 2021 02:30:31 GMT
- Title: Mixture Model Auto-Encoders: Deep Clustering through Dictionary Learning
- Authors: Alexander Lin, Andrew H. Song, Demba Ba
- Abstract summary: Mixture Model Auto-Encoders (MixMate) is a novel architecture that clusters data by performing inference on a generative model.
We show that MixMate achieves competitive performance compared to state-of-the-art deep clustering algorithms.
- Score: 72.9458277424712
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: State-of-the-art approaches for clustering high-dimensional data utilize deep
auto-encoder architectures. Many of these networks require a large number of
parameters and suffer from a lack of interpretability, due to the black-box
nature of the auto-encoders. We introduce Mixture Model Auto-Encoders
(MixMate), a novel architecture that clusters data by performing inference on a
generative model. Derived from the perspective of sparse dictionary learning
and mixture models, MixMate comprises several auto-encoders, each tasked with
reconstructing data in a distinct cluster, while enforcing sparsity in the
latent space. Through experiments on various image datasets, we show that
MixMate achieves competitive performance compared to state-of-the-art deep
clustering algorithms, while using orders of magnitude fewer parameters.
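The abstract's core idea, one sparse auto-encoder per cluster with each point assigned to the cluster that explains it best, can be illustrated with a small NumPy sketch. This is not the authors' implementation. It assumes each cluster has a fixed linear dictionary, uses plain ISTA as the sparse encoder, and assigns each point to the dictionary with the lowest sparse-coding energy. The function names and the values of `lam` and `n_iter` are hypothetical, chosen only for illustration.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of the L1 norm: shrinks each entry toward zero by t.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sparse_encode(x, A, lam=0.1, n_iter=50):
    # ISTA for min_z 0.5 * ||x - A z||^2 + lam * ||z||_1 (assumed encoder).
    step = 1.0 / (np.linalg.norm(A, 2) ** 2)  # 1 / squared spectral norm of A
    z = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ z - x)
        z = soft_threshold(z - step * grad, step * lam)
    return z

def assign_cluster(x, dictionaries, lam=0.1):
    # One dictionary per cluster; pick the one with the lowest energy,
    # i.e. reconstruction error plus the sparsity penalty on the code.
    energies = []
    for A in dictionaries:
        z = sparse_encode(x, A, lam)
        energy = 0.5 * np.sum((x - A @ z) ** 2) + lam * np.sum(np.abs(z))
        energies.append(energy)
    energies = np.array(energies)
    return int(np.argmin(energies)), energies
```

In this sketch the dictionaries are given rather than learned. Training them jointly with the cluster assignments is the part the paper addresses and is deliberately left out here.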
Related papers
- Adversarial AutoMixup [50.1874436169571]
We propose AdAutomixup, an adversarial automatic mixup augmentation approach.
It generates challenging samples to train a robust classifier for image classification.
Our approach outperforms the state of the art in various classification scenarios.
arXiv Detail & Related papers (2023-12-19T08:55:00Z)
- Optimizations of Autoencoders for Analysis and Classification of Microscopic In Situ Hybridization Images [68.8204255655161]
We propose a deep-learning framework to detect and classify areas of microscopic images with similar levels of gene expression.
The data we analyze requires an unsupervised learning model for which we employ a type of Artificial Neural Network - Deep Learning Autoencoders.
arXiv Detail & Related papers (2023-04-19T13:45:28Z)
- DoubleMix: Simple Interpolation-Based Data Augmentation for Text Classification [56.817386699291305]
This paper proposes a simple yet effective data augmentation approach termed DoubleMix.
DoubleMix first generates several perturbed samples for each training data.
It then uses the perturbed data and original data to carry out a two-step interpolation in the hidden space of neural models.
arXiv Detail & Related papers (2022-09-12T15:01:04Z)
- ClusTR: Exploring Efficient Self-attention via Clustering for Vision Transformers [70.76313507550684]
We propose a content-based sparse attention method, as an alternative to dense self-attention.
Specifically, we cluster and then aggregate key and value tokens, as a content-based method of reducing the total token count.
The resulting clustered-token sequence retains the semantic diversity of the original signal, but can be processed at a lower computational cost.
arXiv Detail & Related papers (2022-08-28T04:18:27Z)
- Enhancing Latent Space Clustering in Multi-filter Seq2Seq Model: A Reinforcement Learning Approach [0.0]
We design a latent-enhanced multi-filter seq2seq model (LMS2S) that analyzes the latent space representations using a clustering algorithm.
Our experiments on semantic parsing and machine translation demonstrate the positive correlation between the clustering quality and the model's performance.
arXiv Detail & Related papers (2021-09-25T16:36:31Z)
- Multi-Facet Clustering Variational Autoencoders [9.150555507030083]
High-dimensional data, such as images, typically feature multiple interesting characteristics one could cluster over.
We introduce Multi-Facet Clustering Variational Autoencoders (MFCVAE).
MFCVAE learns multiple clusterings simultaneously, and is trained fully unsupervised and end-to-end.
arXiv Detail & Related papers (2021-06-09T17:36:38Z)
- Event-Driven News Stream Clustering using Entity-Aware Contextual Embeddings [14.225334321146779]
We propose a method for online news stream clustering that is a variant of the non-parametric streaming K-means algorithm.
Our model uses a combination of sparse and dense document representations and aggregates document-cluster similarity along these multiple representations.
We show that the use of a suitable fine-tuning objective and external knowledge in pre-trained transformer models yields significant improvements in the effectiveness of contextual embeddings.
arXiv Detail & Related papers (2021-01-26T19:58:30Z)
- Joint Optimization of an Autoencoder for Clustering and Embedding [22.16059261437617]
We present an alternative where the autoencoder and the clustering are learned simultaneously.
That simple neural network, referred to as the clustering module, can be integrated into a deep autoencoder resulting in a deep clustering model.
arXiv Detail & Related papers (2020-12-07T14:38:10Z)
- Mixed data Deep Gaussian Mixture Model: A clustering model for mixed datasets [0.0]
We introduce a model-based clustering method called Mixed Deep Gaussian Mixture Model (MDGMM).
This architecture is flexible and can be adapted to mixed as well as to continuous or non-continuous data.
Our model provides continuous low-dimensional representations of the data which can be a useful tool to visualize mixed datasets.
arXiv Detail & Related papers (2020-10-13T19:52:46Z)
- Learning Autoencoders with Relational Regularization [89.53065887608088]
A new framework is proposed for learning autoencoders of data distributions.
We minimize the discrepancy between the model and target distributions, with a relational regularization.
We implement the framework with two scalable algorithms, making it applicable for both probabilistic and deterministic autoencoders.
arXiv Detail & Related papers (2020-02-07T17:27:30Z)