Image Modeling with Deep Convolutional Gaussian Mixture Models
- URL: http://arxiv.org/abs/2104.12686v1
- Date: Mon, 19 Apr 2021 12:08:53 GMT
- Title: Image Modeling with Deep Convolutional Gaussian Mixture Models
- Authors: Alexander Gepperth, Benedikt Pfülb
- Abstract summary: We present a new formulation of deep hierarchical Gaussian Mixture Models (GMMs) that is suitable for describing and generating images.
DCGMMs avoid the large component counts that flat GMMs require by stacking multiple GMM layers, linked by convolution and pooling operations.
For generating sharp images with DCGMMs, we introduce a new gradient-based technique for sampling through non-invertible operations like convolution and pooling.
Based on the MNIST and FashionMNIST datasets, we validate the DCGMM model by demonstrating its superiority over flat GMMs for clustering, sampling and outlier detection.
- Score: 79.0660895390689
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this conceptual work, we present Deep Convolutional Gaussian Mixture
Models (DCGMMs): a new formulation of deep hierarchical Gaussian Mixture Models
(GMMs) that is particularly suitable for describing and generating images.
Vanilla (i.e., flat) GMMs require a very large number of components to describe
images well, leading to long training times and memory issues. DCGMMs avoid
this by a stacked architecture of multiple GMM layers, linked by convolution
and pooling operations. This makes it possible to exploit the
compositionality of images in much the same way as deep CNNs do. DCGMMs can
be trained end-to-end by Stochastic Gradient Descent. This sets them apart
from vanilla GMMs, which are trained by Expectation-Maximization and require
a prior k-means initialization, a step that is infeasible in a layered
structure. For generating sharp images with
DCGMMs, we introduce a new gradient-based technique for sampling through
non-invertible operations like convolution and pooling. Based on the MNIST and
FashionMNIST datasets, we validate the DCGMM model by demonstrating its
superiority over flat GMMs for clustering, sampling and outlier detection.
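
To make the construction concrete, the sketch below is a minimal, hedged illustration (our own code, not the authors' implementation): a single GMM layer over image patches whose mean patch log-likelihood is trained by SGD, followed by the abstract's gradient-based sampling, realized here as plain gradient ascent on a noise image. The class name `ConvGMMLayer` and all hyperparameters are assumptions; a full DCGMM would stack several such layers with pooling in between.

```python
# Sketch (not the authors' code): one convolutional GMM layer trained by SGD,
# plus gradient-based sampling through the non-invertible patch extraction.
import math
import torch
import torch.nn.functional as F

class ConvGMMLayer(torch.nn.Module):
    """One GMM over image patches, with diagonal covariances."""
    def __init__(self, in_channels, kernel_size, n_components):
        super().__init__()
        d = in_channels * kernel_size * kernel_size   # patch dimensionality
        self.kernel_size = kernel_size
        self.mu = torch.nn.Parameter(0.1 * torch.randn(n_components, d))
        self.log_sigma = torch.nn.Parameter(torch.zeros(n_components, d))
        self.logit_pi = torch.nn.Parameter(torch.zeros(n_components))

    def mean_log_likelihood(self, x):
        # im2col: (B, C, H, W) -> one row per patch, (B * P, d)
        patches = F.unfold(x, self.kernel_size)                 # (B, d, P)
        z = patches.transpose(1, 2).reshape(-1, patches.shape[1])
        log_pi = F.log_softmax(self.logit_pi, dim=0)
        var = torch.exp(2.0 * self.log_sigma)                   # (K, d)
        quad = ((z[:, None, :] - self.mu) ** 2 / var).sum(-1)   # (N, K)
        log_norm = (self.log_sigma.sum(-1)
                    + 0.5 * self.mu.shape[1] * math.log(2 * math.pi))
        comp = log_pi - log_norm - 0.5 * quad                   # (N, K)
        return torch.logsumexp(comp, dim=-1).mean()

layer = ConvGMMLayer(in_channels=1, kernel_size=4, n_components=25)
opt = torch.optim.SGD(layer.parameters(), lr=0.05)
x = torch.rand(8, 1, 28, 28)                 # stand-in for an MNIST batch
opt.zero_grad()
(-layer.mean_log_likelihood(x)).backward()   # SGD on the negative log-likelihood
opt.step()

# Gradient-based sampling: start from noise and ascend the layer's
# log-likelihood with respect to the image itself.
img = torch.rand(1, 1, 28, 28, requires_grad=True)
sampler = torch.optim.SGD([img], lr=0.1)
for _ in range(100):
    sampler.zero_grad()
    (-layer.mean_log_likelihood(img)).backward()
    sampler.step()
```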
Related papers
- Performance of Gaussian Mixture Model Classifiers on Embedded Feature Spaces [1.3241991482253108]
Data embeddings with CLIP and ImageBind provide powerful features for the analysis of multimedia and/or multimodal data.
We assess their performance here for classification using a Gaussian mixture model (GMM)-based layer as an alternative to the standard Softmax layer.
Our finding is that, in most cases, a single Gaussian component per class is enough, and we hypothesize that this may be due to the contrastive loss used for training these embedded spaces.
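A rough illustration of such a GMM head (our own sketch using scikit-learn on precomputed embeddings; the function names and toy data are assumptions, not the paper's code):

```python
# Sketch (our assumption): replace a softmax head by one GMM per class,
# fitted on frozen embedding vectors.
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_gmm_head(embeddings, labels, n_components=1):
    """Fit one GMM per class; one component often suffices per the paper."""
    heads = {}
    for c in np.unique(labels):
        gm = GaussianMixture(n_components=n_components, covariance_type="diag")
        heads[c] = gm.fit(embeddings[labels == c])
    return heads

def predict(heads, embeddings):
    classes = sorted(heads)
    # class-conditional log-likelihoods; argmax plays the role of softmax
    scores = np.stack([heads[c].score_samples(embeddings) for c in classes],
                      axis=1)
    return np.array(classes)[scores.argmax(axis=1)]

# toy usage with random "embeddings" standing in for CLIP/ImageBind features
X = np.random.randn(200, 64); y = np.random.randint(0, 5, 200)
pred = predict(fit_gmm_head(X, y), X)
```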
arXiv Detail & Related papers (2024-10-17T10:43:43Z)
- Deep Gaussian mixture model for unsupervised image segmentation [1.3654846342364308]
In many tasks sufficient pixel-level labels are very difficult to obtain.
We propose a method which combines a Gaussian mixture model (GMM) with unsupervised deep learning techniques.
We demonstrate the advantages of our method in various experiments on the example of infarct segmentation on multi-sequence MRI images.
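A hedged, generic stand-in for the cluster-features-then-segment step (not the paper's actual method, which couples the GMM with deep feature learning):

```python
# Sketch (our generic stand-in): cluster per-pixel deep features with a GMM
# to obtain an unsupervised segmentation map.
import numpy as np
from sklearn.mixture import GaussianMixture

def gmm_segment(features, n_regions=4):
    """features: (H, W, D) array of per-pixel features from any encoder."""
    H, W, D = features.shape
    gm = GaussianMixture(n_components=n_regions, covariance_type="diag")
    labels = gm.fit_predict(features.reshape(-1, D))
    return labels.reshape(H, W)          # per-pixel region indices

seg = gmm_segment(np.random.rand(32, 32, 8))   # toy features
```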
arXiv Detail & Related papers (2024-04-18T15:20:59Z)
- Incremental Multimodal Surface Mapping via Self-Organizing Gaussian Mixture Models [1.0878040851638]
This letter describes an incremental multimodal surface mapping methodology, which represents the environment as a continuous probabilistic model.
The strategy employed in this work utilizes Gaussian mixture models (GMMs) to represent the environment.
To keep GMM submap extraction fast, this letter introduces a spatial hash map, combined with an approach to determine relevant and redundant data in a point cloud.
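The spatial-hash component lends itself to a short sketch; the version below is our own minimal illustration (cell size, the class name `GMMHashMap`, and the neighborhood query are assumptions, not the letter's code):

```python
# Sketch (our assumption): a spatial hash map from voxel cells to GMM
# submaps, allowing O(1) extraction of the submaps near a query point.
import numpy as np

class GMMHashMap:
    def __init__(self, cell_size=1.0):
        self.cell_size = cell_size
        self.cells = {}                      # (i, j, k) -> GMM submap

    def _key(self, point):
        return tuple(np.floor(np.asarray(point) / self.cell_size).astype(int))

    def insert(self, point, submap):
        self.cells[self._key(point)] = submap

    def query(self, point, radius_cells=1):
        """Return submaps in the (2r+1)^3 cell neighborhood of `point`."""
        i, j, k = self._key(point)
        r = radius_cells
        return [self.cells[(i + di, j + dj, k + dk)]
                for di in range(-r, r + 1) for dj in range(-r, r + 1)
                for dk in range(-r, r + 1)
                if (i + di, j + dj, k + dk) in self.cells]
```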
arXiv Detail & Related papers (2023-09-19T19:49:03Z)
- A new perspective on probabilistic image modeling [92.89846887298852]
We present a new probabilistic approach for image modeling capable of density estimation, sampling and tractable inference.
DCGMMs can be trained end-to-end by SGD from random initial conditions, much like CNNs.
We show that DCGMMs compare favorably to several recent PC and SPN models in terms of inference, classification and sampling.
arXiv Detail & Related papers (2022-03-21T14:53:57Z)
- Smoothed Gaussian Mixture Models for Video Classification and Recommendation [10.119117405418868]
We propose a new cluster-and-aggregate method which we call the smoothed Gaussian mixture model (SGMM).
We show, through extensive experiments on the YouTube-8M classification task, that SGMM/DSGMM is consistently better than VLAD/NetVLAD by a small but statistically significant margin.
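For orientation, a generic GMM soft-assignment aggregator in the VLAD family looks like the sketch below; this is our assumption of the general recipe, and the paper's smoothing and end-to-end training details differ:

```python
# Sketch (our assumption): aggregate frame features into one video-level
# vector via posterior-weighted residuals to each Gaussian mean.
import numpy as np
from sklearn.mixture import GaussianMixture

def gmm_aggregate(frames, gm):
    """frames: (T, D) per-frame features; gm: a fitted GaussianMixture."""
    post = gm.predict_proba(frames)                  # (T, K) soft assignments
    resid = frames[:, None, :] - gm.means_           # (T, K, D) residuals
    pooled = (post[:, :, None] * resid).sum(axis=0)  # (K, D)
    return pooled.reshape(-1)                        # video descriptor

gm = GaussianMixture(n_components=8).fit(np.random.randn(500, 32))
video_vec = gmm_aggregate(np.random.randn(120, 32), gm)  # shape (8 * 32,)
```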
arXiv Detail & Related papers (2020-12-17T06:52:41Z)
- Prototype Mixture Models for Few-shot Semantic Segmentation [50.866870384596446]
Few-shot segmentation is challenging because objects within the support and query images could significantly differ in appearance and pose.
We propose prototype mixture models (PMMs), which correlate diverse image regions with multiple prototypes to enforce the prototype-based semantic representation.
PMMs improve 5-shot segmentation performance on MS-COCO by up to 5.82% with only a moderate cost for model size and inference speed.
arXiv Detail & Related papers (2020-08-10T04:33:17Z)
- Locally Masked Convolution for Autoregressive Models [107.4635841204146]
LMConv is a simple modification to the standard 2D convolution that allows arbitrary masks to be applied to the weights at each location in the image.
We learn an ensemble of distribution estimators that share parameters but differ in generation order, achieving improved performance on whole-image density estimation.
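A minimal sketch of the mechanism (our simplification via im2col, not the paper's implementation; the mask layout and names are assumptions):

```python
# Sketch (our simplification): a 2D convolution where the kernel weights are
# masked differently at every spatial location, via im2col (F.unfold).
import torch
import torch.nn.functional as F

def locally_masked_conv2d(x, weight, masks):
    """x: (B, C, H, W); weight: (C_out, C*k*k); masks: (H*W, C*k*k)."""
    k = int((masks.shape[1] // x.shape[1]) ** 0.5)
    patches = F.unfold(x, k, padding=k // 2)         # (B, C*k*k, H*W)
    patches = patches * masks.t().unsqueeze(0)       # location-specific masks
    out = weight @ patches                           # (B, C_out, H*W)
    return out.view(x.shape[0], weight.shape[0], x.shape[2], x.shape[3])

# toy usage: one mask per pixel (random here; the paper derives them from
# the chosen generation order)
B, C, H, W, C_out, k = 2, 3, 8, 8, 16, 3
x = torch.randn(B, C, H, W)
weight = torch.randn(C_out, C * k * k)
masks = (torch.rand(H * W, C * k * k) > 0.5).float()
y = locally_masked_conv2d(x, weight, masks)          # (2, 16, 8, 8)
```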
arXiv Detail & Related papers (2020-06-22T17:59:07Z)
- PointGMM: a Neural GMM Network for Point Clouds [83.9404865744028]
Point clouds are a popular representation for 3D shapes, but encode a particular sampling without accounting for shape priors or non-local information.
We present PointGMM, a neural network that learns to generate hierarchical GMMs (hGMMs) which are characteristic of the shape class.
We show that, as a generative model, PointGMM learns a meaningful latent space which enables generating consistent interpolations between existing shapes.
arXiv Detail & Related papers (2020-03-30T10:34:59Z)
- Semi-Supervised Learning with Normalizing Flows [54.376602201489995]
FlowGMM is an end-to-end approach to generative semi-supervised learning with normalizing flows.
We show promising results on a wide range of applications, including AG-News and Yahoo Answers text data.
arXiv Detail & Related papers (2019-12-30T17:36:33Z)
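
To make the FlowGMM objective concrete, here is a hedged sketch with a toy invertible flow (the paper uses RealNVP-style coupling layers; all names here are our assumptions):

```python
# Sketch of the FlowGMM objective (our minimal version): the flow maps data
# to a latent space where each class is a Gaussian; labeled data maximizes
# the class-conditional likelihood, unlabeled data the mixture likelihood.
import math
import torch

class AffineFlow(torch.nn.Module):
    """Toy invertible map z = x * exp(s) + t with log|det J| = sum(s)."""
    def __init__(self, dim):
        super().__init__()
        self.s = torch.nn.Parameter(torch.zeros(dim))
        self.t = torch.nn.Parameter(torch.zeros(dim))

    def forward(self, x):
        return x * torch.exp(self.s) + self.t, self.s.sum()

def flowgmm_loss(flow, means, x, y=None):
    """Labeled: -log p(z | y); unlabeled: -log sum_c p(z | c) / K."""
    z, logdet = flow(x)
    # one unit-variance Gaussian per class in latent space (up to a constant)
    log_pz_c = -0.5 * ((z[:, None, :] - means) ** 2).sum(-1)   # (B, K)
    if y is not None:
        ll = log_pz_c[torch.arange(len(x)), y]
    else:
        ll = torch.logsumexp(log_pz_c, dim=-1) - math.log(len(means))
    return -(ll + logdet).mean()

# toy usage: 3 classes in a 16-dimensional latent space
flow, means = AffineFlow(16), torch.randn(3, 16)
x_lab, y_lab = torch.randn(8, 16), torch.randint(0, 3, (8,))
loss = (flowgmm_loss(flow, means, x_lab, y_lab)
        + flowgmm_loss(flow, means, torch.randn(32, 16)))
```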