Clustering of non-Gaussian data by variational Bayes for normal inverse
Gaussian mixture models
- URL: http://arxiv.org/abs/2009.06002v1
- Date: Sun, 13 Sep 2020 14:13:27 GMT
- Title: Clustering of non-Gaussian data by variational Bayes for normal inverse
Gaussian mixture models
- Authors: Takashi Takekawa
- Abstract summary: In practical situations, there are many non-Gaussian data that are heavy-tailed and/or asymmetric.
For NIG mixture models, both expectation-maximization (EM) and variational Bayesian (VB) algorithms have been proposed.
We propose another VB algorithm for NIG mixture that improves on the shortcomings.
We also propose an extension of Dirichlet process mixture models to overcome the difficulty in determining the number of clusters.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Finite mixture models, typically Gaussian mixtures, are well known
and widely used for model-based clustering. In practical situations, there are
many non-Gaussian data that are heavy-tailed and/or asymmetric. Normal inverse
Gaussian (NIG) distributions are normal variance-mean mixtures whose mixing
densities are inverse Gaussian distributions, and they can capture both heavy
tails and asymmetry. For NIG mixture models, both expectation-maximization
(EM) and variational Bayesian (VB) algorithms have been proposed. However, the
existing VB algorithm for NIG mixtures has the disadvantage that the shape of
the mixing density is limited. In this paper, we propose another VB algorithm
for NIG mixtures that improves on this shortcoming. We also propose an
extension to Dirichlet process mixture models to overcome the difficulty of
determining the number of clusters in finite mixture models. We evaluated the
performance on artificial data and found that the proposed method outperformed
Gaussian mixtures and existing implementations for NIG mixtures, especially
for highly non-normal data.
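A minimal sketch (not the paper's VB algorithm) of the normal variance-mean
mixture construction behind the NIG distribution: the mixing variable Z is
drawn from an inverse Gaussian, and X given Z is normal with mean mu + beta*Z
and variance Z. The (alpha, beta, mu, delta) parameterisation and the scipy
mapping below are assumptions of this sketch.

```python
import numpy as np
from scipy import stats

def sample_nig(alpha, beta, mu, delta, size, rng):
    """Sample NIG(alpha, beta, mu, delta) via its inverse Gaussian mixing density."""
    gamma = np.sqrt(alpha**2 - beta**2)  # requires alpha > |beta|
    # Mixing density: inverse Gaussian with mean delta/gamma and shape delta**2.
    # scipy's invgauss(mu=m/lam, scale=lam) is IG(mean=m, shape=lam).
    m, lam = delta / gamma, delta**2
    z = stats.invgauss.rvs(m / lam, scale=lam, size=size, random_state=rng)
    # Normal variance-mean mixture: mean and variance both driven by Z.
    return mu + beta * z + np.sqrt(z) * rng.standard_normal(size)

rng = np.random.default_rng(0)
x = sample_nig(alpha=2.0, beta=1.0, mu=0.0, delta=1.5, size=100_000, rng=rng)
# Sanity check: E[X] = mu + delta*beta/gamma for the NIG distribution.
print(x.mean(), 1.5 * 1.0 / np.sqrt(2.0**2 - 1.0**2))
```

A heavier-tailed inverse Gaussian mixing density fattens the tails of X, and a
nonzero beta skews it, which is exactly the flexibility the paper exploits for
heavy-tailed and asymmetric clusters.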
Related papers
- The Breakdown of Gaussian Universality in Classification of High-dimensional Mixtures [6.863637695977277]
We provide a high-dimensional characterization of empirical risk minimization for classification under a general mixture data setting.
We specify conditions for Gaussian universality and discuss their implications for the choice of loss function.
arXiv Detail & Related papers (2024-10-08T01:45:37Z)
- Adaptive Fuzzy C-Means with Graph Embedding [84.47075244116782]
Fuzzy clustering algorithms can be roughly categorized into two main groups: Fuzzy C-Means (FCM) based methods and mixture model based methods.
We propose a novel FCM-based clustering model that is capable of automatically learning an appropriate membership-degree hyperparameter value (the standard FCM updates are sketched after this entry).
arXiv Detail & Related papers (2024-05-22T08:15:50Z)
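For context on the FCM entry above, a minimal sketch of the classic Fuzzy
C-Means updates with a fixed fuzzifier m; the paper's adaptive hyperparameter
learning and graph embedding are not implemented here, and all names are
illustrative.

```python
import numpy as np

def fcm(X, c, m=2.0, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), c, replace=False)]
    for _ in range(iters):
        # Squared distances, shape (n, c); a small floor avoids division by zero.
        d2 = np.maximum(((X[:, None, :] - centers[None]) ** 2).sum(-1), 1e-12)
        # Membership update: u_ik proportional to d2_ik^(-1/(m-1)), rows sum to 1.
        u = d2 ** (-1.0 / (m - 1.0))
        u /= u.sum(axis=1, keepdims=True)
        # Center update: weighted mean of the data with weights u^m.
        w = u ** m
        centers = (w.T @ X) / w.sum(axis=0)[:, None]
    return centers, u
```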
- Universal Lower Bounds and Optimal Rates: Achieving Minimax Clustering Error in Sub-Exponential Mixture Models [8.097200145973389]
We first establish a universal lower bound for the error rate in clustering any mixture model.
We then demonstrate that iterative algorithms attain this lower bound in mixture models with sub-exponential tails.
For datasets better modelled by Poisson or Negative Binomial mixtures, we study mixture models whose distributions belong to an exponential family.
In such mixtures, we establish that Bregman hard clustering, a variant of Lloyd's algorithm employing a Bregman divergence, is rate optimal (a toy version is sketched after this entry).
arXiv Detail & Related papers (2024-02-23T16:51:17Z)
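To make the rate-optimality claim above concrete, here is a toy sketch of
Bregman hard clustering: Lloyd's loop with the generalised KL divergence (the
Bregman divergence generated by phi(x) = x log x - x, a natural fit for
Poisson-like counts). It illustrates the generic technique only, not the
paper's analysis.

```python
import numpy as np

def gen_kl(x, mu, eps=1e-12):
    """Generalised KL divergence d(x, mu), summed over coordinates."""
    x, mu = np.maximum(x, eps), np.maximum(mu, eps)
    return (x * np.log(x / mu) - x + mu).sum(-1)

def bregman_hard_clustering(X, k, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)].astype(float)
    for _ in range(iters):
        # Assign each point to the centre with the smallest Bregman divergence.
        D = np.stack([gen_kl(X, c) for c in centers], axis=1)
        labels = D.argmin(axis=1)
        # Centroid step: the arithmetic mean minimises any Bregman divergence;
        # keep the old centre if a cluster goes empty.
        centers = np.array([X[labels == j].mean(axis=0) if (labels == j).any()
                            else centers[j] for j in range(k)])
    return labels, centers

rng = np.random.default_rng(1)
X = np.vstack([rng.poisson(3.0, (100, 2)), rng.poisson(12.0, (100, 2))]).astype(float)
labels, centers = bregman_hard_clustering(X, k=2)
print(centers)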
- Fast Semisupervised Unmixing Using Nonconvex Optimization [80.11512905623417]
We introduce a novel convex model for semi/library-based unmixing.
We demonstrate the efficacy of alternating optimization methods for sparse unsupervised unmixing (a toy library-based unmixing sketch follows this entry).
arXiv Detail & Related papers (2024-01-23T10:07:41Z)
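As a rough illustration of library-based unmixing for the entry above (not the
paper's model or solver), the sketch below poses unmixing as nonnegative least
squares, y ~ A @ x with x >= 0, over a hypothetical random library A.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)
A = rng.random((50, 20))              # spectral library: 50 bands, 20 endmembers
x_true = np.zeros(20)
x_true[[3, 7]] = [0.6, 0.4]           # sparse ground-truth abundances
y = A @ x_true + 0.01 * rng.standard_normal(50)

# The nonnegativity constraint alone already promotes sparse abundances.
x_hat, _ = nnls(A, y)
print(np.round(x_hat, 2))
```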
- Gaussian Mixture Solvers for Diffusion Models [84.83349474361204]
We introduce a novel class of SDE-based solvers called GMS for diffusion models.
Our solver outperforms numerous SDE-based solvers in terms of sample quality in image generation and stroke-based synthesis.
arXiv Detail & Related papers (2023-11-02T02:05:38Z)
- Compound Batch Normalization for Long-tailed Image Classification [77.42829178064807]
We propose a compound batch normalization method based on a Gaussian mixture.
It can model the feature space more comprehensively and reduce the dominance of head classes.
The proposed method outperforms existing methods on long-tailed image classification.
arXiv Detail & Related papers (2022-12-02T07:31:39Z)
- Gaussian Mixture Convolution Networks [13.493166990188278]
This paper proposes a novel method for deep learning based on the analytical convolution of multidimensional Gaussian mixtures.
We demonstrate that networks based on this architecture reach competitive accuracy on Gaussian mixtures fitted to the MNIST and ModelNet data sets (the underlying analytic convolution is sketched after this entry).
arXiv Detail & Related papers (2022-02-18T12:07:52Z)
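The analytic convolution underlying the entry above reduces, in one dimension,
to the fact that two Gaussians convolve by adding their means and variances,
so mixtures convolve component by component; a minimal sketch with
illustrative names:

```python
import numpy as np

def convolve_gm(w1, mu1, var1, w2, mu2, var2):
    """Each mixture is (weights, means, variances); returns their convolution."""
    w = np.outer(w1, w2).ravel()           # pairwise weight products
    mu = np.add.outer(mu1, mu2).ravel()    # means add
    var = np.add.outer(var1, var2).ravel() # variances add
    return w, mu, var

# Convolve a two-component mixture with a single Gaussian kernel.
w, mu, var = convolve_gm([0.3, 0.7], [-1.0, 2.0], [0.5, 1.0],
                         [1.0], [0.0], [0.25])
print(w, mu, var)
```

The component count multiplies under convolution, which is why practical
implementations pair this step with some form of mixture reduction.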
- A Robust and Flexible EM Algorithm for Mixtures of Elliptical Distributions with Missing Data [71.9573352891936]
This paper tackles the problem of missing data imputation for noisy and non-Gaussian data.
A new EM algorithm is investigated for mixtures of elliptical distributions with the property of handling potential missing data.
Experimental results on synthetic data demonstrate that the proposed algorithm is robust to outliers and can be used with non-Gaussian data (a simplified EM-with-missing-data sketch follows this entry).
arXiv Detail & Related papers (2022-01-28T10:01:37Z)
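As a simplified illustration for the entry above, here is EM for a single
multivariate Gaussian with entries missing at random; the mixture and
elliptical generalisations of the paper are omitted, so treat this as a sketch
under those assumptions.

```python
import numpy as np

def em_gaussian_missing(X, iters=50):
    X = np.array(X, dtype=float)
    n, d = X.shape
    miss = np.isnan(X)
    mu = np.nanmean(X, axis=0)
    sigma = np.eye(d)
    for _ in range(iters):
        Xhat = np.where(miss, mu, X)   # start from mean imputation
        C = np.zeros((d, d))           # covariance correction for imputed entries
        for i in range(n):
            m, o = miss[i], ~miss[i]
            if not m.any():
                continue
            # E-step: conditional mean/covariance of missing given observed.
            Soo_inv = np.linalg.inv(sigma[np.ix_(o, o)])
            reg = sigma[np.ix_(m, o)] @ Soo_inv
            Xhat[i, m] = mu[m] + reg @ (X[i, o] - mu[o])
            C[np.ix_(m, m)] += sigma[np.ix_(m, m)] - reg @ sigma[np.ix_(o, m)]
        # M-step: update mean and covariance from the completed data.
        mu = Xhat.mean(axis=0)
        diff = Xhat - mu
        sigma = (diff.T @ diff + C) / n
    return mu, sigma

X = np.array([[1.0, 2.0], [np.nan, 0.7], [2.0, np.nan], [1.5, 1.0], [0.5, 1.8]])
mu, sigma = em_gaussian_missing(X)
print(mu)
```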
- Learning Gaussian Mixtures with Generalised Linear Models: Precise Asymptotics in High-dimensions [79.35722941720734]
Generalised linear models for multi-class classification problems are one of the fundamental building blocks of modern machine learning tasks.
We prove exact asymptotics characterising the estimator in high dimensions via empirical risk minimisation.
We discuss how our theory can be applied beyond the scope of synthetic data.
arXiv Detail & Related papers (2021-06-07T16:53:56Z)
- Gaussian Mixture Reduction with Composite Transportation Divergence [15.687740538194413]
We propose a novel optimization-based GMR method based on composite transportation divergence (CTD).
We develop a majorization-minimization algorithm for computing the reduced mixture and establish its theoretical convergence.
Our unified framework empowers users to select the most appropriate cost function in CTD to achieve superior performance (a classic moment-matching merge baseline is sketched after this entry).
arXiv Detail & Related papers (2020-02-19T19:52:17Z)
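For context on Gaussian mixture reduction, the sketch below is a classic
moment-preserving pairwise merge with a deliberately crude mean-distance cost;
it is a baseline illustration, not the paper's CTD majorization-minimization
method.

```python
import numpy as np

def merge(w1, m1, v1, w2, m2, v2):
    """Moment-preserving merge of two 1-D Gaussian components."""
    w = w1 + w2
    m = (w1 * m1 + w2 * m2) / w
    # Preserve the second moment of the merged pair.
    v = (w1 * (v1 + m1**2) + w2 * (v2 + m2**2)) / w - m**2
    return w, m, v

def reduce_mixture(w, mu, var, target):
    comps = list(zip(w, mu, var))
    while len(comps) > target:
        # Greedily merge the pair whose means are closest (crude cost).
        i, j = min(((a, b) for a in range(len(comps)) for b in range(a + 1, len(comps))),
                   key=lambda ab: abs(comps[ab[0]][1] - comps[ab[1]][1]))
        merged = merge(*comps[i], *comps[j])
        comps = [c for k, c in enumerate(comps) if k not in (i, j)] + [merged]
    ws, ms, vs = zip(*comps)
    return np.array(ws), np.array(ms), np.array(vs)

w, mu, var = reduce_mixture([0.2, 0.3, 0.5], [0.0, 0.1, 3.0], [1.0, 1.0, 0.5], target=2)
print(w, mu, var)
```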
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.