FiMReSt: Finite Mixture of Multivariate Regulated Skew-t Kernels -- A
Flexible Probabilistic Model for Multi-Clustered Data with
Asymmetrically-Scattered Non-Gaussian Kernels
- URL: http://arxiv.org/abs/2305.09071v1
- Date: Mon, 15 May 2023 23:53:59 GMT
- Title: FiMReSt: Finite Mixture of Multivariate Regulated Skew-t Kernels -- A
Flexible Probabilistic Model for Multi-Clustered Data with
Asymmetrically-Scattered Non-Gaussian Kernels
- Authors: Sarmad Mehrdad, S. Farokh Atashzar
- Abstract summary: We propose a regularized iterative optimization process to train the mixture model, enhancing the generalizability and the power for modeling skewness.
The resulting mixture model is named Finite Mixture of Multivariate Regulated Skew-t (FiMReSt) Kernels.
To validate the performance, we have conducted a comprehensive experiment on several real-world datasets and a synthetic dataset.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, skew-t mixture models have been introduced as a flexible
probabilistic modeling technique taking into account both skewness in data
clusters and the statistical degree of freedom (S-DoF) to improve modeling
generalizability, and robustness to heavy tails and skewness. In this paper, we
show that the state-of-the-art skew-t mixture models fundamentally suffer from
a hidden phenomenon named here as "S-DoF explosion," which results in local
minima in the shapes of normal kernels during the non-convex iterative process
of expectation maximization. For the first time, this paper provides insights
into the instability of the S-DoF, which can result in the divergence of the
kernels from the mixture of t-distributions, losing generalizability and power
for modeling the outliers. Thus, in this paper, we propose a regularized
iterative optimization process to train the mixture model, enhancing the
generalizability and resiliency of the technique. The resulting mixture model
is named Finite Mixture of Multivariate Regulated Skew-t (FiMReSt) Kernels,
which stabilizes the S-DoF profile during the optimization process of learning. To
validate the performance, we have conducted a comprehensive experiment on
several real-world datasets and a synthetic dataset. The results highlight (a)
superior performance of the FiMReSt, (b) generalizability in the presence of
outliers, and (c) convergence of S-DoF.
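The abstract does not give the exact form of the regularization, so the following is only a minimal Python sketch of the idea: in the M-step of EM for a t/skew-t mixture, each kernel's statistical degree of freedom (S-DoF, nu) is updated by maximizing its expected complete-data log-likelihood term plus a penalty that discourages nu from growing without bound (the "S-DoF explosion"). The function name regulated_dof_update, the log penalty, and the parameters lam and nu_max are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np
from scipy.special import gammaln
from scipy.optimize import minimize_scalar

def regulated_dof_update(resp, e_u, e_log_u, nu_max=200.0, lam=1e-2):
    """One regulated M-step update of the S-DoF (nu) for a single t/skew-t
    mixture component (illustrative sketch, not the paper's exact update).

    resp     : (N,) responsibilities of this component from the E-step
    e_u      : (N,) posterior means E[u_i] of the latent gamma scale variables
    e_log_u  : (N,) posterior means E[log u_i]
    nu_max   : hard search bound on nu (assumption)
    lam      : strength of a hypothetical log penalty on nu (assumption)
    """
    def neg_penalized_q(nu):
        # Expected complete-data log-likelihood terms that depend on nu
        # (terms constant in nu are dropped; the maximizer is unchanged).
        q = np.sum(resp * (0.5 * nu * np.log(0.5 * nu)
                           - gammaln(0.5 * nu)
                           + 0.5 * nu * (e_log_u - e_u)))
        # The penalty grows with nu, so unbounded growth of the S-DoF toward
        # a plain Gaussian kernel is discouraged during the EM iterations.
        penalty = lam * np.sum(resp) * np.log(nu)
        return -(q - penalty)

    return minimize_scalar(neg_penalized_q, bounds=(0.1, nu_max),
                           method="bounded").x
```

In a full FiMReSt-style fit this update would replace the unregulated nu update inside each EM iteration; the specific penalty and its schedule are the paper's contribution and are not reproduced here.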
Related papers
- Marginalization Consistent Mixture of Separable Flows for Probabilistic Irregular Time Series Forecasting [4.714246221974192]
We develop a novel probabilistic irregular time series forecasting model, Marginalization Consistent Mixtures of Separable Flows (moses).
moses outperforms other state-of-the-art marginalization-consistent models and performs on par with ProFITi, but unlike ProFITi, it guarantees marginalization consistency.
arXiv Detail & Related papers (2024-06-11T13:28:43Z)
- Adaptive Fuzzy C-Means with Graph Embedding [84.47075244116782]
Fuzzy clustering algorithms can be roughly categorized into two main groups: Fuzzy C-Means (FCM) based methods and mixture model based methods.
We propose a novel FCM-based clustering model that is capable of automatically learning an appropriate membership degree hyperparameter value (a baseline FCM sketch follows below).
arXiv Detail & Related papers (2024-05-22T08:15:50Z)
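For context on the entry above, a plain (non-adaptive) Fuzzy C-Means iteration is sketched below; the fuzzifier m is the membership-degree hyperparameter that is fixed by hand here but learned automatically in the cited paper. The adaptation mechanism and the graph-embedding term are not reproduced, and all constants are arbitrary.

```python
import numpy as np

def fuzzy_c_means(X, c=3, m=2.0, n_iter=100, seed=0):
    """Baseline Fuzzy C-Means; m is the membership-degree hyperparameter."""
    rng = np.random.default_rng(seed)
    U = rng.dirichlet(np.ones(c), size=X.shape[0])        # (n, c) memberships
    for _ in range(n_iter):
        W = U ** m                                        # fuzzified memberships
        centers = (W.T @ X) / W.sum(axis=0)[:, None]      # (c, d) weighted means
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1) + 1e-12
        inv = d2 ** (-1.0 / (m - 1))                      # u_ik proportional to d_ik^(-2/(m-1))
        U = inv / inv.sum(axis=1, keepdims=True)
    return centers, U
```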
- Towards Theoretical Understandings of Self-Consuming Generative Models [56.84592466204185]
This paper tackles the emerging challenge of training generative models within a self-consuming loop.
We construct a theoretical framework to rigorously evaluate how this training procedure impacts the data distributions learned by future models.
We present results for kernel density estimation, delivering nuanced insights such as the impact of mixed data training on error propagation.
arXiv Detail & Related papers (2024-02-19T02:08:09Z)
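As a toy illustration of the self-consuming setting analyzed above, the sketch below repeatedly fits a kernel density estimate to a blend of synthetic samples from the previous generation's model and a smaller batch of fresh real data. The 80/20 mixing ratio and the Gaussian data source are arbitrary assumptions; the paper's theoretical error-propagation results are not reproduced.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=1000)            # ground-truth data source
model = gaussian_kde(real)                        # generation-0 model

for gen in range(1, 6):
    synthetic = model.resample(800)[0]            # samples from the current model
    fresh = rng.normal(0.0, 1.0, size=200)        # small batch of real data
    train = np.concatenate([synthetic, fresh])    # mixed-data training set
    model = gaussian_kde(train)                   # next-generation model
    # Track how the spread of the training data drifts across generations.
    print(f"generation {gen}: training-set std = {train.std():.3f}")
```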
- Stable Training of Probabilistic Models Using the Leave-One-Out Maximum Log-Likelihood Objective [0.7373617024876725]
Kernel density estimation (KDE) based models are popular choices for density estimation, but they fail to adapt to data regions with varying densities.
An adaptive KDE model is employed to circumvent this, where each kernel in the model has an individual bandwidth.
A modified expectation-maximization algorithm is employed to reliably accelerate the optimization (a minimal sketch of the adaptive, leave-one-out objective follows below).
arXiv Detail & Related papers (2023-10-05T14:08:42Z)
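The sketch below shows the two ingredients named in the entry above: a 1-D KDE in which every kernel has its own bandwidth, scored with a leave-one-out log-likelihood. The nearest-neighbour bandwidth heuristic in the example is an assumption for illustration only; the paper's modified expectation-maximization procedure is not reproduced.

```python
import numpy as np

def loo_log_likelihood(x, bandwidths):
    """Leave-one-out log-likelihood of a 1-D adaptive Gaussian KDE where the
    kernel centered at x[j] has its own bandwidth bandwidths[j]."""
    n = len(x)
    diff = x[:, None] - x[None, :]                 # pairwise differences (n, n)
    h = bandwidths[None, :]
    k = np.exp(-0.5 * (diff / h) ** 2) / (h * np.sqrt(2.0 * np.pi))
    np.fill_diagonal(k, 0.0)                       # leave each point out of its own estimate
    p_loo = k.sum(axis=1) / (n - 1)
    return np.log(p_loo + 1e-300).sum()

# Example: a density with regions of very different scale.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0.0, 0.3, 300), rng.normal(5.0, 2.0, 300)])
global_h = np.full(len(x), 0.5)                                       # one shared bandwidth
adaptive_h = np.sort(np.abs(x[:, None] - x[None, :]), axis=1)[:, 20]  # distance to 20th neighbour
print("global   :", loo_log_likelihood(x, global_h))
print("adaptive :", loo_log_likelihood(x, adaptive_h))
```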
- Learning Multivariate CDFs and Copulas using Tensor Factorization [39.24470798045442]
Learning the multivariate distribution of data is a core challenge in statistics and machine learning.
In this work, we aim to learn multivariate cumulative distribution functions (CDFs), as they can handle mixed random variables.
We show that any grid sampled version of a joint CDF of mixed random variables admits a universal representation as a naive Bayes model.
We demonstrate the superior performance of the proposed model in several synthetic and real datasets and applications including regression, sampling and data imputation.
arXiv Detail & Related papers (2022-10-13T16:18:46Z)
- Pseudo-Spherical Contrastive Divergence [119.28384561517292]
We propose pseudo-spherical contrastive divergence (PS-CD) to generalize maximum likelihood learning of energy-based models.
PS-CD avoids the intractable partition function and provides a generalized family of learning objectives.
arXiv Detail & Related papers (2021-11-01T09:17:15Z)
- Information Theoretic Structured Generative Modeling [13.117829542251188]
A novel generative model framework called the structured generative model (SGM) is proposed that makes straightforward optimization possible.
The implementation employs a single neural network driven by an orthonormal input to a single white noise source adapted to learn an infinite Gaussian mixture model.
Preliminary results show that SGM significantly improves MINE estimation in terms of data efficiency and variance, improves on conventional and variational Gaussian mixture models, and also helps in training adversarial networks.
arXiv Detail & Related papers (2021-10-12T07:44:18Z)
- Closed-form Continuous-Depth Models [99.40335716948101]
Continuous-depth neural models rely on advanced numerical differential equation solvers.
We present a new family of models, termed Closed-form Continuous-depth (CfC) networks, that are simple to describe and at least one order of magnitude faster.
arXiv Detail & Related papers (2021-06-25T22:08:51Z)
- Spectral Unmixing With Multinomial Mixture Kernel and Wasserstein Generative Adversarial Loss [4.56877715768796]
This study proposes a novel framework for spectral unmixing by using 1D convolution kernels and spectral uncertainty.
High-level representations are computed from data, and they are further modeled with the Multinomial Mixture Model.
Experiments are performed on both real and synthetic datasets.
arXiv Detail & Related papers (2020-12-12T16:49:01Z)
- Multiplicative noise and heavy tails in stochastic optimization [62.993432503309485]
Stochastic optimization is central to modern machine learning, but the precise role of the stochasticity in its success is still unclear.
We show that multiplicative noise commonly arises in the parameter updates due to minibatch variance, leading to heavy tails.
A detailed analysis is conducted of the key factors, including step size and data, with similar behavior observed on state-of-the-art neural network models (a toy simulation follows below).
arXiv Detail & Related papers (2020-06-11T09:58:01Z)
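As a toy illustration of the entry above, the sketch below runs a linear Kesten-type recursion, a common stand-in for SGD-like dynamics with multiplicative noise, and compares its tails with an additive-noise-only counterpart that has the same average contraction. All constants are arbitrary; nothing here reproduces the paper's analysis.

```python
import numpy as np

rng = np.random.default_rng(0)
T, n = 20000, 2000                     # iterations, independent chains
x_mult = np.zeros(n)                   # multiplicative + additive noise
x_add = np.zeros(n)                    # additive noise only

for _ in range(T):
    a = 0.95 + 0.2 * rng.standard_normal(n)   # random contraction factor (multiplicative noise)
    b = 0.05 * rng.standard_normal(n)          # additive noise
    x_mult = a * x_mult + b
    x_add = 0.95 * x_add + b

def excess_kurtosis(v):
    v = v - v.mean()
    return (v ** 4).mean() / (v ** 2).mean() ** 2 - 3.0

# The multiplicative-noise chain develops much heavier tails (large kurtosis)
# even though both chains contract by 0.95 per step on average.
print("additive only  :", excess_kurtosis(x_add))    # near 0 (approximately Gaussian)
print("multiplicative :", excess_kurtosis(x_mult))   # large positive value
```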
- Training Deep Energy-Based Models with f-Divergence Minimization [113.97274898282343]
Deep energy-based models (EBMs) are very flexible in distribution parametrization but computationally challenging.
We propose a general variational framework termed f-EBM to train EBMs using any desired f-divergence.
Experimental results demonstrate the superiority of f-EBM over contrastive divergence, as well as the benefits of training EBMs using f-divergences other than KL.
arXiv Detail & Related papers (2020-03-06T23:11:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.