Analysis of Estimating the Bayes Rule for Gaussian Mixture Models with a
Specified Missing-Data Mechanism
- URL: http://arxiv.org/abs/2210.13785v2
- Date: Fri, 29 Dec 2023 14:05:33 GMT
- Title: Analysis of Estimating the Bayes Rule for Gaussian Mixture Models with a
Specified Missing-Data Mechanism
- Authors: Ziyang Lyu
- Abstract summary: Semi-supervised learning (SSL) approaches have been successfully applied in a wide range of engineering and scientific fields.
This paper investigates the generative model framework with a missingness mechanism for unclassified observations.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Semi-supervised learning (SSL) approaches have been successfully applied in a
wide range of engineering and scientific fields. This paper investigates the
generative model framework with a missingness mechanism for unclassified
observations, as introduced by Ahfock and McLachlan (2020). We show that in a
partially classified sample, a classifier using the Bayes rule of allocation with a
missing-data mechanism can surpass a fully supervised classifier in a two-class
normal homoscedastic model, especially with moderate to low overlap and
proportion of missing class labels, or with large overlap but few missing
labels. It also outperforms a classifier with no missing-data mechanism
regardless of the overlap region or the proportion of missing class labels. Our
exploration of two- and three-component normal mixture models with unequal
covariances through simulations further corroborates our findings. Finally, we
illustrate the use of the proposed classifier with a missing-data mechanism on
interneuronal and skin lesion datasets.
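For intuition, below is a minimal sketch (not taken from the paper) of the Bayes rule of allocation for a two-class homoscedastic normal model, together with one common way to specify a missing-label mechanism in the spirit of Ahfock and McLachlan (2020), where the probability that a class label is missing is logistic in the log entropy of the class posteriors. All parameter values, and the exact form of the mechanism, are illustrative assumptions rather than the paper's specification.
```python
# Sketch: Bayes rule of allocation for a two-class homoscedastic Gaussian
# mixture, plus an entropy-based logistic model for label missingness.
# Parameter values are illustrative only.
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)

pi = np.array([0.6, 0.4])                   # mixing proportions (assumed)
mu = np.array([[0.0, 0.0], [2.0, 1.0]])     # class means (assumed)
Sigma = np.array([[1.0, 0.3], [0.3, 1.0]])  # common covariance (homoscedastic)

def posteriors(x):
    """Class posterior probabilities tau_i(x) under the two-component mixture."""
    dens = np.array([pi[i] * multivariate_normal.pdf(x, mean=mu[i], cov=Sigma)
                     for i in range(2)])
    return dens / dens.sum()

def bayes_allocate(x):
    """Bayes rule of allocation: assign x to the class with the largest posterior."""
    return int(np.argmax(posteriors(x)))

def prob_label_missing(x, xi0=-1.0, xi1=1.0):
    """Hypothetical missingness mechanism: logistic in the log Shannon entropy
    of the posteriors, so ambiguous observations are more likely unlabelled."""
    tau = posteriors(x)
    entropy = -np.sum(tau * np.log(np.clip(tau, 1e-12, None)))
    return 1.0 / (1.0 + np.exp(-(xi0 + xi1 * np.log(entropy + 1e-12))))

x_new = rng.normal(size=2)
print("allocated class:", bayes_allocate(x_new))
print("P(label missing | x):", prob_label_missing(x_new))
```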
Related papers
- Parametric Classification for Generalized Category Discovery: A Baseline
Study [70.73212959385387]
Generalized Category Discovery (GCD) aims to discover novel categories in unlabelled datasets using knowledge learned from labelled samples.
We investigate the failure of parametric classifiers, verify the effectiveness of previous design choices when high-quality supervision is available, and identify unreliable pseudo-labels as a key problem.
We propose a simple yet effective parametric classification method that benefits from entropy regularisation, achieves state-of-the-art performance on multiple GCD benchmarks and shows strong robustness to unknown class numbers.
arXiv Detail & Related papers (2022-11-21T18:47:11Z) - Imbalanced Classification via a Tabular Translation GAN [4.864819846886142]
We present a model based on Generative Adversarial Networks which uses additional regularization losses to map majority samples to corresponding synthetic minority samples.
We show that the proposed method improves average precision when compared to alternative re-weighting and oversampling techniques.
arXiv Detail & Related papers (2022-04-19T06:02:53Z) - Learning Gaussian Mixtures with Generalised Linear Models: Precise
Asymptotics in High-dimensions [79.35722941720734]
Generalised linear models for multi-class classification problems are one of the fundamental building blocks of modern machine learning tasks.
We prove exact asymptotics characterising the estimator in high dimensions via empirical risk minimisation.
We discuss how our theory can be applied beyond the scope of synthetic data.
arXiv Detail & Related papers (2021-06-07T16:53:56Z) - Deep Generative Pattern-Set Mixture Models for Nonignorable Missingness [0.0]
We propose a variational autoencoder architecture to model both ignorable and nonignorable missing data.
Our model explicitly learns to cluster the missing data into missingness pattern sets based on the observed data and missingness masks.
Our setup trades off the characteristics of ignorable and nonignorable missingness and can thus be applied to data of both types.
arXiv Detail & Related papers (2021-03-05T08:21:35Z) - Entropy-Based Uncertainty Calibration for Generalized Zero-Shot Learning [49.04790688256481]
The goal of generalized zero-shot learning (GZSL) is to recognise both seen and unseen classes.
Most GZSL methods typically learn to synthesise visual representations from semantic information on the unseen classes.
We propose a novel framework that leverages dual variational autoencoders with a triplet loss to learn discriminative latent features.
arXiv Detail & Related papers (2021-01-09T05:21:27Z) - Robust Finite Mixture Regression for Heterogeneous Targets [70.19798470463378]
We propose a finite mixture regression (FMR) model that finds sample clusters and jointly models multiple incomplete mixed-type targets.
We provide non-asymptotic oracle performance bounds for our model under a high-dimensional learning framework.
The results show that our model can achieve state-of-the-art performance.
arXiv Detail & Related papers (2020-10-12T03:27:07Z) - Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z) - Open-Set Recognition with Gaussian Mixture Variational Autoencoders [91.3247063132127]
At inference time, open-set classification either assigns a sample to a known class seen during training or rejects it as an unknown class.
We train our model to cooperatively learn reconstruction and perform class-based clustering in the latent space.
Our model achieves more accurate and robust open-set classification results, with an average F1 improvement of 29.5%.
arXiv Detail & Related papers (2020-06-03T01:15:19Z)