Improved Design of Quadratic Discriminant Analysis Classifier in
Unbalanced Settings
- URL: http://arxiv.org/abs/2006.06355v3
- Date: Mon, 14 Sep 2020 11:30:27 GMT
- Title: Improved Design of Quadratic Discriminant Analysis Classifier in
Unbalanced Settings
- Authors: Amine Bejaoui, Khalil Elkhalil, Abla Kammoun, Mohamed-Slim Alouini,
Tarek Al-Naffouri
- Abstract summary: The use of quadratic discriminant analysis (QDA) or its regularized version (R-QDA) for classification is often not recommended.
We propose an improved R-QDA that is based on the use of two regularization parameters and a modified bias.
- Score: 19.763768111774134
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The use of quadratic discriminant analysis (QDA) or its regularized version
(R-QDA) for classification is often not recommended, due to its
well-acknowledged high sensitivity to the estimation noise of the covariance
matrix. This becomes all the more the case in unbalanced data settings for
which it has been found that R-QDA becomes equivalent to the classifier that
assigns all observations to the same class. In this paper, we propose an
improved R-QDA that is based on the use of two regularization parameters and a
modified bias, properly chosen to avoid inappropriate behaviors of R-QDA in
unbalanced settings and to ensure the best possible classification performance.
The design of the proposed classifier builds on a refined asymptotic analysis
of its performance in the regime where the number of samples and the number of
features grow large simultaneously, which makes it possible to cope efficiently
with the high dimensionality frequently encountered in the big data paradigm.
The performance of the proposed classifier is assessed on both real and
synthetic data sets and is shown to be substantially better than that of a
traditional R-QDA.
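As a rough sketch of the idea (not the authors' exact design, which selects its parameters from the asymptotic analysis), regularized QDA with a per-class shrinkage parameter and an adjustable decision bias can be written as follows; the shrinkage values `gamma0`, `gamma1` and the `bias` offset below are illustrative hyperparameters:

```python
import numpy as np

def fit_rqda(X0, X1, gamma0=0.1, gamma1=0.1):
    """Fit a regularized QDA model with one shrinkage parameter per class.

    Each class covariance is shrunk toward the identity,
        C_k = (1 - gamma_k) * Cov_k + gamma_k * I,
    an illustrative regularization choice; the paper derives the two
    parameters from its asymptotic analysis rather than fixing them.
    """
    params = {}
    for k, (X, g) in enumerate([(X0, gamma0), (X1, gamma1)]):
        mu = X.mean(axis=0)
        cov = np.cov(X, rowvar=False)
        p = X.shape[1]
        C = (1.0 - g) * cov + g * np.eye(p)
        _, logdet = np.linalg.slogdet(C)
        params[k] = (mu, np.linalg.inv(C), logdet)
    return params

def predict_rqda(x, params, bias=0.0):
    """Assign x to the class with the larger regularized QDA score.

    `bias` stands in for the modified bias term: shifting the decision
    threshold counteracts the collapse to a single class that plain
    R-QDA exhibits in unbalanced settings.
    """
    scores = []
    for k in (0, 1):
        mu, Cinv, logdet = params[k]
        d = x - mu
        scores.append(-0.5 * logdet - 0.5 * d @ Cinv @ d)
    return int(scores[1] - scores[0] + bias > 0)
```

The per-class shrinkage is what distinguishes this from single-parameter R-QDA: the minority class, whose covariance estimate is noisier, can be regularized more heavily than the majority class.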
Related papers
- Balanced Classification: A Unified Framework for Long-Tailed Object
Detection [74.94216414011326]
Conventional detectors suffer from performance degradation when dealing with long-tailed data due to a classification bias towards the majority head categories.
We introduce a unified framework called BAlanced CLassification (BACL), which enables adaptive rectification of inequalities caused by disparities in category distribution.
BACL consistently achieves performance improvements across various datasets with different backbones and architectures.
arXiv Detail & Related papers (2023-08-04T09:11:07Z)
- Parametric Classification for Generalized Category Discovery: A Baseline
Study [70.73212959385387]
Generalized Category Discovery (GCD) aims to discover novel categories in unlabelled datasets using knowledge learned from labelled samples.
We investigate the failure of parametric classifiers, verify the effectiveness of previous design choices when high-quality supervision is available, and identify unreliable pseudo-labels as a key problem.
We propose a simple yet effective parametric classification method that benefits from entropy regularisation, achieves state-of-the-art performance on multiple GCD benchmarks and shows strong robustness to unknown class numbers.
arXiv Detail & Related papers (2022-11-21T18:47:11Z)
- Adaptive Dimension Reduction and Variational Inference for Transductive
Few-Shot Classification [2.922007656878633]
We propose a new clustering method based on Variational Bayesian inference, further improved by Adaptive Dimension Reduction.
Our proposed method significantly improves accuracy in the realistic unbalanced transductive setting on various Few-Shot benchmarks.
arXiv Detail & Related papers (2022-09-18T10:29:02Z)
- Ensemble Classifier Design Tuned to Dataset Characteristics for Network
Intrusion Detection [0.0]
Two new algorithms are proposed to address the class overlap issue in the dataset.
The proposed design is evaluated for both binary and multi-category classification.
arXiv Detail & Related papers (2022-05-08T21:06:42Z)
- When in Doubt: Improving Classification Performance with Alternating
Normalization [57.39356691967766]
We introduce Classification with Alternating Normalization (CAN), a non-parametric post-processing step for classification.
CAN improves classification accuracy for challenging examples by re-adjusting their predicted class probability distribution.
We empirically demonstrate its effectiveness across a diverse set of classification tasks.
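The alternating-normalization idea can be illustrated with a minimal Sinkhorn-style sketch (a simplification for intuition, not the exact CAN procedure): given a batch of predicted class-probability rows, alternately rescale the class columns toward a target prior and renormalize each row to a valid distribution.

```python
import numpy as np

def alternating_normalization(P, prior, n_iter=10):
    """Sinkhorn-style post-processing of a probability matrix.

    P: (n_samples, n_classes) array of predicted probabilities.
    prior: target marginal class distribution.
    Alternately rescales columns toward the prior and renormalizes
    rows so that each remains a probability distribution.
    """
    P = P.copy()
    for _ in range(n_iter):
        # Column step: pull the average class mass toward the prior.
        P *= prior / P.mean(axis=0)
        # Row step: restore each row to sum to one.
        P /= P.sum(axis=1, keepdims=True)
    return P
```

The effect is to redistribute probability mass away from over-predicted classes, which is why such post-processing helps on examples near the decision boundary.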
arXiv Detail & Related papers (2021-09-28T02:55:42Z)
- High-Dimensional Quadratic Discriminant Analysis under Spiked Covariance
Model [101.74172837046382]
We propose a novel quadratic classification technique, the parameters of which are chosen such that the Fisher discriminant ratio is maximized.
Numerical simulations show that the proposed classifier not only outperforms the classical R-QDA for both synthetic and real data but also requires lower computational complexity.
arXiv Detail & Related papers (2020-06-25T12:00:26Z)
- Pairwise Supervised Hashing with Bernoulli Variational Auto-Encoder and
Self-Control Gradient Estimator [62.26981903551382]
Variational auto-encoders (VAEs) with binary latent variables provide state-of-the-art performance in terms of precision for document retrieval.
We propose a pairwise loss function with discrete latent VAE to reward within-class similarity and between-class dissimilarity for supervised hashing.
This new semantic hashing framework achieves superior performance compared to the state of the art.
arXiv Detail & Related papers (2020-05-21T06:11:33Z)
- A Doubly Regularized Linear Discriminant Analysis Classifier with
Automatic Parameter Selection [24.027886914804775]
Linear discriminant analysis (LDA) based classifiers tend to falter in many practical settings where the training data size is smaller than, or comparable to, the number of features.
We propose a doubly regularized LDA classifier that we denote as R2LDA.
Results obtained from both synthetic and real data demonstrate the consistency and effectiveness of the proposed R2LDA approach.
arXiv Detail & Related papers (2020-04-28T07:09:22Z)
- Asymptotic Analysis of an Ensemble of Randomly Projected Linear
Discriminants [94.46276668068327]
In [1], an ensemble of randomly projected linear discriminants is used to classify datasets.
We develop a consistent estimator of the misclassification probability as an alternative to the computationally-costly cross-validation estimator.
We also demonstrate the use of our estimator for tuning the projection dimension on both real and synthetic data.
arXiv Detail & Related papers (2020-04-17T12:47:04Z)
- Robust Generalised Quadratic Discriminant Analysis [6.308539010172309]
The classification rule in GQDA is based on the sample mean vector and the sample dispersion matrix of a training sample, which are extremely non-robust under data contamination.
The present paper investigates the performance of the GQDA classifier when the classical estimators of the mean vector and the dispersion matrix used therein are replaced by various robust counterparts.
arXiv Detail & Related papers (2020-04-11T18:21:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.