Robust Generalised Quadratic Discriminant Analysis
- URL: http://arxiv.org/abs/2004.06568v1
- Date: Sat, 11 Apr 2020 18:21:06 GMT
- Title: Robust Generalised Quadratic Discriminant Analysis
- Authors: Abhik Ghosh, Rita SahaRay, Sayan Chakrabarty, Sayan Bhadra
- Abstract summary: The classification rule in GQDA is based on the sample mean vector and the sample dispersion matrix of a training sample, which are extremely non-robust under data contamination.
The present paper investigates the performance of the GQDA classifier when the classical estimators of the mean vector and the dispersion matrix used therein are replaced by various robust counterparts.
- Score: 6.308539010172309
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Quadratic discriminant analysis (QDA) is a widely used statistical tool to
classify observations from different multivariate Normal populations. The
generalized quadratic discriminant analysis (GQDA) classification
rule/classifier, which generalizes the QDA and the minimum Mahalanobis distance
(MMD) classifiers to discriminate between populations with underlying
elliptically symmetric distributions, competes quite favorably with the QDA
classifier when the latter is optimal, and performs much better when QDA fails
under non-Normal underlying distributions, e.g., the Cauchy distribution.
However, the classification rule in GQDA is based on the sample mean vector and
the sample dispersion matrix of a training sample, both of which are extremely
non-robust under data contamination. Since real-world data are often highly
vulnerable to outliers, the lack of robustness of the classical estimators of
the mean vector and the dispersion matrix significantly reduces the efficiency
of the GQDA classifier and increases its misclassification errors. The present
paper investigates the performance of the GQDA classifier when the classical
estimators of the mean vector and the dispersion matrix used therein are
replaced by various robust counterparts. Applications to various real data sets
as well as simulation studies reveal far better performance of the proposed
robust versions of the GQDA classifier. A comparative study has been made to
advocate the appropriate choice of robust estimators for a given degree of
contamination of the data sets.
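The core idea of the paper — replacing the sample mean vector and dispersion matrix inside a quadratic discriminant rule with robust counterparts — can be sketched as follows. This is an illustrative sketch only, not the authors' GQDA rule: it uses scikit-learn's Minimum Covariance Determinant (MCD) estimator as one possible robust choice and the standard Gaussian QDA discriminant score rather than the generalized elliptical-family rule; the helper names `fit_robust_qda` and `predict_robust_qda` are hypothetical.

```python
import numpy as np
from sklearn.covariance import MinCovDet

def fit_robust_qda(X, y, random_state=0):
    """Per-class robust location/scatter via the MCD estimator,
    substituted for the classical sample mean and covariance."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        mcd = MinCovDet(random_state=random_state).fit(Xc)
        cov = mcd.covariance_
        params[c] = {
            "mean": mcd.location_,
            "prec": np.linalg.inv(cov),
            "logdet": np.linalg.slogdet(cov)[1],
            "logprior": np.log(len(Xc) / len(X)),
        }
    return params

def predict_robust_qda(params, X):
    """Score each class with the quadratic discriminant
    -0.5*(x-mu)' S^{-1} (x-mu) - 0.5*log|S| + log prior; take argmax."""
    classes = sorted(params)
    scores = np.empty((len(X), len(classes)))
    for j, c in enumerate(classes):
        p = params[c]
        d = X - p["mean"]
        maha = np.einsum("ij,jk,ik->i", d, p["prec"], d)
        scores[:, j] = -0.5 * maha - 0.5 * p["logdet"] + p["logprior"]
    return np.asarray(classes)[np.argmax(scores, axis=1)]

# Demo: two well-separated Gaussian classes, with gross outliers planted
# in class 0 to contaminate its classical mean/covariance estimates.
rng = np.random.default_rng(0)
X0 = rng.normal([0.0, 0.0], 1.0, size=(100, 2))
X1 = rng.normal([4.0, 4.0], 1.0, size=(100, 2))
X0[:5] = [50.0, 50.0]  # contamination that would wreck the sample mean
X = np.vstack([X0, X1])
y = np.array([0] * 100 + [1] * 100)

params = fit_robust_qda(X, y)
acc = (predict_robust_qda(params, X[5:]) == y[5:]).mean()  # skip the planted outliers
print(acc)
```

Because the MCD estimator downweights the planted outliers, the fitted location and scatter for class 0 stay close to the clean-data values, so the discriminant boundary is not dragged toward the contamination the way it would be with classical estimates.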
Related papers
- Compound Batch Normalization for Long-tailed Image Classification [77.42829178064807]
We propose a compound batch normalization method based on a Gaussian mixture.
It can model the feature space more comprehensively and reduce the dominance of head classes.
The proposed method outperforms existing methods on long-tailed image classification.
arXiv Detail & Related papers (2022-12-02T07:31:39Z) - Parametric Classification for Generalized Category Discovery: A Baseline Study [70.73212959385387]
Generalized Category Discovery (GCD) aims to discover novel categories in unlabelled datasets using knowledge learned from labelled samples.
We investigate the failure of parametric classifiers, verify the effectiveness of previous design choices when high-quality supervision is available, and identify unreliable pseudo-labels as a key problem.
We propose a simple yet effective parametric classification method that benefits from entropy regularisation, achieves state-of-the-art performance on multiple GCD benchmarks and shows strong robustness to unknown class numbers.
arXiv Detail & Related papers (2022-11-21T18:47:11Z) - Divide-and-Conquer Hard-thresholding Rules in High-dimensional Imbalanced Classification [1.0312968200748118]
We study the impact of imbalance class sizes on the linear discriminant analysis (LDA) in high dimensions.
We show that, due to data scarcity in one class (referred to as the minority class), the LDA ignores the minority class, yielding a maximal misclassification rate for it.
We propose a new construction of a hard-thresholding rule based on a divide-and-conquer technique that reduces the large difference between the misclassification rates.
arXiv Detail & Related papers (2021-11-05T07:44:28Z) - Self-Weighted Robust LDA for Multiclass Classification with Edge Classes [111.5515086563592]
A novel self-weighted robust LDA with an l2,1-norm based between-class distance criterion, called SWRLDA, is proposed for multi-class classification.
The proposed SWRLDA is easy to implement, and converges fast in practice.
arXiv Detail & Related papers (2020-09-24T12:32:55Z) - High-Dimensional Quadratic Discriminant Analysis under Spiked Covariance Model [101.74172837046382]
We propose a novel quadratic classification technique, the parameters of which are chosen such that the Fisher discriminant ratio is maximized.
Numerical simulations show that the proposed classifier not only outperforms the classical R-QDA for both synthetic and real data but also requires lower computational complexity.
arXiv Detail & Related papers (2020-06-25T12:00:26Z) - Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z) - Improved Design of Quadratic Discriminant Analysis Classifier in Unbalanced Settings [19.763768111774134]
Quadratic discriminant analysis (QDA), or its regularized version (R-QDA), is often not recommended for classification in unbalanced settings.
We propose an improved R-QDA that is based on the use of two regularization parameters and a modified bias.
arXiv Detail & Related papers (2020-06-11T12:17:05Z) - A Compressive Classification Framework for High-Dimensional Data [12.284934135116515]
We propose a compressive classification framework for settings where the data dimensionality is significantly higher than the sample size.
The proposed method, referred to as compressive regularized discriminant analysis (CRDA), is based on linear discriminant analysis.
It has the ability to select significant features by using joint-sparsity promoting hard thresholding in the discriminant rule.
arXiv Detail & Related papers (2020-05-09T06:55:00Z) - Saliency-based Weighted Multi-label Linear Discriminant Analysis [101.12909759844946]
We propose a new variant of Linear Discriminant Analysis (LDA) to solve multi-label classification tasks.
The proposed method is based on a probabilistic model for defining the weights of individual samples.
The Saliency-based weighted Multi-label LDA approach is shown to lead to performance improvements in various multi-label classification problems.
arXiv Detail & Related papers (2020-04-08T19:40:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.