FEMDA: Une méthode de classification robuste et flexible
- URL: http://arxiv.org/abs/2307.01954v1
- Date: Tue, 4 Jul 2023 23:15:31 GMT
- Title: FEMDA: Une méthode de classification robuste et flexible
- Authors: Pierre Houdouin and Matthieu Jonckheere and Frederic Pascal
- Abstract summary: This paper studies the robustness of a new discriminant analysis technique to scale changes in the data.
The derived decision rule is simple, fast, and robust to scale changes in the data compared to other state-of-the-art methods.
- Score: 0.8594140167290096
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Linear and Quadratic Discriminant Analysis (LDA and QDA) are well-known
classical methods but can heavily suffer from non-Gaussian distributions and/or
contaminated datasets, mainly because of the underlying Gaussian assumption
that is not robust. This paper studies the robustness to scale changes in the
data of a new discriminant analysis technique where each data point is drawn by
its own arbitrary Elliptically Symmetrical (ES) distribution and its own
arbitrary scale parameter. Such a model allows for possibly very heterogeneous,
independent but non-identically distributed samples. The derived decision rule
is simple, fast, and robust to scale changes in the data compared to other
state-of-the-art methods.
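As a rough illustration of the scale-robustness idea, the sketch below contrasts the classical QDA log-discriminant with a variant that scores each class by the logarithm of the Mahalanobis distance, which damps the effect of an arbitrary per-point scale factor. This is an illustrative simplification, not the paper's exact method: FEMDA additionally relies on robust parameter estimators, and the precise decision rule is derived in the paper.

```python
import numpy as np

def fit_class_params(X, y):
    """Per-class sample mean and covariance. (Simplified sketch: the paper
    estimates these with robust M-estimators, not plain sample moments.)"""
    params = {}
    for k in np.unique(y):
        Xk = X[y == k]
        params[int(k)] = (Xk.mean(axis=0), np.cov(Xk, rowvar=False))
    return params

def qda_score(x, mu, cov, m):
    """Classical QDA log-discriminant under the Gaussian assumption."""
    d = x - mu
    maha = d @ np.linalg.solve(cov, d)
    return -0.5 * np.linalg.slogdet(cov)[1] - 0.5 * maha

def scale_invariant_score(x, mu, cov, m):
    """Scale-robust variant in the spirit of the paper's rule: scoring with
    the logarithm of the Mahalanobis distance reduces sensitivity to an
    arbitrary positive scale factor on each data point."""
    d = x - mu
    maha = d @ np.linalg.solve(cov, d)
    return -0.5 * np.linalg.slogdet(cov)[1] - 0.5 * m * np.log(maha)

def predict(X, params, score):
    """Assign each row of X to the class with the highest score."""
    m = X.shape[1]
    keys = sorted(params)
    scores = np.array([[score(x, *params[k], m) for k in keys] for x in X])
    return np.array(keys)[scores.argmax(axis=1)]
```

On well-separated data both rules agree; the log-distance variant is the one that remains stable when individual samples are rescaled.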
Related papers
- Collaborative Heterogeneous Causal Inference Beyond Meta-analysis [68.4474531911361]
We propose a collaborative inverse propensity score estimator for causal inference with heterogeneous data.
Our method shows significant improvements over the methods based on meta-analysis when heterogeneity increases.
arXiv Detail & Related papers (2024-04-24T09:04:36Z)
- Anomaly Detection Under Uncertainty Using Distributionally Robust Optimization Approach [0.9217021281095907]
Anomaly detection is defined as the problem of finding data points that do not follow the patterns of the majority.
The one-class Support Vector Machines (SVM) method aims to find a decision boundary to distinguish between normal data points and anomalies.
A distributionally robust chance-constrained model is proposed in which the probability of misclassification is low.
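The paper builds a distributionally robust chance-constrained one-class SVM; as a much simpler numpy stand-in for the general idea of flagging points that do not follow the majority, one can score each point by its Mahalanobis distance from the bulk of the data. The function names and the quantile threshold here are illustrative, not the paper's method.

```python
import numpy as np

def mahalanobis_anomaly_scores(X):
    """Distance of each point from the bulk of the data.
    (Illustrative stand-in; the paper uses a distributionally robust
    chance-constrained one-class SVM, not this detector.)"""
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    diff = X - mu
    # Quadratic form diff_i^T cov^{-1} diff_i for every row at once
    return np.sqrt(np.einsum('ij,jk,ik->i', diff, np.linalg.inv(cov), diff))

def flag_anomalies(X, quantile=0.95):
    """Flag points whose score exceeds the given empirical quantile."""
    s = mahalanobis_anomaly_scores(X)
    return s > np.quantile(s, quantile)
```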
arXiv Detail & Related papers (2023-12-03T06:13:22Z)
- FEMDA: a unified framework for discriminant analysis [4.6040036610482655]
We present a novel approach to deal with non-Gaussian datasets.
The model considered is an arbitrary Elliptically Symmetrical (ES) distribution per cluster, each with its own arbitrary scale parameter.
By deriving a new decision rule, we demonstrate that maximum-likelihood parameter estimation and classification are simple, efficient, and robust compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-11-13T17:59:37Z)
- Sketched Gaussian Model Linear Discriminant Analysis via the Randomized Kaczmarz Method [7.593861427248019]
We present sketched linear discriminant analysis, an iterative randomized approach to binary-class Gaussian model linear discriminant analysis (LDA) for very large data.
We harness a least squares formulation and a gradient descent framework.
We present convergence guarantees for the sketched predictions on new data within a fixed number of iterations.
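The randomized Kaczmarz method at the core of this approach is easy to sketch: repeatedly project the current iterate onto the solution hyperplane of one randomly chosen row. The minimal version below handles a consistent linear system; the paper's sketched-LDA setting and its convergence guarantees are more involved (inconsistent least-squares problems require the extended variant of the method).

```python
import numpy as np

def randomized_kaczmarz(A, b, iters=3000, seed=0):
    """Randomized Kaczmarz for a consistent system Ax = b: at each step,
    project the iterate onto the hyperplane of one row, chosen with
    probability proportional to its squared norm (Strohmer-Vershynin
    sampling)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    row_norms = np.einsum('ij,ij->i', A, A)
    probs = row_norms / row_norms.sum()
    x = np.zeros(n)
    for _ in range(iters):
        i = rng.choice(m, p=probs)
        # Orthogonal projection onto {z : A[i] @ z = b[i]}
        x += (b[i] - A[i] @ x) / row_norms[i] * A[i]
    return x
```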
arXiv Detail & Related papers (2022-11-10T18:29:36Z)
- Predicting Out-of-Domain Generalization with Neighborhood Invariance [59.05399533508682]
We propose a measure of a classifier's output invariance in a local transformation neighborhood.
Our measure is simple to calculate, does not depend on the test point's true label, and can be applied even in out-of-domain (OOD) settings.
In experiments on benchmarks in image classification, sentiment analysis, and natural language inference, we demonstrate a strong and robust correlation between our measure and actual OOD generalization.
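A generic version of such an invariance measure is straightforward to compute: apply a set of transformations to a test point and record how often the classifier's prediction stays unchanged, which needs no true label. The sketch below is illustrative; the paper's exact definition of the transformation neighborhood may differ.

```python
import numpy as np

def neighborhood_invariance(predict, x, transforms):
    """Fraction of transformed copies of x on which the classifier keeps
    its original prediction. Label-free, so it can be evaluated even on
    out-of-domain test points. (Illustrative sketch, not the paper's
    exact measure.)"""
    base = predict(x)
    return np.mean([predict(t(x)) == base for t in transforms])
```

For example, a classifier that is invariant to small shifts but flips under sign reversal scores 2/3 on a neighborhood containing one shift, one negation, and one rescaling.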
arXiv Detail & Related papers (2022-07-05T14:55:16Z)
- Equivariance Discovery by Learned Parameter-Sharing [153.41877129746223]
We study how to discover interpretable equivariances from data.
Specifically, we formulate this discovery process as an optimization problem over a model's parameter-sharing schemes.
Also, we theoretically analyze the method for Gaussian data and provide a bound on the mean squared gap between the studied discovery scheme and the oracle scheme.
arXiv Detail & Related papers (2022-04-07T17:59:19Z)
- A Robust and Flexible EM Algorithm for Mixtures of Elliptical Distributions with Missing Data [71.9573352891936]
This paper tackles the problem of missing data imputation for noisy and non-Gaussian data.
A new EM algorithm is investigated for mixtures of elliptical distributions with the property of handling potential missing data.
Experimental results on synthetic data demonstrate that the proposed algorithm is robust to outliers and can be used with non-Gaussian data.
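For orientation, the basic E/M alternation that the paper generalizes can be sketched on a plain two-component 1-D Gaussian mixture. The paper's algorithm replaces Gaussians with elliptical distributions and adds handling of missing entries; none of that is shown in this simplified sketch.

```python
import numpy as np

def em_gmm_1d(x, iters=100):
    """Plain EM for a two-component 1-D Gaussian mixture.
    (Simplified baseline; the paper extends EM to mixtures of elliptical
    distributions with potential missing data.)"""
    # Crude initialization from the data quantiles
    mu = np.array([np.quantile(x, 0.25), np.quantile(x, 0.75)])
    var = np.array([x.var(), x.var()])
    pi = np.array([0.5, 0.5])
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each point
        dens = pi * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) \
               / np.sqrt(2 * np.pi * var)
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: responsibility-weighted parameter updates
        nk = r.sum(axis=0)
        pi = nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    return pi, mu, var
```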
arXiv Detail & Related papers (2022-01-28T10:01:37Z)
- Robust classification with flexible discriminant analysis in heterogeneous data [0.7646713951724009]
This paper presents a new robust discriminant analysis where each data point is drawn by its own arbitrary scale parameter.
It is shown that maximum-likelihood parameter estimation and classification are very simple, fast and robust compared to state-of-the-art methods.
arXiv Detail & Related papers (2022-01-09T09:22:56Z)
- Graph Embedding with Data Uncertainty [113.39838145450007]
Spectral-based subspace learning is a common data preprocessing step in many machine learning pipelines.
Most subspace learning methods do not take into consideration possible measurement inaccuracies or artifacts that can lead to data with high uncertainty.
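The simplest instance of spectral-based subspace learning is PCA: project the centered data onto the top eigenvectors of the sample covariance. The sketch below is the standard noise-free baseline that the paper's uncertainty-aware formulation extends.

```python
import numpy as np

def pca_subspace(X, k):
    """Project centered data onto the top-k eigenvectors of the sample
    covariance. (Standard baseline; the paper models measurement
    uncertainty in X, which this sketch ignores.)"""
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    W = evecs[:, ::-1][:, :k]            # top-k principal directions
    return Xc @ W, W
```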
arXiv Detail & Related papers (2020-09-01T15:08:23Z)
- Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.