Asymptotic Supervised Predictive Classifiers under Partition Exchangeability
- URL: http://arxiv.org/abs/2101.10950v1
- Date: Tue, 26 Jan 2021 17:17:40 GMT
- Title: Asymptotic Supervised Predictive Classifiers under Partition Exchangeability
- Authors: Ali Amiryousefi
- Abstract summary: The result shows the convergence of these classifiers as the amount of training or test data grows without bound.
This is an important result from a practical perspective, since in the presence of a sufficiently large amount of data one can replace the computationally more expensive simultaneous classifier with the simpler marginal one.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The convergence of simultaneous and marginal predictive classifiers under
partition exchangeability in supervised classification is obtained. The result
shows the asymptotic convergence of these classifiers as the amount of training
or test data tends to infinity, such that after observing a sufficiently large
amount of data, the differences between these classifiers become negligible.
This is an important result from a practical perspective, since in the presence
of a sufficiently large amount of data one can replace the computationally more
expensive simultaneous classifier with the simpler marginal one.
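To make the two rules concrete, below is a minimal sketch of a marginal predictive classifier built on the one-step-ahead predictive rule of the Ewens sampling formula, the representation underlying partition exchangeability. The dispersion parameter psi, the per-class frequency tables, and the toy data are illustrative assumptions, not the paper's exact construction.

```python
from collections import Counter

def ewens_predictive(value, counts, n, psi):
    """Predictive probability of observing `value` next, given `n` previous
    observations with frequency table `counts`, under the Ewens sampling
    formula with dispersion parameter `psi`: a value seen n_j times has
    probability n_j / (n + psi); an entirely new value has psi / (n + psi)."""
    n_j = counts.get(value, 0)
    return n_j / (n + psi) if n_j > 0 else psi / (n + psi)

def marginal_classifier(test_items, train_by_class, psi=1.0):
    """Label each test item independently with the class that maximizes
    its one-step-ahead predictive probability (the marginal rule)."""
    labels = []
    for x in test_items:
        best = max(
            train_by_class,
            key=lambda c: ewens_predictive(
                x, train_by_class[c], sum(train_by_class[c].values()), psi
            ),
        )
        labels.append(best)
    return labels

# Toy usage: two classes with categorical training observations.
train = {"A": Counter(["red", "red", "blue"]), "B": Counter(["green", "green"])}
print(marginal_classifier(["red", "green", "violet"], train))
```

The marginal rule scores each test item separately, whereas the simultaneous rule scores entire joint labelings of the test set at once, which is exponentially more expensive; the paper's result says the two agree asymptotically.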
Related papers
- Parametric Classification for Generalized Category Discovery: A Baseline Study [70.73212959385387]
Generalized Category Discovery (GCD) aims to discover novel categories in unlabelled datasets using knowledge learned from labelled samples.
We investigate the failure of parametric classifiers, verify the effectiveness of previous design choices when high-quality supervision is available, and identify unreliable pseudo-labels as a key problem.
We propose a simple yet effective parametric classification method that benefits from entropy regularisation, achieves state-of-the-art performance on multiple GCD benchmarks and shows strong robustness to unknown class numbers.
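The summary credits entropy regularisation; a common form of it, assumed here rather than taken from the paper, subtracts the entropy of the mean prediction over unlabelled data from the supervised loss, which discourages all samples from collapsing onto a few clusters. A minimal numpy sketch:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def entropy_regularised_loss(logits_lab, y_lab, logits_unlab, lam=1.0):
    """Cross-entropy on labelled samples minus lam times the entropy of the
    mean prediction on unlabelled samples; maximizing that entropy keeps
    cluster assignments from collapsing onto a few classes."""
    p_lab = softmax(logits_lab)
    ce = -np.mean(np.log(p_lab[np.arange(len(y_lab)), y_lab] + 1e-12))
    p_mean = softmax(logits_unlab).mean(axis=0)  # average predicted distribution
    mean_entropy = -np.sum(p_mean * np.log(p_mean + 1e-12))
    return ce - lam * mean_entropy
```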
arXiv Detail & Related papers (2022-11-21T18:47:11Z) - Imbalanced Classification via a Tabular Translation GAN [4.864819846886142]
We present a model based on Generative Adversarial Networks which uses additional regularization losses to map majority samples to corresponding synthetic minority samples.
We show that the proposed method improves average precision when compared to alternative re-weighting and oversampling techniques.
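A compact PyTorch sketch of how such a translation GAN could look; the architecture, the L2 closeness term standing in for the paper's "additional regularization losses", and all hyperparameters are assumptions for illustration.

```python
import torch
import torch.nn as nn

d = 16  # hypothetical number of tabular features
G = nn.Sequential(nn.Linear(d, 64), nn.ReLU(), nn.Linear(64, d))  # majority -> synthetic minority
D = nn.Sequential(nn.Linear(d, 64), nn.ReLU(), nn.Linear(64, 1))  # real minority vs synthetic
bce = nn.BCEWithLogitsLoss()

def generator_loss(x_major, lam=0.1):
    # Adversarial term pushes G(x_major) toward the minority distribution;
    # the L2 term keeps each synthetic sample close to the majority sample
    # it was translated from.
    x_syn = G(x_major)
    adv = bce(D(x_syn), torch.ones(len(x_major), 1))
    reg = lam * ((x_syn - x_major) ** 2).mean()
    return adv + reg

def discriminator_loss(x_minor, x_major):
    # Real minority samples are labeled 1, translated samples 0.
    x_syn = G(x_major).detach()
    return bce(D(x_minor), torch.ones(len(x_minor), 1)) + \
           bce(D(x_syn), torch.zeros(len(x_major), 1))
```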
arXiv Detail & Related papers (2022-04-19T06:02:53Z) - On the rate of convergence of a classifier based on a Transformer
encoder [55.41148606254641]
The rate of convergence of the classifier's misclassification probability towards the optimal misclassification probability is analyzed.
It is shown that this classifier is able to circumvent the curse of dimensionality provided the a posteriori probability satisfies a suitable hierarchical composition model.
arXiv Detail & Related papers (2021-11-29T14:58:29Z) - On Clustering Categories of Categorical Predictors in Generalized Linear
Models [0.0]
We propose a method to reduce the complexity of Generalized Linear Models in the presence of categorical predictors.
The traditional one-hot encoding, where each category is represented by a dummy variable, can be wasteful, difficult to interpret, and prone to overfitting.
This paper addresses these challenges by finding a reduced representation of the categorical predictors by clustering their categories.
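One plausible instantiation of the idea, not necessarily the paper's algorithm: fit a GLM on the usual one-hot dummies, then cluster categories whose fitted coefficients are close, so that each cluster can be merged into a single dummy on refitting.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import OneHotEncoder

def cluster_categories(x_cat, y, n_clusters=3):
    """Group the levels of one categorical predictor by the similarity of
    their fitted GLM coefficients. x_cat is a 1-D array of category labels;
    n_clusters must not exceed the number of distinct categories."""
    enc = OneHotEncoder(sparse_output=False)
    X = enc.fit_transform(np.asarray(x_cat).reshape(-1, 1))
    glm = LogisticRegression().fit(X, y)          # one coefficient per category
    coefs = glm.coef_.reshape(-1, 1)
    groups = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(coefs)
    return dict(zip(enc.categories_[0], groups))  # category -> cluster id
```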
arXiv Detail & Related papers (2021-10-19T15:36:35Z) - Riemannian classification of EEG signals with missing values [67.90148548467762]
This paper proposes two strategies to handle missing data for the classification of electroencephalograms.
The first approach estimates the covariance from imputed data with the $k$-nearest neighbors algorithm; the second relies on the observed data by leveraging the observed-data likelihood within an expectation-maximization algorithm.
As the results show, the proposed strategies perform better than classification based on observed data alone and maintain high accuracy even as the missing-data ratio increases.
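The first strategy is straightforward to sketch with standard tools (the Riemannian classifier itself is out of scope here; NaN-encoded missing entries are an assumption):

```python
import numpy as np
from sklearn.impute import KNNImputer

def covariance_from_imputed(X, k=5):
    """Impute missing entries with k-nearest neighbors, then estimate the
    spatial covariance matrix that Riemannian EEG classifiers operate on.
    X has shape (n_time_samples, n_channels), with NaNs marking missing values."""
    X_imp = KNNImputer(n_neighbors=k).fit_transform(X)
    return np.cov(X_imp, rowvar=False)
```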
arXiv Detail & Related papers (2021-10-19T14:24:50Z) - When in Doubt: Improving Classification Performance with Alternating
Normalization [57.39356691967766]
We introduce Classification with Alternating Normalization (CAN), a non-parametric post-processing step for classification.
CAN improves classification accuracy for challenging examples by re-adjusting their predicted class probability distribution.
We empirically demonstrate its effectiveness across a diverse set of classification tasks.
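One Sinkhorn-style reading of alternating normalization (our assumption about the mechanism, not a verbatim transcription of CAN): alternately rescale the columns of the predicted probability matrix toward the class prior, then renormalize each row to a valid distribution.

```python
import numpy as np

def alternating_normalization(P, prior, n_iters=10):
    """P: (n_samples, n_classes) predicted probabilities; prior: class
    prior summing to 1. Each iteration matches the column masses to the
    prior, then renormalizes rows, re-adjusting low-confidence rows."""
    P = P.astype(float).copy()
    for _ in range(n_iters):
        P *= (prior * len(P)) / P.sum(axis=0)  # column step: match class priors
        P /= P.sum(axis=1, keepdims=True)      # row step: rows sum to one
    return P
```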
arXiv Detail & Related papers (2021-09-28T02:55:42Z) - Statistical Theory for Imbalanced Binary Classification [8.93993657323783]
We show that optimal classification performance depends on certain properties of class imbalance that have not previously been formalized.
Specifically, we propose a novel sub-type of class imbalance, which we call Uniform Class Imbalance.
These results provide some of the first meaningful finite-sample statistical theory for imbalanced binary classification.
arXiv Detail & Related papers (2021-07-05T03:55:43Z) - Theoretical Insights Into Multiclass Classification: A High-dimensional
Asymptotic View [82.80085730891126]
We provide the first precise high-dimensional asymptotic analysis of linear multiclass classification.
Our analysis reveals that the classification accuracy is highly distribution-dependent.
The insights gained may pave the way for a precise understanding of other classification algorithms.
arXiv Detail & Related papers (2020-11-16T05:17:29Z) - Predictive Value Generalization Bounds [27.434419027831044]
We study a bi-criterion framework for assessing scoring functions in the context of binary classification.
We study properties of scoring functions with respect to predictive values by deriving new distribution-free large deviation and uniform convergence bounds.
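For reference, the empirical predictive values the bounds concern can be computed directly; the threshold-based binarization shown here is the standard setup rather than anything specific to the paper.

```python
import numpy as np

def predictive_values(scores, y, threshold):
    """Empirical predictive values of a thresholded scoring function:
    PPV = P(Y=1 | score >= t) and NPV = P(Y=0 | score < t)."""
    pred = scores >= threshold
    ppv = y[pred].mean() if pred.any() else np.nan
    npv = (1 - y[~pred]).mean() if (~pred).any() else np.nan
    return ppv, npv
```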
arXiv Detail & Related papers (2020-07-09T21:23:28Z) - M2m: Imbalanced Classification via Major-to-minor Translation [79.09018382489506]
In most real-world scenarios, labeled training datasets are highly class-imbalanced, and deep neural networks trained on them generalize poorly under a balanced testing criterion.
In this paper, we explore a novel yet simple way to alleviate this issue by augmenting less-frequent classes via translating samples from more-frequent classes.
Our experimental results on a variety of class-imbalanced datasets show that the proposed method improves the generalization on minority classes significantly compared to other existing re-sampling or re-weighting methods.
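A loose sketch of one way to realize major-to-minor translation, via gradient ascent on a pretrained classifier's confidence in the minority class; the paper's exact procedure (sample selection, rejection, and so on) may differ.

```python
import torch
import torch.nn.functional as F

def major_to_minor(x_major, clf, minority_class, steps=10, lr=0.1):
    """Perturb majority-class samples until a pretrained classifier `clf`
    assigns them to `minority_class`, yielding synthetic minority samples."""
    x = x_major.clone().requires_grad_(True)
    opt = torch.optim.SGD([x], lr=lr)
    target = torch.full((len(x),), minority_class, dtype=torch.long)
    for _ in range(steps):
        opt.zero_grad()
        F.cross_entropy(clf(x), target).backward()  # descend toward the minority label
        opt.step()
    return x.detach()
```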
arXiv Detail & Related papers (2020-04-01T13:21:17Z) - To Split or Not to Split: The Impact of Disparate Treatment in
Classification [8.325775867295814]
Disparate treatment occurs when a machine learning model yields different decisions for individuals based on a sensitive attribute.
We introduce the benefit-of-splitting to quantify the performance improvement obtained by splitting classifiers across the sensitive attribute.
We prove an equivalent expression for the benefit-of-splitting which can be efficiently computed by solving small-scale convex programs.
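An empirical analogue of the benefit-of-splitting is easy to estimate by comparing a pooled classifier against group-specific ones; logistic regression and log-loss here are illustrative stand-ins for the paper's convex-program formulation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def benefit_of_splitting(X, y, group):
    """Drop in average log-loss from training one classifier per sensitive
    group instead of a single pooled classifier (each group must contain
    both classes for this illustrative estimate to run)."""
    def avg_logloss(model, X, y):
        p = model.predict_proba(X)[:, 1]
        return -np.mean(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))

    loss_pooled = avg_logloss(LogisticRegression().fit(X, y), X, y)
    loss_split = 0.0
    for g in np.unique(group):
        m = group == g
        loss_split += m.mean() * avg_logloss(LogisticRegression().fit(X[m], y[m]), X[m], y[m])
    return loss_pooled - loss_split
```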
arXiv Detail & Related papers (2020-02-12T04:05:31Z)