Generalization Error Bounds for Multiclass Sparse Linear Classifiers
- URL: http://arxiv.org/abs/2204.06264v1
- Date: Wed, 13 Apr 2022 09:25:03 GMT
- Title: Generalization Error Bounds for Multiclass Sparse Linear Classifiers
- Authors: Tomer Levy and Felix Abramovich
- Abstract summary: We consider high-dimensional multiclass classification by sparse multinomial logistic regression.
We propose a computationally feasible feature selection procedure based on penalized maximum likelihood.
In particular, we consider global sparsity, double row-wise sparsity, and low-rank sparsity.
- Score: 7.360807642941714
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We consider high-dimensional multiclass classification by sparse multinomial
logistic regression. Unlike binary classification, in the multiclass setup one
can think about an entire spectrum of possible notions of sparsity associated
with different structural assumptions on the regression coefficients matrix. We
propose a computationally feasible feature selection procedure based on
penalized maximum likelihood with convex penalties capturing a specific type of
sparsity at hand. In particular, we consider global sparsity, double row-wise
sparsity, and low-rank sparsity, and show that with the properly chosen tuning
parameters the derived plug-in classifiers attain the minimax generalization
error bounds (in terms of misclassification excess risk) within the
corresponding classes of multiclass sparse linear classifiers. The developed
approach is general and can be adapted to other types of sparsity as well.
Related papers
- Precise Asymptotic Generalization for Multiclass Classification with
Overparameterized Linear Models [4.093769373833101]
We resolve the conjecture posed in Subramanian et al.'22, where the number of data points, features, and classes all grow together.
Our new lower bounds are akin to an information-theoretic strong converse: they establish that the misclassification rate goes to 0 or 1ally.
The key to our tight analysis is a new variant of the Hanson-Wright inequality which is broadly useful for multiclass problems with sparse labels.
arXiv Detail & Related papers (2023-06-23T00:59:15Z) - Generalization for multiclass classification with overparameterized
linear models [3.3434274586532515]
We show that multiclass classification behaves like binary classification in that, as long as there are not too many classes, it is possible to generalize well.
Besides various technical challenges, it turns out that the key difference from the binary classification setting is that there are relatively fewer positive training examples of each class in the multiclass setting as the number of classes increases.
arXiv Detail & Related papers (2022-06-03T05:52:43Z) - Soft-margin classification of object manifolds [0.0]
A neural population responding to multiple appearances of a single object defines a manifold in the neural response space.
The ability to classify such manifold is of interest, as object recognition and other computational tasks require a response that is insensitive to variability within a manifold.
Soft-margin classifiers are a larger class of algorithms and provide an additional regularization parameter used in applications to optimize performance outside the training set.
arXiv Detail & Related papers (2022-03-14T12:23:36Z) - On the rate of convergence of a classifier based on a Transformer
encoder [55.41148606254641]
The rate of convergence of the misclassification probability of the classifier towards the optimal misclassification probability is analyzed.
It is shown that this classifier is able to circumvent the curse of dimensionality provided the aposteriori probability satisfies a suitable hierarchical composition model.
arXiv Detail & Related papers (2021-11-29T14:58:29Z) - On Clustering Categories of Categorical Predictors in Generalized Linear
Models [0.0]
We propose a method to reduce the complexity of Generalized Linear Models in the presence of categorical predictors.
The traditional one-hot encoding, where each category is represented by a dummy variable, can be wasteful, difficult to interpret, and prone to overfitting.
This paper addresses these challenges by finding a reduced representation of the categorical predictors by clustering their categories.
arXiv Detail & Related papers (2021-10-19T15:36:35Z) - When in Doubt: Improving Classification Performance with Alternating
Normalization [57.39356691967766]
We introduce Classification with Alternating Normalization (CAN), a non-parametric post-processing step for classification.
CAN improves classification accuracy for challenging examples by re-adjusting their predicted class probability distribution.
We empirically demonstrate its effectiveness across a diverse set of classification tasks.
arXiv Detail & Related papers (2021-09-28T02:55:42Z) - Learning Gaussian Mixtures with Generalised Linear Models: Precise
Asymptotics in High-dimensions [79.35722941720734]
Generalised linear models for multi-class classification problems are one of the fundamental building blocks of modern machine learning tasks.
We prove exacts characterising the estimator in high-dimensions via empirical risk minimisation.
We discuss how our theory can be applied beyond the scope of synthetic data.
arXiv Detail & Related papers (2021-06-07T16:53:56Z) - Theoretical Insights Into Multiclass Classification: A High-dimensional
Asymptotic View [82.80085730891126]
We provide the first modernally precise analysis of linear multiclass classification.
Our analysis reveals that the classification accuracy is highly distribution-dependent.
The insights gained may pave the way for a precise understanding of other classification algorithms.
arXiv Detail & Related papers (2020-11-16T05:17:29Z) - Self-Weighted Robust LDA for Multiclass Classification with Edge Classes [111.5515086563592]
A novel self-weighted robust LDA with l21-norm based between-class distance criterion, called SWRLDA, is proposed for multi-class classification.
The proposed SWRLDA is easy to implement, and converges fast in practice.
arXiv Detail & Related papers (2020-09-24T12:32:55Z) - Saliency-based Weighted Multi-label Linear Discriminant Analysis [101.12909759844946]
We propose a new variant of Linear Discriminant Analysis (LDA) to solve multi-label classification tasks.
The proposed method is based on a probabilistic model for defining the weights of individual samples.
The Saliency-based weighted Multi-label LDA approach is shown to lead to performance improvements in various multi-label classification problems.
arXiv Detail & Related papers (2020-04-08T19:40:53Z) - Multiclass classification by sparse multinomial logistic regression [10.312968200748116]
We consider high-dimensional multiclass classification by sparse multinomial logistic regression.
We propose a feature selection procedure based on penalized maximum likelihood with a complexity penalty.
We show that there exist two regimes corresponding to small and large number of classes.
arXiv Detail & Related papers (2020-03-04T08:44:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.