Approximation and generalization properties of the random projection classification method
- URL: http://arxiv.org/abs/2108.06339v4
- Date: Wed, 11 Sep 2024 17:07:38 GMT
- Title: Approximation and generalization properties of the random projection classification method
- Authors: Mireille Boutin, Evzenie Coupkova
- Abstract summary: We study a family of low-complexity classifiers consisting of thresholding a random one-dimensional feature.
For certain classification problems (e.g., those with a large Rashomon ratio), there is a potentially large gain in generalization properties by selecting parameters at random.
- Score: 0.4604003661048266
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The generalization gap of a classifier is related to the complexity of the set of functions among which the classifier is chosen. We study a family of low-complexity classifiers consisting of thresholding a random one-dimensional feature. The feature is obtained by projecting the data on a random line after embedding it into a higher-dimensional space parametrized by monomials of order up to k. More specifically, the extended data is projected n times and the best classifier among those n, based on its performance on training data, is chosen. We show that this type of classifier is extremely flexible: given full knowledge of the class conditional densities, under mild conditions, the error of these classifiers converges to the optimal (Bayes) error as k and n go to infinity. We also bound the generalization gap of the random classifiers. In general, these bounds are better than those for any classifier with VC dimension greater than O(ln n). In particular, the bounds imply that, unless the number of projections n is extremely large, the generalization gap of the random projection approach is significantly smaller than that of a linear classifier in the extended space. Thus, for certain classification problems (e.g., those with a large Rashomon ratio), there is a potentially large gain in generalization properties by selecting parameters at random rather than selecting the best one from the class.
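The construction described in the abstract is simple enough to sketch. Below is a minimal, hedged illustration, not the authors' implementation: the function names, the exhaustive threshold search, the Gaussian choice of projection directions, and binary labels in {0, 1} are all assumptions made for the sketch.

```python
import numpy as np
from itertools import combinations_with_replacement

def monomial_features(X, k):
    """All monomials of the input coordinates up to order k (order 0 included)."""
    n_samples, d = X.shape
    feats = [np.ones(n_samples)]
    for order in range(1, k + 1):
        for idx in combinations_with_replacement(range(d), order):
            feats.append(np.prod(X[:, list(idx)], axis=1))
    return np.column_stack(feats)

def best_threshold(z, y):
    """Exhaustive search for the best 1-D threshold (either orientation)."""
    zs = np.sort(z)
    cuts = np.concatenate(([zs[0] - 1.0], (zs[:-1] + zs[1:]) / 2.0, [zs[-1] + 1.0]))
    best = (np.inf, 0.0, 1)
    for t in cuts:
        pred = (z > t).astype(int)
        for sign, p in ((1, pred), (-1, 1 - pred)):
            err = np.mean(p != y)
            if err < best[0]:
                best = (err, t, sign)
    return best

def fit_random_projection_classifier(X, y, k=2, n=100, seed=0):
    """Embed with monomials up to order k, try n random projections, keep the best."""
    rng = np.random.default_rng(seed)
    Phi = monomial_features(X, k)
    best = None
    for _ in range(n):
        w = rng.standard_normal(Phi.shape[1])  # one random 1-D feature
        err, t, sign = best_threshold(Phi @ w, y)
        if best is None or err < best[0]:
            best = (err, w, t, sign)
    return best  # training error, projection direction, threshold, orientation
```

A new point is then classified by embedding it with the same monomial map, projecting it onto the stored direction w, and thresholding at t with the stored orientation.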
Related papers
- How many classifiers do we need? [50.69951049206484]
We provide a detailed analysis of how the disagreement and the polarization among classifiers relate to the performance gain achieved by aggregating individual classifiers.
We prove results for the behavior of the disagreement in terms of the number of classifiers.
Our theories and claims are supported by empirical results on several image classification tasks with various types of neural networks.
arXiv Detail & Related papers (2024-11-01T02:59:56Z)
- Classification Using Global and Local Mahalanobis Distances [1.7811840395202345]
We propose a novel semiparametric classifier based on Mahalanobis distances of an observation from the competing classes.
Our tool is a generalized additive model with the logistic link function that uses these distances as features to estimate the posterior probabilities of different classes.
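As a rough illustration of this recipe (a sketch only: the paper fits a generalized additive model on the distance features, for which plain logistic regression is a crude stand-in, and all names below are assumed):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def mahalanobis_features(X, means, inv_covs):
    """Distance of each row of X to each class (given mean and inverse covariance)."""
    feats = []
    for mu, Sinv in zip(means, inv_covs):
        diff = X - mu
        feats.append(np.sqrt(np.einsum('ij,jk,ik->i', diff, Sinv, diff)))
    return np.column_stack(feats)

def fit(X, y):
    classes = np.unique(y)
    means = [X[y == c].mean(axis=0) for c in classes]
    inv_covs = [np.linalg.pinv(np.cov(X[y == c], rowvar=False)) for c in classes]
    # distances to each class become the features of a logistic model
    clf = LogisticRegression().fit(mahalanobis_features(X, means, inv_covs), y)
    return means, inv_covs, clf
```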
arXiv Detail & Related papers (2024-02-13T08:22:42Z)
- Precise Asymptotic Generalization for Multiclass Classification with Overparameterized Linear Models [4.093769373833101]
We resolve the conjecture posed in Subramanian et al.'22, where the number of data points, features, and classes all grow together.
Our new lower bounds are akin to an information-theoretic strong converse: they establish that the misclassification rate goes to 0 or 1 asymptotically.
The key to our tight analysis is a new variant of the Hanson-Wright inequality which is broadly useful for multiclass problems with sparse labels.
arXiv Detail & Related papers (2023-06-23T00:59:15Z)
- GMMSeg: Gaussian Mixture based Generative Semantic Segmentation Models [74.0430727476634]
We propose a new family of segmentation models that rely on a dense generative classifier for the joint distribution p(pixel feature, class).
With a variety of segmentation architectures and backbones, GMMSeg outperforms the discriminative counterparts on closed-set datasets.
GMMSeg even performs well on open-world datasets.
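A toy version of the generative recipe, hedged: one Gaussian mixture per class and classification by the larger joint density. The component count and everything else below is illustrative, not GMMSeg itself.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_gmm_per_class(X, y, n_components=3):
    models, priors = {}, {}
    for c in np.unique(y):
        Xc = X[y == c]
        models[c] = GaussianMixture(n_components=n_components).fit(Xc)
        priors[c] = len(Xc) / len(X)
    return models, priors

def predict(X, models, priors):
    # log p(x, c) = log p(x | c) + log p(c); pick the argmax over classes
    classes = list(models)
    scores = np.column_stack(
        [models[c].score_samples(X) + np.log(priors[c]) for c in classes])
    return np.array(classes)[scores.argmax(axis=1)]
```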
arXiv Detail & Related papers (2022-10-05T05:20:49Z)
- Soft-margin classification of object manifolds [0.0]
A neural population responding to multiple appearances of a single object defines a manifold in the neural response space.
The ability to classify such manifolds is of interest, as object recognition and other computational tasks require a response that is insensitive to variability within a manifold.
Soft-margin classifiers form a larger class of algorithms and provide an additional regularization parameter, used in applications to optimize performance outside the training set.
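The additional regularization parameter referred to above is the slack penalty, commonly written C; a minimal sketch of tuning it on held-out data (the candidate grid and split below are assumptions):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

def tune_soft_margin(X, y, Cs=(0.01, 0.1, 1.0, 10.0)):
    X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.25, random_state=0)
    # pick the slack penalty C with the best held-out accuracy
    scores = {C: LinearSVC(C=C).fit(X_tr, y_tr).score(X_val, y_val) for C in Cs}
    return max(scores, key=scores.get)
```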
arXiv Detail & Related papers (2022-03-14T12:23:36Z)
- Local versions of sum-of-norms clustering [77.34726150561087]
We show that our method can separate arbitrarily close balls in the ball model.
We prove a quantitative bound on the error incurred in the clustering of disjoint connected sets.
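For context, the sum-of-norms clustering objective in its standard convex formulation (stated here for orientation, not quoted from the paper) assigns each point its own centroid and fuses nearby centroids through a norm penalty:

```latex
\min_{u_1,\dots,u_N}\; \frac{1}{2}\sum_{i=1}^{N}\|x_i - u_i\|_2^2
  \;+\; \lambda \sum_{i<j}\|u_i - u_j\|_2
```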
arXiv Detail & Related papers (2021-09-20T14:45:29Z)
- On Supervised Classification of Feature Vectors with Independent and Non-Identically Distributed Elements [10.52087851034255]
We investigate the problem of classifying feature vectors with mutually independent but non-identically distributed elements.
We show that the error probability goes to zero as the length of the feature vectors grows, even when there is only one training feature vector per label available.
arXiv Detail & Related papers (2020-08-01T06:49:50Z)
- Random extrapolation for primal-dual coordinate descent [61.55967255151027]
We introduce a randomly extrapolated primal-dual coordinate descent method that adapts to sparsity of the data matrix and the favorable structures of the objective function.
We show almost sure convergence of the sequence and optimal sublinear convergence rates for the primal-dual gap and objective values, in the general convex-concave case.
arXiv Detail & Related papers (2020-07-13T17:39:35Z)
- High-Dimensional Quadratic Discriminant Analysis under Spiked Covariance Model [101.74172837046382]
We propose a novel quadratic classification technique whose parameters are chosen so that the Fisher discriminant ratio is maximized.
Numerical simulations show that the proposed classifier not only outperforms classical R-QDA on both synthetic and real data but also has lower computational complexity.
arXiv Detail & Related papers (2020-06-25T12:00:26Z)
- Asymptotic Analysis of an Ensemble of Randomly Projected Linear Discriminants [94.46276668068327]
In [1], an ensemble of randomly projected linear discriminants is used to classify datasets.
We develop a consistent estimator of the misclassification probability as an alternative to the computationally costly cross-validation estimator.
We also demonstrate the use of our estimator for tuning the projection dimension on both real and synthetic data.
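A minimal sketch of such an ensemble, with Gaussian projection matrices, majority voting, and integer class labels all assumed for illustration:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def fit_rp_lda_ensemble(X, y, proj_dim=5, m=25, seed=0):
    """Fit LDA on m random low-dimensional projections of the data."""
    rng = np.random.default_rng(seed)
    ensemble = []
    for _ in range(m):
        R = rng.standard_normal((X.shape[1], proj_dim)) / np.sqrt(proj_dim)
        ensemble.append((R, LinearDiscriminantAnalysis().fit(X @ R, y)))
    return ensemble

def predict(ensemble, X):
    votes = np.column_stack([lda.predict(X @ R) for R, lda in ensemble])
    # majority vote across the ensemble members
    return np.array([np.bincount(row.astype(int)).argmax() for row in votes])
```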
arXiv Detail & Related papers (2020-04-17T12:47:04Z)
- Intrinsic Dimension Estimation via Nearest Constrained Subspace Classifier [7.028302194243312]
A new subspace-based classifier is proposed for supervised classification or intrinsic dimension estimation.
The distribution of the data in each class is modeled by a union of a finite number of affine subspaces of the feature space.
The proposed method is a generalisation of classical NN (Nearest Neighbor) and NFL (Nearest Feature Line), and has a close relationship to NS (Nearest Subspace).
The proposed classifier with an accurately estimated dimension parameter generally outperforms its competitors in terms of classification accuracy.
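For intuition, a hedged nearest-subspace-style sketch with a single affine subspace per class (the paper models each class by a union of several affine subspaces, so the single-subspace case below is a simplification):

```python
import numpy as np

def fit_class_subspaces(X, y, dim=2):
    subspaces = {}
    for c in np.unique(y):
        Xc = X[y == c]
        mu = Xc.mean(axis=0)
        # top right-singular vectors span the affine subspace through mu
        _, _, Vt = np.linalg.svd(Xc - mu, full_matrices=False)
        subspaces[c] = (mu, Vt[:dim])
    return subspaces

def predict(X, subspaces):
    labels, dists = list(subspaces), []
    for c in labels:
        mu, V = subspaces[c]
        diff = X - mu
        residual = diff - diff @ V.T @ V  # remove the component inside the subspace
        dists.append(np.linalg.norm(residual, axis=1))
    return np.array(labels)[np.argmin(np.column_stack(dists), axis=1)]
```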
arXiv Detail & Related papers (2020-02-08T20:54:42Z)