Optimality and complexity of classification by random projection
- URL: http://arxiv.org/abs/2108.06339v3
- Date: Thu, 18 May 2023 15:51:02 GMT
- Title: Optimality and complexity of classification by random projection
- Authors: Mireille Boutin, Evzenie Coupkova
- Abstract summary: The generalization error of a classifier is related to the complexity of the set of functions among which the classifier is chosen.
We show that this type of classifier is extremely flexible, as it is likely to approximate, to arbitrary precision, any continuous function on a compact set.
In particular, given full knowledge of the class conditional densities, the error of these low-complexity classifiers would converge to the optimal (Bayes) error as k and n go to infinity.
- Score: 1.5229257192293197
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The generalization error of a classifier is related to the complexity of the
set of functions among which the classifier is chosen. We study a family of
low-complexity classifiers consisting of thresholding a random one-dimensional
feature. The feature is obtained by projecting the data on a random line after
embedding it into a higher-dimensional space parametrized by monomials of order
up to k. More specifically, the extended data is projected n times and the best
classifier among those n, based on its performance on training data, is chosen.
We show that this type of classifier is extremely flexible, as it is likely to
approximate, to an arbitrary precision, any continuous function on a compact
set as well as any boolean function on a compact set that splits the support
into measurable subsets. In particular, given full knowledge of the class
conditional densities, the error of these low-complexity classifiers would
converge to the optimal (Bayes) error as k and n go to infinity. On the other
hand, if only a training dataset is given, we show that the classifiers will
perfectly classify all the training points as k and n go to infinity. We also
bound the generalization error of our random classifiers. In general, our
bounds are better than those for any classifier with VC dimension greater than
O(ln n). In particular, our bounds imply that, unless the number of
projections n is extremely large, there is a significant advantageous gap
between the generalization error of the random projection approach and that of
a linear classifier in the extended space. Asymptotically, as the number of
samples approaches infinity, the gap persists for any such n. Thus, there is a
potentially large gain in generalization properties by selecting parameters at
random rather than by optimization.
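The procedure described in the abstract (monomial embedding of order up to k, n random one-dimensional projections, thresholding, and selection of the best projection on the training data) is simple enough to sketch directly. The snippet below is a minimal illustrative implementation, not the authors' code: the use of scikit-learn's PolynomialFeatures for the monomial embedding, Gaussian random directions, and the brute-force threshold search are my own assumptions.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures  # monomial features of degree <= k


def fit_random_projection_classifier(X, y, k=2, n_proj=100, seed=None):
    """Embed X with monomials up to degree k, draw n_proj random directions,
    and keep the 1-D threshold classifier with the lowest training error."""
    rng = np.random.default_rng(seed)
    embed = PolynomialFeatures(degree=k, include_bias=False)
    Z = embed.fit_transform(X)                 # extended data in the monomial space
    best = None                                # (error, direction, threshold, sign)
    for _ in range(n_proj):
        a = rng.standard_normal(Z.shape[1])    # a random line in the extended space
        s = Z @ a                              # one-dimensional projected feature
        cuts = np.sort(s)
        thresholds = (cuts[:-1] + cuts[1:]) / 2.0  # midpoints of sorted projections
        for t in thresholds:
            for sign in (1, -1):               # which side of the threshold is class 1
                pred = (sign * (s - t) > 0).astype(int)
                err = np.mean(pred != y)
                if best is None or err < best[0]:
                    best = (err, a, t, sign)
    return best, embed


def predict(model, embed, X):
    _, a, t, sign = model
    s = embed.transform(X) @ a
    return (sign * (s - t) > 0).astype(int)


# toy usage on two Gaussian blobs (labels 0 and 1)
X = np.vstack([np.random.randn(50, 2) - 1.0, np.random.randn(50, 2) + 1.0])
y = np.array([0] * 50 + [1] * 50)
model, embed = fit_random_projection_classifier(X, y, k=3, n_proj=200, seed=0)
print("training error:", np.mean(predict(model, embed, X) != y))
```

Informally, because the learner only selects among a small number of randomly drawn projections, a standard finite-class style bound of order $\sqrt{(\ln n)/m}$ (with $m$ training samples) governs its generalization gap, whereas a class of VC dimension $d$ typically pays roughly $\sqrt{d/m}$; this is one way to read the $O(\ln n)$ comparison in the abstract, though the paper's own bounds are stated more precisely.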
Related papers
- Classification Using Global and Local Mahalanobis Distances [1.7811840395202345]
We propose a novel semi-parametric classifier based on Mahalanobis distances of an observation from the competing classes.
Our tool is a generalized additive model with the logistic link function that uses these distances as features to estimate the posterior probabilities of the different classes.
arXiv Detail & Related papers (2024-02-13T08:22:42Z) - Precise Asymptotic Generalization for Multiclass Classification with
Overparameterized Linear Models [4.093769373833101]
We resolve the conjecture posed in Subramanian et al.'22, where the number of data points, features, and classes all grow together.
Our new lower bounds are akin to an information-theoretic strong converse: they establish that the misclassification rate goes to 0 or 1 asymptotically.
The key to our tight analysis is a new variant of the Hanson-Wright inequality which is broadly useful for multiclass problems with sparse labels.
arXiv Detail & Related papers (2023-06-23T00:59:15Z) - Soft-margin classification of object manifolds [0.0]
A neural population responding to multiple appearances of a single object defines a manifold in the neural response space.
The ability to classify such manifold is of interest, as object recognition and other computational tasks require a response that is insensitive to variability within a manifold.
Soft-margin classifiers are a larger class of algorithms and provide an additional regularization parameter used in applications to optimize performance outside the training set.
arXiv Detail & Related papers (2022-03-14T12:23:36Z) - Local versions of sum-of-norms clustering [77.34726150561087]
We show that our method can separate arbitrarily close balls in the ball model.
We prove a quantitative bound on the error incurred in the clustering of disjoint connected sets.
arXiv Detail & Related papers (2021-09-20T14:45:29Z) - Sparse PCA via $l_{2,p}$-Norm Regularization for Unsupervised Feature
Selection [138.97647716793333]
We propose a simple and efficient unsupervised feature selection method, by combining reconstruction error with $l_{2,p}$-norm regularization.
We present an efficient optimization algorithm to solve the proposed unsupervised model, and analyse the convergence and computational complexity of the algorithm theoretically.
arXiv Detail & Related papers (2020-12-29T04:08:38Z) - Efficient Methods for Structured Nonconvex-Nonconcave Min-Max
Optimization [98.0595480384208]
We propose a generalization of the extragradient method which converges to a stationary point.
The algorithm applies not only to general $p$-normed spaces, but also to general $p$-dimensional vector spaces.
arXiv Detail & Related papers (2020-10-31T21:35:42Z) - On Supervised Classification of Feature Vectors with Independent and
Non-Identically Distributed Elements [10.52087851034255]
We investigate the problem of classifying feature vectors with mutually independent but non-identically distributed elements.
We show that the error probability goes to zero as the length of the feature vectors grows, even when there is only one training feature vector per label available.
arXiv Detail & Related papers (2020-08-01T06:49:50Z) - Random extrapolation for primal-dual coordinate descent [61.55967255151027]
We introduce a randomly extrapolated primal-dual coordinate descent method that adapts to sparsity of the data matrix and the favorable structures of the objective function.
We show almost sure convergence of the sequence and optimal sublinear convergence rates for the primal-dual gap and objective values, in the general convex-concave case.
arXiv Detail & Related papers (2020-07-13T17:39:35Z) - Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z) - Asymptotic Analysis of an Ensemble of Randomly Projected Linear
Discriminants [94.46276668068327]
In [1], an ensemble of randomly projected linear discriminants is used to classify datasets.
We develop a consistent estimator of the misclassification probability as an alternative to the computationally-costly cross-validation estimator.
We also demonstrate the use of our estimator for tuning the projection dimension on both real and synthetic data.
arXiv Detail & Related papers (2020-04-17T12:47:04Z) - Intrinsic Dimension Estimation via Nearest Constrained Subspace
Classifier [7.028302194243312]
A new subspace based classifier is proposed for supervised classification or intrinsic dimension estimation.
The distribution of the data in each class is modeled by a union of a finite number of affine subspaces of the feature space.
The proposed method is a generalisation of classical NN (Nearest Neighbor) and NFL (Nearest Feature Line) and has a close relationship to NS (Nearest Subspace).
The proposed classifier with an accurately estimated dimension parameter generally outperforms its competitors in terms of classification accuracy.
arXiv Detail & Related papers (2020-02-08T20:54:42Z)