On Supervised Classification of Feature Vectors with Independent and
Non-Identically Distributed Elements
- URL: http://arxiv.org/abs/2008.00190v2
- Date: Tue, 30 Mar 2021 00:58:05 GMT
- Title: On Supervised Classification of Feature Vectors with Independent and
Non-Identically Distributed Elements
- Authors: Farzad Shahrivari and Nikola Zlatanov
- Abstract summary: We investigate the problem of classifying feature vectors with mutually independent but non-identically distributed elements.
We show that the error probability goes to zero as the length of the feature vectors grows, even when only one training feature vector per label is available.
- Score: 10.52087851034255
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we investigate the problem of classifying feature
vectors with mutually independent but non-identically distributed elements.
First, we show the importance of this problem. Next, we propose a classifier
and derive an analytical upper bound on its error probability. We show that
the error probability goes to zero as the length of the feature vectors grows,
even when only one training feature vector per label is available. Thereby, we
show that for this important problem at least one asymptotically optimal
classifier exists. Finally, we provide numerical examples showing that the
proposed classifier outperforms conventional classification algorithms when
the amount of training data is small and the feature vectors are sufficiently
long.
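The paper's classifier and its error bound are not reproduced in this summary. As a minimal sketch of the setting, the hypothetical experiment below assumes each element of a feature vector is Gaussian with a label- and element-dependent mean (so elements are independent but non-identically distributed), uses a single training vector per label, and classifies by the nearest training vector; the empirical error probability shrinks as the vector length grows.

```python
import numpy as np

rng = np.random.default_rng(0)

def experiment(n, num_labels=4, trials=2000, noise=1.0):
    # One training vector per label; classify by the nearest training vector.
    # This is NOT the paper's classifier -- just a simple rule that also
    # benefits from a growing feature length n when elements are independent.
    errors = 0
    for _ in range(trials):
        mus = rng.normal(size=(num_labels, n))                  # per-element means
        train = mus + noise * rng.normal(size=(num_labels, n))  # one sample per label
        y = rng.integers(num_labels)
        x = mus[y] + noise * rng.normal(size=n)                 # test vector
        y_hat = np.argmin(((train - x) ** 2).sum(axis=1))
        errors += int(y_hat != y)
    return errors / trials

for n in (4, 16, 64, 256):
    print(n, experiment(n))  # empirical error probability decreases with n
```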
Related papers
- An Upper Bound for the Distribution Overlap Index and Its Applications [18.481370450591317]
This paper proposes an easy-to-compute upper bound for the overlap index between two probability distributions.
The proposed bound shows its value in one-class classification and domain shift analysis.
Our work shows significant promise toward broadening the applications of overlap-based metrics.
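The bound itself is not given in this summary. For reference, the quantity being bounded, the overlap index of two distributions, is the integral of their pointwise minimum; on a shared finite support it can be computed exactly, as in this short sketch (the function name is illustrative):

```python
import numpy as np

def overlap_index(p, q):
    # Overlap of two discrete distributions on the same support:
    # sum_i min(p_i, q_i), equivalently 1 - total variation distance.
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return np.minimum(p, q).sum()

print(overlap_index([0.5, 0.3, 0.2], [0.2, 0.3, 0.5]))  # 0.7
```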
arXiv Detail & Related papers (2022-12-16T20:02:03Z)
- Parametric Classification for Generalized Category Discovery: A Baseline Study [70.73212959385387]
Generalized Category Discovery (GCD) aims to discover novel categories in unlabelled datasets using knowledge learned from labelled samples.
We investigate the failure of parametric classifiers, verify the effectiveness of previous design choices when high-quality supervision is available, and identify unreliable pseudo-labels as a key problem.
We propose a simple yet effective parametric classification method that benefits from entropy regularisation, achieves state-of-the-art performance on multiple GCD benchmarks and shows strong robustness to unknown class numbers.
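The summary does not spell out the loss; one standard form of entropy regularisation in this setting maximises the entropy of the batch-averaged prediction so that unlabelled samples are not collapsed onto a few classes. A sketch under that assumption (not necessarily the paper's exact term):

```python
import torch
import torch.nn.functional as F

def mean_entropy_regulariser(logits):
    # Encourage a high-entropy marginal prediction over the batch; an assumed
    # form of the entropy regularisation, not taken from the paper.
    probs = F.softmax(logits, dim=1)
    mean_probs = probs.mean(dim=0)
    entropy = -(mean_probs * mean_probs.clamp_min(1e-8).log()).sum()
    return -entropy  # minimising this maximises the marginal entropy

logits = torch.randn(32, 10)             # batch of 32, 10 candidate classes
labels = torch.randint(10, (32,))
loss = F.cross_entropy(logits, labels) + 0.5 * mean_entropy_regulariser(logits)
```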
arXiv Detail & Related papers (2022-11-21T18:47:11Z)
- Large-Margin Representation Learning for Texture Classification [67.94823375350433]
This paper presents a novel approach combining convolutional layers (CLs) and large-margin metric learning for training supervised models on small datasets for texture classification.
The experimental results on texture and histopathologic image datasets have shown that the proposed approach achieves competitive accuracy with lower computational cost and faster convergence when compared to equivalent CNNs.
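As a generic illustration of the large-margin metric-learning component (the paper's exact objective is not given in this summary), a triplet-style loss pulls same-class embeddings together and pushes different-class embeddings at least a margin apart:

```python
import torch
import torch.nn.functional as F

def large_margin_loss(anchor, positive, negative, margin=1.0):
    # Generic triplet-style margin objective over embedding distances;
    # an assumed stand-in, not the paper's exact formulation.
    d_pos = (anchor - positive).pow(2).sum(dim=1)
    d_neg = (anchor - negative).pow(2).sum(dim=1)
    return F.relu(d_pos - d_neg + margin).mean()

emb = torch.randn(3, 16)  # hypothetical embeddings from the conv layers
loss = large_margin_loss(emb[0:1], emb[1:2], emb[2:3])
```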
arXiv Detail & Related papers (2022-06-17T04:07:45Z)
- Exploring Category-correlated Feature for Few-shot Image Classification [27.13708881431794]
We present a simple yet effective feature rectification method by exploring the category correlation between novel and base classes as the prior knowledge.
The proposed approach consistently obtains considerable performance gains on three widely used benchmarks.
arXiv Detail & Related papers (2021-12-14T08:25:24Z)
- Learning Debiased and Disentangled Representations for Semantic Segmentation [52.35766945827972]
We propose a model-agnostic training scheme for semantic segmentation.
By randomly eliminating certain class information in each training iteration, we effectively reduce feature dependencies among classes.
Models trained with our approach demonstrate strong results on multiple semantic segmentation benchmarks.
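The summary describes the mechanism only at a high level; one way to realise "randomly eliminating certain class information in each training iteration" is to mask a random subset of classes out of the segmentation loss, as in this assumed sketch:

```python
import torch
import torch.nn.functional as F

def class_dropped_loss(logits, target, drop_prob=0.2, ignore_index=255):
    # Randomly eliminate some classes this iteration by excluding their
    # pixels from the loss; a sketch of the idea, not the paper's procedure.
    num_classes = logits.shape[1]
    dropped = torch.rand(num_classes) < drop_prob
    target = target.clone()
    for c in torch.nonzero(dropped).flatten().tolist():
        target[target == c] = ignore_index
    return F.cross_entropy(logits, target, ignore_index=ignore_index)

logits = torch.randn(2, 19, 64, 64)              # B x C x H x W
target = torch.randint(0, 19, (2, 64, 64))       # per-pixel class labels
loss = class_dropped_loss(logits, target)
```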
arXiv Detail & Related papers (2021-10-31T16:15:09Z)
- Approximation and generalization properties of the random projection classification method [0.4604003661048266]
We study a family of low-complexity classifiers consisting of thresholding a random one-dimensional feature.
For certain classification problems (e.g., those with a large Rashomon ratio), there is a potentially large gain in generalization performance from selecting parameters at random.
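The classifier family is simple enough to sketch directly: project onto a random direction, threshold, and keep the best of several random draws. The helper below is illustrative, not the paper's algorithm:

```python
import numpy as np

rng = np.random.default_rng(1)

def fit_random_threshold(X, y, num_candidates=200):
    # Threshold a random one-dimensional projection; among several randomly
    # drawn (direction, threshold) pairs, keep the most accurate on training.
    best = None
    for _ in range(num_candidates):
        w = rng.normal(size=X.shape[1])
        t = rng.normal()
        pred = (X @ w > t).astype(int)
        acc = max((pred == y).mean(), ((1 - pred) == y).mean())  # allow sign flip
        if best is None or acc > best[0]:
            best = (acc, w, t)
    return best

X = rng.normal(size=(100, 5)); y = (X[:, 0] > 0).astype(int)
acc, w, t = fit_random_threshold(X, y)
```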
arXiv Detail & Related papers (2021-08-11T23:14:46Z)
- Minimax Estimation of Linear Functions of Eigenvectors in the Face of Small Eigen-Gaps [95.62172085878132]
Eigenvector perturbation analysis plays a vital role in various statistical data science applications.
We develop a suite of statistical theory that characterizes the perturbation of arbitrary linear functions of an unknown eigenvector.
In order to mitigate a non-negligible bias issue inherent to the natural "plug-in" estimator, we develop de-biased estimators.
arXiv Detail & Related papers (2021-04-07T17:55:10Z)
- Counterfactual Explanations for Oblique Decision Trees: Exact, Efficient Algorithms [0.0]
We consider counterfactual explanations: the problem of minimally adjusting the features of a source input instance so that it is classified into a desired target class by a given classifier.
This has become a topic of recent interest as a way to query a trained model and suggest possible actions to overturn its decision.
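For a single linear (oblique) split, the minimal L2 counterfactual has a closed form: project the instance onto the decision boundary and step just across. The paper's exact algorithms handle entire oblique trees; this sketch covers only the one-hyperplane case:

```python
import numpy as np

def linear_counterfactual(x, w, b, eps=1e-6):
    # Smallest L2 change flipping the decision sign(w.x + b): move x along
    # w onto the hyperplane, then slightly past it.
    w = np.asarray(w, dtype=float)
    margin = w @ x + b
    return x - (1 + eps) * (margin / (w @ w)) * w

x = np.array([2.0, 1.0])
x_cf = linear_counterfactual(x, w=[1.0, -1.0], b=0.0)
print(x_cf)  # decision flips with the minimal possible L2 adjustment
```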
arXiv Detail & Related papers (2021-03-01T16:04:33Z)
- Theoretical Insights Into Multiclass Classification: A High-dimensional Asymptotic View [82.80085730891126]
We provide the first precise high-dimensional asymptotic analysis of linear multiclass classification.
Our analysis reveals that the classification accuracy is highly distribution-dependent.
The insights gained may pave the way for a precise understanding of other classification algorithms.
arXiv Detail & Related papers (2020-11-16T05:17:29Z)
- Semi-supervised Disentanglement with Independent Vector Variational Autoencoders [7.700240949386079]
We separate generative factors of data into two latent vectors in a variational autoencoder.
To learn the discrete class features, we introduce supervision using a small amount of labeled data.
We show that (i) this vector independence term arises when decomposing the evidence lower bound with multiple latent vectors, and (ii) encouraging such independence, together with reducing the total correlation within the vectors, enhances disentanglement performance.
arXiv Detail & Related papers (2020-03-14T09:20:22Z)
- Invariant Feature Coding using Tensor Product Representation [75.62232699377877]
We prove that the group-invariant feature vector contains sufficient discriminative information when learning a linear classifier.
A novel feature model that explicitly considers group actions is proposed for principal component analysis and k-means clustering.
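As background for the group-invariance claim (a generic sketch, not the paper's tensor-product construction): averaging a feature map over all actions of a finite group yields a feature that is invariant to any group element, as the cyclic-shift example below verifies.

```python
import numpy as np

def group_averaged_feature(x, group_actions, phi):
    # Averaging phi over a finite group of actions gives an invariant
    # feature; plain averaging discards more information than the paper's
    # tensor-product representation, but it shows the invariance mechanism.
    return np.mean([phi(g(x)) for g in group_actions], axis=0)

shifts = [lambda x, k=k: np.roll(x, k) for k in range(4)]  # cyclic group C4
phi = lambda x: x ** 2                                     # hypothetical feature map
x = np.array([1.0, 2.0, 3.0, 4.0])
f1 = group_averaged_feature(x, shifts, phi)
f2 = group_averaged_feature(np.roll(x, 1), shifts, phi)
print(np.allclose(f1, f2))  # True: invariant under the group action
```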
arXiv Detail & Related papers (2019-06-05T07:15:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.