Asymptotic Analysis of an Ensemble of Randomly Projected Linear
Discriminants
- URL: http://arxiv.org/abs/2004.08217v1
- Date: Fri, 17 Apr 2020 12:47:04 GMT
- Authors: Lama B. Niyazi, Abla Kammoun, Hayssam Dahrouj, Mohamed-Slim Alouini,
and Tareq Y. Al-Naffouri
- Abstract summary: In [1], an ensemble of randomly projected linear discriminants is used to classify datasets.
We develop a consistent estimator of the misclassification probability as an alternative to the computationally costly cross-validation estimator.
We also demonstrate the use of our estimator for tuning the projection dimension on both real and synthetic data.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Datasets from the fields of bioinformatics, chemometrics, and face
recognition are typically characterized by small samples of high-dimensional
data. Among the many variants of linear discriminant analysis that have been
proposed in order to rectify the issues associated with classification in such
a setting, the classifier in [1], composed of an ensemble of randomly projected
linear discriminants, seems especially promising; it is computationally
efficient and, with the optimal projection dimension parameter setting, is
competitive with the state-of-the-art. In this work, we seek to further
understand the behavior of this classifier through asymptotic analysis. Under
the assumption of a growth regime in which the dataset and projection
dimensions grow at constant rates to each other, we use random matrix theory to
derive asymptotic misclassification probabilities showing the effect of the
ensemble as a regularization of the data sample covariance matrix. The
asymptotic errors further help to identify situations in which the ensemble
offers a performance advantage. We also develop a consistent estimator of the
misclassification probability as an alternative to the computationally costly
cross-validation estimator, which is conventionally used for parameter tuning.
Finally, we demonstrate the use of our estimator for tuning the projection
dimension on both real and synthetic data.
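The classifier studied in the abstract can be sketched in a few lines: each ensemble member draws an independent random projection matrix, fits a two-class linear discriminant in the low-dimensional projected space (where the sample covariance is well conditioned), and the members' votes are combined. The sketch below is illustrative only, assuming a Gaussian projection and majority voting; the function and parameter names are our own and the details need not match the exact construction in [1].

```python
import numpy as np

def rp_lda_ensemble_predict(X_train, y_train, X_test, d=5, n_members=50, seed=0):
    """Illustrative ensemble of randomly projected linear discriminants.

    Each member projects p-dimensional data down to d dimensions with an
    i.i.d. Gaussian matrix, fits a two-class LDA discriminant in the
    projected space, and the members' votes are averaged.
    """
    rng = np.random.default_rng(seed)
    p = X_train.shape[1]
    X0, X1 = X_train[y_train == 0], X_train[y_train == 1]
    votes = np.zeros(len(X_test))
    for _ in range(n_members):
        R = rng.standard_normal((d, p)) / np.sqrt(d)   # random projection
        Z0, Z1 = X0 @ R.T, X1 @ R.T                    # project each class
        m0, m1 = Z0.mean(axis=0), Z1.mean(axis=0)
        # pooled sample covariance in the projected (low-dimensional) space,
        # invertible as long as d < n0 + n1 - 2
        S = (np.cov(Z0, rowvar=False) * (len(Z0) - 1)
             + np.cov(Z1, rowvar=False) * (len(Z1) - 1)) / (len(Z0) + len(Z1) - 2)
        w = np.linalg.solve(S, m1 - m0)                # LDA direction
        b = -0.5 * w @ (m0 + m1)                       # midpoint threshold
        votes += ((X_test @ R.T) @ w + b > 0).astype(float)
    return (votes / n_members > 0.5).astype(int)       # majority vote
```

The projection dimension `d` plays the role of the tuning parameter discussed above: each member's discriminant is computed in a space where the covariance estimate is invertible even when the ambient dimension exceeds the sample size, and averaging over members acts, per the asymptotic analysis, like a regularization of the sample covariance matrix.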
Related papers
- Statistical Inference in Classification of High-dimensional Gaussian Mixture (arXiv, 2024-10-25). We investigate the behavior of a general class of regularized convex classifiers in the high-dimensional limit. Our focus is on the generalization error and variable selection properties of the estimators.
- Regularized Projection Matrix Approximation with Applications to Community Detection (arXiv, 2024-05-26). This paper introduces a regularized projection matrix approximation framework designed to recover cluster information from the affinity matrix. We investigate three distinct penalty functions, each tailored to bounded, positive, and sparse scenarios. Numerical experiments on both synthetic and real-world datasets show that the approach significantly outperforms state-of-the-art methods in clustering performance.
- Uncertainty quantification in metric spaces (arXiv, 2024-05-08). This paper introduces a novel uncertainty quantification framework for regression models whose response takes values in a separable metric space. The proposed algorithms efficiently handle large datasets and are agnostic to the predictive base model used.
- Spectral Estimators for Structured Generalized Linear Models via Approximate Message Passing (arXiv, 2023-08-28). We consider the problem of parameter estimation in a high-dimensional generalized linear model. Despite their wide use, a rigorous performance characterization, as well as a principled way to preprocess the data, are available only for unstructured designs.
- Learning Graphical Factor Models with Riemannian Optimization (arXiv, 2022-10-21). This paper proposes a flexible algorithmic framework for graph learning under low-rank structural constraints. The problem is expressed as penalized maximum likelihood estimation of an elliptical distribution. We leverage geometries of positive definite matrices and positive semi-definite matrices of fixed rank that are well suited to elliptical models.
- Optimal regularizations for data generation with probabilistic graphical models (arXiv, 2021-12-02). Empirically, well-chosen regularization schemes dramatically improve the quality of the inferred models. We consider the particular case of L2 and L1 regularizations in the Maximum A Posteriori (MAP) inference of generative pairwise graphical models.
- Weight Vector Tuning and Asymptotic Analysis of Binary Linear Classifiers (arXiv, 2021-10-01). This paper proposes weight vector tuning of a generic binary linear classifier through a scalar parameterization of a decomposition of the discriminant. Weight vector tuning is also found to significantly improve the performance of Linear Discriminant Analysis (LDA) under high estimation noise.
- Joint Network Topology Inference via Structured Fusion Regularization (arXiv, 2021-03-05). Joint network topology inference is a canonical problem of learning multiple graph Laplacian matrices from heterogeneous graph signals. We propose a general graph estimator based on a novel structured fusion regularization and show that it enjoys both high computational efficiency and rigorous theoretical guarantees.
- Understanding Implicit Regularization in Over-Parameterized Single Index Model (arXiv, 2020-07-16). We design regularization-free algorithms for the high-dimensional single index model and provide theoretical guarantees for the induced implicit regularization phenomenon.
- Slice Sampling for General Completely Random Measures (arXiv, 2020-06-24). We present a novel Markov chain Monte Carlo algorithm for posterior inference that adaptively sets the truncation level using auxiliary slice variables. The efficacy of the proposed algorithm is evaluated on several popular nonparametric models.