Linear Discriminant Analysis with the Randomized Kaczmarz Method
- URL: http://arxiv.org/abs/2211.05749v2
- Date: Tue, 07 Jan 2025 19:01:50 GMT
- Title: Linear Discriminant Analysis with the Randomized Kaczmarz Method
- Authors: Jocelyn T. Chi, Deanna Needell
- Abstract summary: We present an iterative randomized approach to binary-class Gaussian model linear discriminant analysis (LDA) for very large data.
Our experiments demonstrate that rkLDA can offer a viable alternative to full data LDA on a range of step-sizes and numbers of iterations.
- Score: 8.020732438595905
- Abstract: We present a randomized Kaczmarz method for linear discriminant analysis (rkLDA), an iterative randomized approach to binary-class Gaussian model linear discriminant analysis (LDA) for very large data. We harness a least squares formulation and mobilize the stochastic gradient descent framework to obtain a randomized classifier with performance that can achieve comparable accuracy to that of full data LDA. We present analysis for the expected change in the LDA discriminant function if one employs the randomized Kaczmarz solution in lieu of the full data least squares solution that accounts for both the Gaussian modeling assumptions on the data and algorithmic randomness. Our analysis shows how the expected change depends on quantities inherent in the data such as the scaled condition number and Frobenius norm of the input data, how well the linear model fits the data, and choices from the randomized algorithm. Our experiments demonstrate that rkLDA can offer a viable alternative to full data LDA on a range of step-sizes and numbers of iterations.
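The least squares formulation mentioned in the abstract can be illustrated with a small sketch. It is not the authors' rkLDA implementation: it assumes balanced classes coded as -1/+1, centered features, a generic randomized Kaczmarz update with a step-size parameter, and rows sampled with probability proportional to their squared norm. The function names (`randomized_kaczmarz`, `make_two_class_data`, `fit_and_score`) and the synthetic data are hypothetical and chosen only for illustration.

```python
import random
from itertools import accumulate


def randomized_kaczmarz(A, b, n_iters=2000, step=1.0, seed=0):
    """Approximately solve A x ~ b using one randomly sampled row per iteration."""
    rng = random.Random(seed)
    row_norms_sq = [sum(v * v for v in row) for row in A]
    # Assumption: rows are sampled with probability proportional to squared norm.
    cum_weights = list(accumulate(row_norms_sq))
    x = [0.0] * len(A[0])
    for _ in range(n_iters):
        i = rng.choices(range(len(A)), cum_weights=cum_weights, k=1)[0]
        residual = b[i] - sum(a * xj for a, xj in zip(A[i], x))
        # Move x toward the hyperplane defined by row i, scaled by the step size.
        scale = step * residual / row_norms_sq[i]
        x = [xj + scale * a for xj, a in zip(x, A[i])]
    return x


def make_two_class_data(n_per_class=200, seed=1):
    """Synthetic 2-D Gaussian classes with identity covariance (illustrative only)."""
    rng = random.Random(seed)
    points, labels = [], []
    for label, mean in ((-1.0, (-2.0, -2.0)), (1.0, (2.0, 2.0))):
        for _ in range(n_per_class):
            points.append([rng.gauss(mean[0], 1.0), rng.gauss(mean[1], 1.0)])
            labels.append(label)
    return points, labels


def fit_and_score(points, labels, **rk_kwargs):
    """Fit a linear discriminant direction by least squares and report training accuracy."""
    n, d = len(points), len(points[0])
    center = [sum(p[j] for p in points) / n for j in range(d)]
    A = [[p[j] - center[j] for j in range(d)] for p in points]
    w = randomized_kaczmarz(A, labels, **rk_kwargs)
    # Classify by the sign of the projection onto the fitted direction.
    predictions = [
        1.0 if sum(wj * aj for wj, aj in zip(w, row)) > 0 else -1.0 for row in A
    ]
    accuracy = sum(p == y for p, y in zip(predictions, labels)) / n
    return w, accuracy


if __name__ == "__main__":
    pts, ys = make_two_class_data()
    w, acc = fit_and_score(pts, ys, n_iters=3000, step=0.5)
    print("direction:", w, "training accuracy:", acc)
```

Because the labels cannot be fit exactly by a linear model, the iterates keep fluctuating around the least squares solution rather than settling on it. In this sketch a smaller `step` reduces that fluctuation at the cost of slower progress, which is one reason the step size and the number of iterations are the natural tuning choices here.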
Related papers
- Minimally Informed Linear Discriminant Analysis: training an LDA model
with unlabelled data [51.673443581397954]
We show that it is possible to compute the exact projection vector from LDA models based on unlabelled data.
We show that the MILDA projection vector can be computed in a closed form with a computational cost comparable to LDA.
arXiv Detail & Related papers (2023-10-17T09:50:31Z) - FEMDA: Une méthode de classification robuste et flexible [0.8594140167290096]
This paper studies robustness to scale changes in the data of a new discriminant analysis technique.
The derived decision rule is simple, fast, and robust to scale changes in the data compared to other state-of-the-art methods.
arXiv Detail & Related papers (2023-07-04T23:15:31Z) - Probabilistic Unrolling: Scalable, Inverse-Free Maximum Likelihood
Estimation for Latent Gaussian Models [69.22568644711113]
We introduce probabilistic unrolling, a method that combines Monte Carlo sampling with iterative linear solvers to circumvent matrix inversions.
Our theoretical analyses reveal that unrolling and backpropagation through the iterations of the solver can accelerate gradient estimation for maximum likelihood estimation.
In experiments on simulated and real data, we demonstrate that probabilistic unrolling learns latent Gaussian models up to an order of magnitude faster than gradient EM, with minimal losses in model performance.
arXiv Detail & Related papers (2023-06-05T21:08:34Z) - Learning to Bound Counterfactual Inference in Structural Causal Models
from Observational and Randomised Data [64.96984404868411]
We derive a likelihood characterisation for the overall data that leads us to extend a previous EM-based algorithm.
The new algorithm learns to approximate the (unidentifiability) region of model parameters from such mixed data sources.
It delivers interval approximations to counterfactual results, which collapse to points in the identifiable case.
arXiv Detail & Related papers (2022-12-06T12:42:11Z) - Spectrally-Corrected and Regularized Linear Discriminant Analysis for
Spiked Covariance Model [2.517838307493912]
This paper proposes an improved linear discriminant analysis called spectrally-corrected and regularized LDA (SRLDA).
It is proved that SRLDA has a linear classification global optimal solution under the spiked model assumption.
Experiments on different data sets show that the SRLDA algorithm performs better in classification and dimensionality reduction than currently used tools.
arXiv Detail & Related papers (2022-10-08T00:47:50Z) - Varying Coefficient Linear Discriminant Analysis for Dynamic Data [5.228711636020666]
This paper investigates the varying coefficient LDA model for dynamic data.
By deriving a new discriminant direction function parallel with Bayes' direction, we propose a least-square estimation procedure.
In the high-dimensional regime, the corresponding data-driven discriminant rule is more computationally efficient than the existing dynamic linear programming rule.
arXiv Detail & Related papers (2022-03-12T07:32:19Z) - Stochastic Saddle-Point Optimization for Wasserstein Barycenters [69.68068088508505]
We consider the population Wasserstein barycenter problem for random probability measures supported on a finite set of points and generated by an online stream of data.
We employ the structure of the problem and obtain a convex-concave saddle-point reformulation of this problem.
In the setting when the distribution of random probability measures is discrete, we propose an optimization algorithm and estimate its complexity.
arXiv Detail & Related papers (2020-06-11T19:40:38Z) - A Doubly Regularized Linear Discriminant Analysis Classifier with
Automatic Parameter Selection [24.027886914804775]
Linear discriminant analysis (LDA) based classifiers tend to falter in many practical settings where the training data size is smaller than, or comparable to, the number of features.
We propose a doubly regularized LDA classifier that we denote as R2LDA.
Results obtained from both synthetic and real data demonstrate the consistency and effectiveness of the proposed R2LDA approach.
arXiv Detail & Related papers (2020-04-28T07:09:22Z) - Asymptotic Analysis of an Ensemble of Randomly Projected Linear
Discriminants [94.46276668068327]
In [1], an ensemble of randomly projected linear discriminants is used to classify datasets.
We develop a consistent estimator of the misclassification probability as an alternative to the computationally-costly cross-validation estimator.
We also demonstrate the use of our estimator for tuning the projection dimension on both real and synthetic data.
arXiv Detail & Related papers (2020-04-17T12:47:04Z) - Saliency-based Weighted Multi-label Linear Discriminant Analysis [101.12909759844946]
We propose a new variant of Linear Discriminant Analysis (LDA) to solve multi-label classification tasks.
The proposed method is based on a probabilistic model for defining the weights of individual samples.
The Saliency-based weighted Multi-label LDA approach is shown to lead to performance improvements in various multi-label classification problems.
arXiv Detail & Related papers (2020-04-08T19:40:53Z) - Semi-analytic approximate stability selection for correlated data in
generalized linear models [3.42658286826597]
We propose a novel approximate inference algorithm that can conduct Stability Selection without the repeated fitting.
The algorithm is based on the replica method of statistical mechanics and vector approximate message passing of information theory.
Numerical experiments indicate that the algorithm exhibits fast convergence and high approximation accuracy for both synthetic and real-world data.
arXiv Detail & Related papers (2020-03-19T10:43:12Z)