A Convex formulation for linear discriminant analysis
- URL: http://arxiv.org/abs/2503.13623v1
- Date: Mon, 17 Mar 2025 18:17:49 GMT
- Title: A Convex formulation for linear discriminant analysis
- Authors: Sai Vijay Kumar Surineela, Prathyusha Kanakamalla, Harigovind Harikumar, Tomojit Ghosh
- Abstract summary: We present a supervised dimensionality reduction technique called Convex Linear Discriminant Analysis (ConvexLDA). We show that ConvexLDA outperforms several popular linear discriminant analysis (LDA)-based methods on a range of high-dimensional biological data, image data sets, etc.
- Score: 1.3124513975412255
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a supervised dimensionality reduction technique called Convex Linear Discriminant Analysis (ConvexLDA). The proposed model optimizes a multi-objective cost function that balances two complementary terms. The first term pulls the samples of a class towards its centroid by minimizing each sample's distance from its class centroid in the low-dimensional space. The second term pushes the classes apart by maximizing their hyperellipsoid scattering volume via the logarithm of the determinant (\textit{log det}) of the outer product matrix formed by the low-dimensional class centroids. Using the negative of the \textit{log det}, we pose the final cost as a minimization problem that balances the two terms with a hyper-parameter $\lambda$. We demonstrate that the cost function is convex. Unlike Fisher LDA, the proposed method does not require computing a matrix inverse, thereby avoiding the ill-conditioning that arises when the data dimension is very high, e.g., for RNA-seq data. ConvexLDA does not require pairwise distance calculations, making it faster and more easily scalable. Moreover, the convex nature of the cost function ensures global optimality, enhancing the reliability of the learned embedding. Our experimental evaluation demonstrates that ConvexLDA outperforms several popular linear discriminant analysis (LDA)-based methods on a range of high-dimensional biological data, image data sets, etc.
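As a concrete illustration, the following is a minimal NumPy sketch of one plausible reading of this cost: the pull-to-centroid term plus $\lambda$ times the negated log det of the centroid outer-product matrix. The function name, argument shapes, the placement of $\lambda$, and the exact form of the outer-product matrix are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def convex_lda_cost(A, X, y, lam=1.0):
    """One plausible reading of the two-term ConvexLDA cost (illustrative sketch).

    A   : (d, p) projection matrix mapping d-dimensional samples to p dimensions
    X   : (n, d) data matrix
    y   : (n,) integer class labels
    lam : trade-off hyper-parameter (the paper's lambda)
    """
    Z = X @ A                                   # (n, p) low-dimensional embedding
    classes = np.unique(y)
    centroids = np.stack([Z[y == c].mean(axis=0) for c in classes])  # (k, p)

    # Term 1: pull each sample towards its own class centroid in the embedding.
    pull = sum(np.sum((Z[y == c] - centroids[i]) ** 2)
               for i, c in enumerate(classes))

    # Term 2: spread the classes apart via the log-determinant of the centroid
    # outer-product matrix (assumed here to be sum_c mu_c mu_c^T, a p x p matrix);
    # slogdet is used for numerical stability and assumes the matrix is positive definite.
    S = centroids.T @ centroids                 # (p, p)
    _, logdet = np.linalg.slogdet(S)

    # Negating the log det turns the class-separation objective into part of a minimization.
    return pull - lam * logdet
```

Minimizing such a cost over the projection (with any generic solver) drives same-class embeddings towards their centroid while spreading the class centroids apart, with the trade-off controlled by $\lambda$ as described above.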
Related papers
- ADMM for Structured Fractional Minimization [7.9047096855236125]
We consider a class of structured fractional problems, where the numerator includes a differentiable function.
We introduce FADMM, the first Alternating Direction Method of Multipliers for this class of problems.
arXiv Detail & Related papers (2024-11-12T02:50:12Z) - Highly Adaptive Ridge [84.38107748875144]
We propose a regression method that achieves a $n^{-2/3}$ dimension-free $L^2$ convergence rate in the class of right-continuous functions with square-integrable sectional derivatives.
HAR is exactly kernel ridge regression with a specific data-adaptive kernel based on a saturated zero-order tensor-product spline basis expansion.
We demonstrate empirical performance better than state-of-the-art algorithms for small datasets in particular.
arXiv Detail & Related papers (2024-10-03T17:06:06Z) - Obtaining Lower Query Complexities through Lightweight Zeroth-Order Proximal Gradient Algorithms [65.42376001308064]
We propose two variance-reduced zeroth-order (ZO) gradient estimators.
We improve the state-of-the-art function query complexities from $\mathcal{O}\left(\min\left(\frac{d n^{1/2}}{\epsilon^2}, \frac{d}{\epsilon^3}\right)\right)$ to $\tilde{\mathcal{O}}\left(\frac{d}{\epsilon^2}\right)$.
arXiv Detail & Related papers (2024-10-03T15:04:01Z) - A Sample Efficient Alternating Minimization-based Algorithm For Robust Phase Retrieval [56.67706781191521]
In this work, we study a robust phase retrieval problem where the task is to recover an unknown signal from corrupted magnitude measurements.
Our proposed oracle avoids the need for computationally expensive spectral methods, using a simple gradient step and outlier rejection.
arXiv Detail & Related papers (2024-09-07T06:37:23Z) - Stochastic First-Order Methods with Non-smooth and Non-Euclidean Proximal Terms for Nonconvex High-Dimensional Stochastic Optimization [2.0657831823662574]
When the nonconvex problem is complicated by stochasticity, the sample complexity of stochastic first-order methods may depend linearly on the problem dimension, which is undesirable for large-scale problems.
Our algorithms admit sample-complexity estimates that depend on the dimension only through a $\log d$ factor.
We prove that DISFOM employing variance reduction sharpens the sample complexity to $\mathcal{O}\left(\frac{\log d}{\epsilon^4}\right)$.
arXiv Detail & Related papers (2024-06-27T18:38:42Z) - Double Variance Reduction: A Smoothing Trick for Composite Optimization Problems without First-Order Gradient [40.22217106270146]
Variance reduction techniques are designed to decrease the sampling variance, thereby accelerating convergence rates of first-order (FO) and zeroth-order (ZO) optimization methods.
In composite optimization problems, ZO methods encounter an additional variance, the coordinate-wise variance, which stems from the random gradient estimation.
This paper proposes the Zeroth-order Proximal Double Variance Reduction (ZPDVR) method, which utilizes the averaging trick to reduce both sampling and coordinate-wise variances.
arXiv Detail & Related papers (2024-05-28T02:27:53Z) - Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks [54.177130905659155]
Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable space to model functions by neural networks.
In this paper, we study a suitable function space for over-parameterized two-layer neural networks with bounded norms.
arXiv Detail & Related papers (2024-04-29T15:04:07Z) - Minimally Informed Linear Discriminant Analysis: training an LDA model
with unlabelled data [51.673443581397954]
We show that it is possible to compute the exact projection vector from LDA models based on unlabelled data.
We show that the MILDA projection vector can be computed in a closed form with a computational cost comparable to LDA.
arXiv Detail & Related papers (2023-10-17T09:50:31Z) - Sparse Linear Centroid-Encoder: A Convex Method for Feature Selection [1.057079240576682]
We present Sparse Linear Centroid-Encoder (SLCE), a novel feature selection technique.
The algorithm uses a linear network to reconstruct a sample as its class centroid while performing feature selection at the same time.
arXiv Detail & Related papers (2023-06-07T23:07:55Z) - A Bi-level Nonlinear Eigenvector Algorithm for Wasserstein Discriminant
Analysis [3.4806267677524896]
Wasserstein discriminant analysis (WDA) is a linear dimensionality reduction method.
WDA can account for both global and local interconnections between data classes.
A bi-level nonlinear eigenvector algorithm (WDA-nepv) is presented.
arXiv Detail & Related papers (2022-11-21T22:40:43Z) - Stochastic regularized majorization-minimization with weakly convex and
multi-convex surrogates [0.0]
We show that the first-order optimality gap of the proposed algorithm decays at the stated rate of the expected loss under a dependent data setting.
We obtain the first convergence guarantees for various methods under this dependent data setting.
arXiv Detail & Related papers (2022-01-05T15:17:35Z) - Two-dimensional Bhattacharyya bound linear discriminant analysis with
its applications [8.392689203476955]
The recently proposed L2-norm linear discriminant analysis criterion via Bhattacharyya error bound estimation (L2BLDA) is an effective improvement of linear discriminant analysis (LDA) for feature extraction.
We extend L2BLDA to a two-dimensional Bhattacharyya bound linear discriminant analysis (2DBLDA).
The construction of 2DBLDA avoids the small sample size problem while also possessing robustness, and it can be solved through a simple standard eigenvalue decomposition.
arXiv Detail & Related papers (2020-11-11T01:56:42Z) - Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = \langle X, w^* \rangle + \epsilon$.
We propose estimators for this problem under two settings: (i) $X$ is $L_4$-$L_2$ hypercontractive, $\mathbb{E}[XX^\top]$ has bounded condition number and $\epsilon$ has bounded variance, and (ii) $X$ is sub-Gaussian with identity second moment and $\epsilon$ is
arXiv Detail & Related papers (2020-07-16T06:44:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.