Supervised Quadratic Feature Analysis: Information Geometry Approach for Dimensionality Reduction
- URL: http://arxiv.org/abs/2502.00168v3
- Date: Fri, 30 May 2025 16:27:22 GMT
- Title: Supervised Quadratic Feature Analysis: Information Geometry Approach for Dimensionality Reduction
- Authors: Daniel Herrera-Esposito, Johannes Burge
- Abstract summary: Supervised dimensionality reduction aims to map labeled data to a low-dimensional feature space while maximizing class discriminability. We present Supervised Quadratic Feature Analysis (SQFA), a linear dimensionality reduction method that maximizes Fisher-Rao distances between class distributions.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Supervised dimensionality reduction aims to map labeled data to a low-dimensional feature space while maximizing class discriminability. Directly computing discriminability is often impractical, so an alternative approach is to learn features that maximize a distance or dissimilarity measure between classes. The Fisher-Rao distance is an important information geometry distance in statistical manifolds. It is induced by the Fisher information metric, a tool widely used for understanding neural representations. Despite its theoretical and practical appeal, Fisher-Rao distances between classes have not been used as a maximization objective in supervised feature learning. Here, we present Supervised Quadratic Feature Analysis (SQFA), a linear dimensionality reduction method that maximizes Fisher-Rao distances between class distributions by exploiting the information geometry of the symmetric positive definite manifold. SQFA maximizes distances using first- and second-order statistics, and its features allow for quadratic discriminability (i.e. QDA performance) matching or surpassing state-of-the-art methods on real-world datasets. We theoretically motivate Fisher-Rao distances as a proxy for quadratic discriminability, and compare their performance to that of other popular distances (e.g. Wasserstein distances). SQFA provides a flexible state-of-the-art method for dimensionality reduction. Its successful use of Fisher-Rao distances between classes motivates future research directions.
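As a hedged illustration of the idea, the sketch below learns a projection that spreads class second-moment matrices apart on the SPD manifold, using the affine-invariant distance as a stand-in for the paper's Fisher-Rao objective. The function names, the crude zeroth-order optimizer, and that metric substitution are all assumptions, not the authors' implementation.

```python
# Hypothetical sketch of an SQFA-style objective (not the authors' code):
# learn a linear projection W that spreads class second-moment matrices
# apart on the SPD manifold. The affine-invariant SPD distance is used
# here as a tractable stand-in for the Fisher-Rao objective (assumption).
import numpy as np
from scipy.linalg import sqrtm, logm

def spd_distance(A, B):
    """Affine-invariant distance: ||log(A^{-1/2} B A^{-1/2})||_F."""
    A_inv_sqrt = np.linalg.inv(sqrtm(A))
    M = A_inv_sqrt @ B @ A_inv_sqrt
    return np.linalg.norm(np.real(logm(M)), "fro")

def objective(W, X, y, eps=1e-6):
    """Sum of pairwise SPD distances between projected class statistics."""
    k = W.shape[1]
    stats = []
    for c in np.unique(y):
        Z = X[y == c] @ W                                 # project class samples
        stats.append(Z.T @ Z / len(Z) + eps * np.eye(k))  # second moments
    return sum(spd_distance(A, B)
               for i, A in enumerate(stats) for B in stats[i + 1:])

def fit(X, y, k, iters=200, step=0.05, seed=0):
    """Crude zeroth-order ascent; a real implementation would use
    Riemannian gradient methods on the projection matrix."""
    rng = np.random.default_rng(seed)
    W = np.linalg.qr(rng.standard_normal((X.shape[1], k)))[0]
    best = objective(W, X, y)
    for _ in range(iters):
        W_try = np.linalg.qr(W + step * rng.standard_normal(W.shape))[0]
        val = objective(W_try, X, y)
        if val > best:
            W, best = W_try, val
    return W
```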
Related papers
- Geometry-aware Distance Measure for Diverse Hierarchical Structures in Hyperbolic Spaces [48.948334221681684]
We propose a geometry-aware distance measure in hyperbolic spaces, which dynamically adapts to varying hierarchical structures. Our approach consistently outperforms learning methods that use fixed distance measures. Visualization shows clearer class boundaries and improved prototype separation in hyperbolic spaces.
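For background, the fixed distance that such geometry-aware measures adapt is the standard Poincaré-ball metric; a minimal sketch (assuming curvature -1 and points strictly inside the unit ball) is:

```python
# Standard Poincare-ball distance, the fixed measure that geometry-aware
# variants adapt; shown only as background, not the paper's method.
import numpy as np

def poincare_distance(u, v):
    sq = np.sum((u - v) ** 2)
    denom = (1.0 - np.sum(u ** 2)) * (1.0 - np.sum(v ** 2))
    return np.arccosh(1.0 + 2.0 * sq / denom)

u, v = np.array([0.1, 0.2]), np.array([0.4, -0.3])
print(poincare_distance(u, v))
```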
arXiv Detail & Related papers (2025-06-23T11:43:39Z) - A Convex formulation for linear discriminant analysis [1.3124513975412255]
We present a supervised dimensionality reduction technique called Convex Linear Discriminant Analysis (ConvexLDA)
We show that ConvexLDA outperforms several popular linear discriminant analysis (LDA)-based methods on a range of high-dimensional biological data, image data sets, etc.
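For reference, the classical LDA dimensionality reduction that ConvexLDA is compared against is a few lines with scikit-learn; the dataset here is only a placeholder:

```python
# Baseline linear discriminant analysis for supervised dimensionality
# reduction (the classical method ConvexLDA builds on); illustrative only.
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)
Z = LinearDiscriminantAnalysis(n_components=2).fit_transform(X, y)
print(Z.shape)  # (150, 2): at most n_classes - 1 discriminant directions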
arXiv Detail & Related papers (2025-03-17T18:17:49Z) - Approximation and bounding techniques for the Fisher-Rao distances between parametric statistical models [7.070726553564701]
We consider several numerically robust approximation and bounding techniques for the Fisher-Rao distances. In particular, we obtain a generic method to guarantee an arbitrarily small additive error on the approximation. We propose two new distances based on the Fisher-Rao lengths of curves that serve as proxies for Fisher-Rao geodesics.
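As a concrete anchor for a case where no approximation is needed, the univariate Gaussian Fisher-Rao distance has a closed form via the hyperbolic half-plane isometry; the sketch below uses that standard formula (with the common sqrt(2) scaling convention for the metric ds^2 = (dmu^2 + 2 dsigma^2)/sigma^2), which the paper's techniques generalize beyond:

```python
# Closed-form Fisher-Rao distance between two univariate Gaussians
# N(mu1, s1^2) and N(mu2, s2^2), via the hyperbolic half-plane isometry.
# Standard background formula; the paper targets models without one.
import numpy as np

def fisher_rao_gaussian(mu1, s1, mu2, s2):
    num = (mu1 - mu2) ** 2 / 2.0 + (s1 - s2) ** 2
    return np.sqrt(2.0) * np.arccosh(1.0 + num / (2.0 * s1 * s2))

print(fisher_rao_gaussian(0.0, 1.0, 0.0, 2.0))  # sqrt(2)*ln(2) ~ 0.980
```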
arXiv Detail & Related papers (2024-03-15T08:05:16Z) - Synergistic eigenanalysis of covariance and Hessian matrices for enhanced binary classification [72.77513633290056]
We present a novel approach that combines the eigenanalysis of a covariance matrix evaluated on a training set with that of a Hessian matrix evaluated on a deep learning model.
Our method captures intricate patterns and relationships, enhancing classification performance.
arXiv Detail & Related papers (2024-02-14T16:10:42Z) - Enhancing Person Re-Identification through Tensor Feature Fusion [0.5562294018150907]
We present a novel person re-identification (PRe-ID) system based on tensor feature representation and multilinear subspace learning.
Our approach utilizes pretrained CNNs for high-level feature extraction.
The Tensor Cross-View Quadratic Discriminant Analysis (TXQDA) algorithm is used for multilinear subspace learning.
arXiv Detail & Related papers (2023-12-16T15:04:07Z) - Approximation Theory, Computing, and Deep Learning on the Wasserstein Space [0.5735035463793009]
We address the challenge of approximating functions in infinite-dimensional spaces from finite samples.
Our focus is on the Wasserstein distance function, which serves as a relevant example.
We adopt three machine learning-based approaches to define functional approximants.
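As a minimal example of the target object, the 1-Wasserstein distance between two empirical 1-D measures is computable exactly from sorted samples, e.g. with scipy; the data here is illustrative only:

```python
# The 1-Wasserstein distance between two empirical 1-D measures, the kind
# of function the paper approximates from finite samples; scipy computes
# the 1-D case exactly from sorted samples.
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)
a = rng.normal(0.0, 1.0, size=500)
b = rng.normal(1.0, 1.5, size=500)
print(wasserstein_distance(a, b))  # empirical W1 from 500 samples each
```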
arXiv Detail & Related papers (2023-10-30T13:59:47Z) - Nonlinear Feature Aggregation: Two Algorithms driven by Theory [45.3190496371625]
Real-world machine learning applications are characterized by a huge number of features, leading to computational and memory issues.
We propose a dimensionality reduction algorithm (NonLinCFA) which aggregates non-linear transformations of features with a generic aggregation function.
We also test the algorithms on synthetic and real-world datasets, performing regression and classification tasks, showing competitive performances.
arXiv Detail & Related papers (2023-06-19T19:57:33Z) - Interpretable Linear Dimensionality Reduction based on Bias-Variance Analysis [45.3190496371625]
We propose a principled dimensionality reduction approach that maintains the interpretability of the resulting features.
In this way, all features are considered, the dimensionality is reduced, and interpretability is preserved.
arXiv Detail & Related papers (2023-03-26T14:30:38Z) - Constructing Balance from Imbalance for Long-tailed Image Recognition [50.6210415377178]
The imbalance between majority (head) classes and minority (tail) classes severely skews data-driven deep neural networks.
Previous methods tackle data imbalance from the viewpoints of data distribution, feature space, and model design.
We propose a concise paradigm by progressively adjusting label space and dividing the head classes and tail classes.
Our proposed model also provides a feature evaluation method and paves the way for long-tailed feature learning.
arXiv Detail & Related papers (2022-08-04T10:22:24Z) - Functional Nonlinear Learning [0.0]
We propose a functional nonlinear learning (FunNoL) method to represent multivariate functional data in a lower-dimensional feature space.
We show that FunNoL provides satisfactory curve classification and reconstruction regardless of data sparsity.
arXiv Detail & Related papers (2022-06-22T23:47:45Z) - On Hypothesis Transfer Learning of Functional Linear Models [8.557392136621894]
We study transfer learning (TL) for functional linear regression (FLR) under the Reproducing Kernel Hilbert Space (RKHS) framework.
We measure the similarity across tasks using RKHS distance, allowing the type of information being transferred to be tied to the properties of the imposed RKHS.
Two algorithms are proposed: one conducts the transfer when positive sources are known, while the other leverages aggregation to achieve robust transfer without prior information about the sources.
arXiv Detail & Related papers (2022-06-09T04:50:16Z) - Hyperbolic Vision Transformers: Combining Improvements in Metric
Learning [116.13290702262248]
We propose a new hyperbolic-based model for metric learning.
At the core of our method is a vision transformer with output embeddings mapped to hyperbolic space.
We evaluate the proposed model with six different formulations on four datasets.
arXiv Detail & Related papers (2022-03-21T09:48:23Z) - Learning Linearized Assignment Flows for Image Labeling [70.540936204654]
We introduce a novel algorithm for estimating optimal parameters of linearized assignment flows for image labeling.
We show how to efficiently evaluate this formula using a Krylov subspace and a low-rank approximation.
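The underlying primitive, applying a matrix exponential to a vector without forming the dense exponential, is available off the shelf; the sketch below uses scipy's expm_multiply (a truncated-Taylor scheme rather than a Krylov method, but the same exp(A) @ v interface) on a random sparse matrix, purely as a generic illustration and not the paper's estimation code:

```python
# Evaluating exp(A) @ v without forming the dense matrix exponential,
# the kind of primitive the paper accelerates with a Krylov subspace.
# scipy's expm_multiply uses a truncated-Taylor scheme instead (assumption:
# a random sparse A stands in for the linearized assignment-flow operator).
import numpy as np
from scipy.sparse import random as sparse_random
from scipy.sparse.linalg import expm_multiply

rng = np.random.default_rng(0)
A = sparse_random(1000, 1000, density=0.01, random_state=0, format="csr")
v = rng.standard_normal(1000)
w = expm_multiply(A, v)  # exp(A) @ v via repeated sparse mat-vecs
print(w[:3])
```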
arXiv Detail & Related papers (2021-08-02T13:38:09Z) - Sparse Universum Quadratic Surface Support Vector Machine Models for
Binary Classification [0.0]
We design kernel-free Universum quadratic surface support vector machine models.
We propose an L1-norm regularized version that is beneficial for detecting potential sparsity patterns in the Hessian of the quadratic surface.
We conduct numerical experiments on both artificial and public benchmark data sets to demonstrate the feasibility and effectiveness of the proposed models.
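A crude stand-in for a kernel-free quadratic surface classifier (not the paper's Universum model) is an L1-regularized linear SVM on explicit degree-2 features, so the decision boundary is quadratic in the inputs and the L1 penalty sparsifies the quadratic (Hessian-like) coefficients:

```python
# Stand-in for a kernel-free quadratic surface SVM (not the paper's
# Universum model): a linear L1-regularized SVM on explicit degree-2
# features gives a quadratic boundary with sparse quadratic coefficients.
from sklearn.datasets import make_classification
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=400, n_features=6, random_state=0)
clf = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False),
    StandardScaler(),
    LinearSVC(penalty="l1", loss="squared_hinge", dual=False, C=0.5,
              max_iter=5000),
)
print(clf.fit(X, y).score(X, y))
```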
arXiv Detail & Related papers (2021-04-03T07:40:30Z) - Theoretical Insights Into Multiclass Classification: A High-dimensional
Asymptotic View [82.80085730891126]
We provide the first modern, precise analysis of linear multiclass classification.
Our analysis reveals that the classification accuracy is highly distribution-dependent.
The insights gained may pave the way for a precise understanding of other classification algorithms.
arXiv Detail & Related papers (2020-11-16T05:17:29Z) - Graph Embedding with Data Uncertainty [113.39838145450007]
Spectral-based subspace learning is a common data preprocessing step in many machine learning pipelines.
Most subspace learning methods do not take into consideration possible measurement inaccuracies or artifacts that can lead to data with high uncertainty.
arXiv Detail & Related papers (2020-09-01T15:08:23Z) - Robust Similarity and Distance Learning via Decision Forests [8.587164648430251]
We propose a novel decision forest algorithm for the task of distance learning, which we call Similarity and Metric Random Forests (SMERF).
Its ability to approximate arbitrary distances and identify important features is empirically demonstrated on simulated data sets.
arXiv Detail & Related papers (2020-07-27T20:17:42Z) - Rethink Maximum Mean Discrepancy for Domain Adaptation [77.2560592127872]
This paper theoretically proves that minimizing the Maximum Mean Discrepancy is equivalent to maximizing the source and target intra-class distances while jointly minimizing their variance with some implicit weights, so that feature discriminability degrades.
Experiments on several benchmark datasets not only validate the theoretical results but also demonstrate that our approach substantially outperforms comparative state-of-the-art methods.
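The quantity being analyzed is straightforward to estimate; the sketch below is the textbook biased estimator of the squared MMD with an RBF kernel, not the paper's modified objective:

```python
# Standard biased estimator of the squared Maximum Mean Discrepancy with
# an RBF kernel (textbook estimator, not the paper's modified objective).
import numpy as np

def mmd2_rbf(X, Y, gamma=1.0):
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    return k(X, X).mean() + k(Y, Y).mean() - 2.0 * k(X, Y).mean()

rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(200, 3))
Y = rng.normal(0.5, 1.0, size=(200, 3))
print(mmd2_rbf(X, Y))
```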
arXiv Detail & Related papers (2020-07-01T18:25:10Z) - High-Dimensional Quadratic Discriminant Analysis under Spiked Covariance Model [101.74172837046382]
We propose a novel quadratic classification technique, the parameters of which are chosen such that the Fisher discriminant ratio is maximized.
Numerical simulations show that the proposed classifier not only outperforms the classical R-QDA for both synthetic and real data but also requires lower computational complexity.
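For context, regularized QDA in the spirit of the R-QDA baseline is available in scikit-learn via the reg_param covariance shrinkage; the paper's spiked-covariance estimator differs and is not reproduced here:

```python
# Regularized QDA in the style of the classical R-QDA baseline;
# scikit-learn's reg_param shrinks the class covariance estimates
# (the paper's spiked-covariance approach is different).
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

X, y = make_classification(n_samples=300, n_features=20, random_state=1)
qda = QuadraticDiscriminantAnalysis(reg_param=0.1).fit(X, y)
print(qda.score(X, y))
```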
arXiv Detail & Related papers (2020-06-25T12:00:26Z) - On Projection Robust Optimal Transport: Sample Complexity and Model Misspecification [101.0377583883137]
Projection robust (PR) OT seeks to maximize the OT cost between two measures by choosing a $k$-dimensional subspace onto which they can be projected.
Our first contribution is to establish several fundamental statistical properties of PR Wasserstein distances.
Next, we propose the integral PR Wasserstein (IPRW) distance as an alternative to the PRW distance, by averaging rather than optimizing on subspaces.
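For k = 1 the PRW idea reduces to maximizing the 1-D Wasserstein distance over projection directions; a brute-force sketch over random unit directions (the paper optimizes over k-dimensional subspaces, and the IPRW variant would average rather than maximize) is:

```python
# Brute-force sketch of projection robust Wasserstein for k = 1: maximize
# the 1-D Wasserstein distance over random unit directions (illustration
# only; the paper optimizes over k-dimensional subspaces).
import numpy as np
from scipy.stats import wasserstein_distance

def prw_1d(X, Y, n_dirs=500, seed=0):
    rng = np.random.default_rng(seed)
    best = 0.0
    for _ in range(n_dirs):
        u = rng.standard_normal(X.shape[1])
        u /= np.linalg.norm(u)
        best = max(best, wasserstein_distance(X @ u, Y @ u))
    return best

rng = np.random.default_rng(1)
X = rng.normal(0.0, 1.0, size=(300, 5))
Y = rng.normal(0.0, 1.0, size=(300, 5)) + np.array([2, 0, 0, 0, 0])
print(prw_1d(X, Y))
```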
arXiv Detail & Related papers (2020-06-22T14:35:33Z) - Saliency-based Weighted Multi-label Linear Discriminant Analysis [101.12909759844946]
We propose a new variant of Linear Discriminant Analysis (LDA) to solve multi-label classification tasks.
The proposed method is based on a probabilistic model for defining the weights of individual samples.
The Saliency-based weighted Multi-label LDA approach is shown to lead to performance improvements in various multi-label classification problems.
arXiv Detail & Related papers (2020-04-08T19:40:53Z) - Convex Geometry and Duality of Over-parameterized Neural Networks [70.15611146583068]
We develop a convex analytic approach to analyze finite width two-layer ReLU networks.
We show that an optimal solution to the regularized training problem can be characterized as extreme points of a convex set.
In higher dimensions, we show that the training problem can be cast as a finite dimensional convex problem with infinitely many constraints.
arXiv Detail & Related papers (2020-02-25T23:05:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.