Beyond PCA: A Probabilistic Gram-Schmidt Approach to Feature Extraction
- URL: http://arxiv.org/abs/2311.09386v2
- Date: Tue, 6 Feb 2024 03:42:12 GMT
- Title: Beyond PCA: A Probabilistic Gram-Schmidt Approach to Feature Extraction
- Authors: Bahram Yaghooti, Netanel Raviv, Bruno Sinopoli
- Abstract summary: Linear feature extraction in the presence of nonlinear dependencies among the data is a fundamental challenge in unsupervised learning.
We propose using a probabilistic Gram-Schmidt type orthogonalization process in order to detect and map out redundant dimensions.
We provide simulation results on synthetic and real-world datasets which show improved performance over PCA and state-of-the-art linear feature extraction algorithms.
- Score: 8.287206589886878
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Linear feature extraction in the presence of nonlinear dependencies among the
data is a fundamental challenge in unsupervised learning. We propose using a
probabilistic Gram-Schmidt (GS) type orthogonalization process in order to
detect and map out redundant dimensions. Specifically, by applying the GS
process over a family of functions which presumably captures the nonlinear
dependencies in the data, we construct a series of covariance matrices that can
either be used to identify new large-variance directions, or to remove those
dependencies from the principal components. In the former case, we provide
information-theoretic guarantees in terms of entropy reduction. In the latter,
we prove that under certain assumptions the resulting algorithms detect and
remove nonlinear dependencies whenever those dependencies lie in the linear
span of the chosen function family. Both proposed methods extract linear
features from the data while removing nonlinear redundancies. We provide
simulation results on synthetic and real-world datasets which show improved
performance over PCA and state-of-the-art linear feature extraction algorithms,
both in terms of variance maximization of the extracted features, and in terms
of improved performance of classification algorithms. Additionally, our methods
are comparable to, and often outperform, the nonlinear method of kernel PCA.
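The procedure described in the abstract — extract a large-variance direction, then use a Gram-Schmidt-type projection to remove functions of the already-extracted features before forming the next covariance matrix — can be illustrated compactly. Below is a minimal sketch under assumed choices (a low-degree polynomial function family, least-squares projection, and hypothetical function names); it illustrates the idea, not the authors' implementation.

```python
# Minimal, hypothetical sketch of a probabilistic Gram-Schmidt style
# extractor: after each direction is found, the data are residualized
# against a function family (low-degree polynomials of the scores found
# so far) before the next covariance matrix is formed.
import numpy as np

def probabilistic_gs_features(X, k=2, degree=2):
    """Extract k linear features while removing polynomial redundancies."""
    X = X - X.mean(axis=0)                    # center the data
    residual = X.copy()
    directions, scores = [], []
    for _ in range(k):
        cov = residual.T @ residual / len(residual)
        w = np.linalg.eigh(cov)[1][:, -1]     # largest-variance direction
        t = X @ w                             # score of the new feature
        directions.append(w)
        scores.append(t)
        # Function family: powers of the scores extracted so far.
        F = np.column_stack([s ** p for s in scores
                             for p in range(1, degree + 1)])
        # Gram-Schmidt step in function space: project the residual onto
        # the span of the family (least squares) and subtract.
        coef, *_ = np.linalg.lstsq(F, residual, rcond=None)
        residual = residual - F @ coef
    return np.array(directions).T, np.column_stack(scores)

# Toy usage: the second coordinate is (noisily) the square of the first,
# a nonlinear redundancy that the residualization is meant to remove.
rng = np.random.default_rng(0)
x1 = rng.normal(size=500)
X = np.column_stack([x1,
                     x1 ** 2 + 0.1 * rng.normal(size=500),
                     rng.normal(size=500)])
W, T = probabilistic_gs_features(X, k=2)
print(W.shape, T.shape)                       # (3, 2) (500, 2)
```

In the toy usage, plain PCA would keep the redundant squared coordinate as part of a large-variance direction, whereas the residualization step is designed to discount it.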
Related papers
- Kernel Alignment for Unsupervised Feature Selection via Matrix Factorization [8.020732438595905]
Unsupervised feature selection has been proven effective in alleviating the so-called curse of dimensionality.
Most existing matrix factorization-based unsupervised feature selection methods are built upon subspace learning.
In this paper, we construct a model by integrating kernel functions and kernel alignment, which can be equivalently characterized as a matrix factorization problem.
By doing so, our model can learn both linear and nonlinear similarity information and automatically generate the most appropriate kernel (a minimal sketch of the kernel-alignment score follows this entry).
arXiv Detail & Related papers (2024-03-13T20:35:44Z)
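The centered kernel-alignment score at the heart of the entry above can be sketched directly; the RBF kernel and the fixed feature subset below are illustrative assumptions, not the paper's model.

```python
# Sketch of centered kernel alignment: the similarity score between two
# kernel matrices that alignment-based feature selection maximizes.
import numpy as np

def center(K):
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    return H @ K @ H

def alignment(K1, K2):
    """Centered kernel alignment; 1 means perfectly aligned."""
    K1c, K2c = center(K1), center(K2)
    return (K1c * K2c).sum() / (np.linalg.norm(K1c) * np.linalg.norm(K2c))

def rbf(A, gamma=0.5):
    sq = ((A[:, None, :] - A[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

# Toy usage: score a candidate feature subset {0, 1} against the kernel
# built from all features (the kernel choice here is an assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
print(alignment(rbf(X), rbf(X[:, :2])))
```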
- Pessimistic Nonlinear Least-Squares Value Iteration for Offline Reinforcement Learning [58.962016644796]
We propose an oracle-efficient algorithm, dubbed Pessimistic Nonlinear Least-Squares Value Iteration (PNLSVI), for offline RL with non-linear function approximation.
Our algorithm enjoys a regret bound that has a tight dependency on the function class complexity and achieves minimax optimal instance-dependent regret when specialized to linear function approximation.
arXiv Detail & Related papers (2023-10-02T17:42:01Z)
- Nonlinear Feature Aggregation: Two Algorithms driven by Theory [45.3190496371625]
Real-world machine learning applications are characterized by a huge number of features, leading to computational and memory issues.
We propose a dimensionality reduction algorithm (NonLinCFA) which aggregates non-linear transformations of features with a generic aggregation function.
We also test the algorithms on synthetic and real-world datasets, performing regression and classification tasks, showing competitive performance.
arXiv Detail & Related papers (2023-06-19T19:57:33Z)
- Functional Nonlinear Learning [0.0]
We propose a functional nonlinear learning (FunNoL) method to represent multivariate functional data in a lower-dimensional feature space.
We show that FunNoL provides satisfactory curve classification and reconstruction regardless of data sparsity.
arXiv Detail & Related papers (2022-06-22T23:47:45Z)
- Decoupling multivariate functions using a nonparametric filtered tensor decomposition [0.29360071145551075]
Decoupling techniques aim to provide an alternative representation of the nonlinearity.
The so-called decoupled form is often a more efficient parameterisation of the relationship while being highly structured, favouring interpretability.
In this work, two new algorithms, based on filtered tensor decompositions of first-order derivative information, are introduced.
arXiv Detail & Related papers (2022-05-23T09:34:17Z)
- Unfolding Projection-free SDP Relaxation of Binary Graph Classifier via GDPA Linearization [59.87663954467815]
Algorithm unfolding creates an interpretable and parsimonious neural network architecture by implementing each iteration of a model-based algorithm as a neural layer.
In this paper, leveraging a recent linear algebraic theorem called Gershgorin disc perfect alignment (GDPA), we unroll a projection-free algorithm for the semi-definite programming relaxation (SDR) of a binary graph classifier.
Experimental results show that our unrolled network outperformed pure model-based graph classifiers, and achieved performance comparable to pure data-driven networks while using far fewer parameters (a schematic unrolling sketch follows this entry).
arXiv Detail & Related papers (2021-09-10T07:01:15Z)
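Algorithm unfolding itself is easy to demonstrate on a textbook case. The sketch below unrolls ISTA for sparse coding — a familiar stand-in, not the paper's GDPA-based SDP solver — where each iteration becomes one "layer" whose step size and threshold would normally be trained end to end.

```python
# Schematic of algorithm unfolding: each iteration of a model-based
# solver (ISTA for sparse coding, as a stand-in) becomes one "layer"
# with its own parameters, which would be learned from data.
import numpy as np

def soft_threshold(v, theta):
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def unrolled_ista(y, D, steps, thetas):
    """Unrolled ISTA layers; steps/thetas play the role of learned weights."""
    x = np.zeros(D.shape[1])
    for alpha, theta in zip(steps, thetas):   # one loop turn = one layer
        x = soft_threshold(x - alpha * D.T @ (D @ x - y), theta)
    return x

rng = np.random.default_rng(0)
D = rng.normal(size=(20, 50))
D /= np.linalg.norm(D, axis=0)                # unit-norm dictionary atoms
x_true = np.zeros(50)
x_true[[3, 17]] = [1.0, -2.0]
y = D @ x_true
L = 10
x_hat = unrolled_ista(y, D, steps=[0.1] * L, thetas=[0.05] * L)
print(np.round(x_hat[[3, 17]], 2))            # entries move toward 1.0, -2.0
```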
- Sparse PCA via $l_{2,p}$-Norm Regularization for Unsupervised Feature Selection [138.97647716793333]
We propose a simple and efficient unsupervised feature selection method by combining reconstruction error with $l_{2,p}$-norm regularization.
We present an efficient optimization algorithm to solve the proposed unsupervised model, and analyse the convergence and computational complexity of the algorithm theoretically.
arXiv Detail & Related papers (2020-12-29T04:08:38Z)
- Sparse Quantized Spectral Clustering [85.77233010209368]
We exploit tools from random matrix theory to make precise statements about how the eigenspectrum of a matrix changes under entry-wise nonlinear transformations such as sparsification and quantization.
We show that very little change occurs in the informative eigenstructure even under drastic sparsification/quantization (a toy numerical illustration follows this entry).
arXiv Detail & Related papers (2020-10-03T15:58:07Z)
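As a toy numerical check of the claim in the entry above, one can plant a rank-one spike in a noisy symmetric matrix and compare the top eigenvectors before and after 1-bit quantization; the spiked-matrix model is an illustrative assumption, not the paper's exact setting.

```python
# Toy check: the top eigenvector of a spiked symmetric matrix survives
# drastic 1-bit quantization. The spiked model is an assumption here.
import numpy as np

rng = np.random.default_rng(0)
n = 400
u = rng.choice([-1.0, 1.0], size=n) / np.sqrt(n)    # planted signal
K = 3.0 * np.outer(u, u) + rng.normal(size=(n, n)) / np.sqrt(n)
K = (K + K.T) / 2                                   # symmetrize the noise

Kq = np.sign(K)                                     # 1-bit quantization
top = lambda M: np.linalg.eigh(M)[1][:, -1]         # leading eigenvector
overlap = abs(top(K) @ top(Kq))
print(f"eigenvector overlap after quantization: {overlap:.2f}")
```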
- Understanding Implicit Regularization in Over-Parameterized Single Index Model [55.41685740015095]
We design regularization-free algorithms for the high-dimensional single index model.
We provide theoretical guarantees for the induced implicit regularization phenomenon.
arXiv Detail & Related papers (2020-07-16T13:27:47Z)