From explained variance of correlated components to PCA without
orthogonality constraints
- URL: http://arxiv.org/abs/2402.04692v1
- Date: Wed, 7 Feb 2024 09:32:32 GMT
- Title: From explained variance of correlated components to PCA without
orthogonality constraints
- Authors: Marie Chavent (IMB), Guy Chavent
- Abstract summary: Block Principal Component Analysis (Block PCA) of a data matrix A is difficult to use for the design of sparse PCA by $\ell_1$ regularization.
We introduce new objective matrix functions expvar(Y) which measure the part of the variance of the data matrix A explained by correlated components Y = AZ.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Block Principal Component Analysis (Block PCA) of a data matrix A, where
loadings Z are determined by maximization of $\|AZ\|^2$ over unit-norm orthogonal
loadings, is difficult to use for the design of sparse PCA by $\ell_1$ regularization,
due to the difficulty of taking care of both the orthogonality constraint on
loadings and the non-differentiable $\ell_1$ penalty. Our objective in this paper is
to relax the orthogonality constraint on loadings by introducing new objective
functions expvar(Y) which measure the part of the variance of the data matrix A
explained by correlated components Y = AZ. We therefore first propose a
comprehensive study of the mathematical and numerical properties of expvar(Y)
for two existing definitions (Zou et al. [2006], Shen and Huang [2008]) and four
new ones. We then show that only two of these explained variances are fit for
use as objective functions in block PCA formulations of A rid of orthogonality
constraints.
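A minimal NumPy sketch of one of the existing definitions the paper studies, the QR-based adjusted variance of Zou et al. [2006] (the function name and toy data below are illustrative, not the authors' code):

```python
import numpy as np

def expvar_qr(A, Z):
    """Adjusted variance of correlated components Y = A @ Z, in the
    style of Zou et al. [2006]: factor Y = QR and credit component j
    only with the variance R[j, j]**2 not already explained by the
    components preceding it."""
    Y = A @ Z
    _, R = np.linalg.qr(Y)
    return np.sum(np.diag(R) ** 2)

# Sanity check: with the leading PCA loadings, the components are
# uncorrelated and expvar_qr reduces to sigma_1^2 + ... + sigma_k^2.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 8))
A -= A.mean(axis=0)
_, s, Vt = np.linalg.svd(A, full_matrices=False)
Z = Vt[:3].T
print(np.isclose(expvar_qr(A, Z), np.sum(s[:3] ** 2)))  # True
```

For genuinely correlated components the QR step matters: the sum of squared diagonal entries of R is strictly smaller than $\|Y\|_F^2$, which is exactly what keeps overlapping components from double-counting variance.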
Related papers
- $σ$-PCA: a building block for neural learning of identifiable linear transformations [0.0]
$\sigma$-PCA is a method that formulates a unified model for linear and nonlinear PCA.
Nonlinear PCA can be seen as a method that maximizes both variance and statistical independence.
arXiv Detail & Related papers (2023-11-22T18:34:49Z)
- Two derivations of Principal Component Analysis on datasets of distributions [15.635370717421017]
We formulate Principal Component Analysis (PCA) over datasets consisting not of points but of distributions.
Just like the usual PCA on points can be equivalently derived via a variance-maximization principle and via a minimization of reconstruction error, we derive a closed-form solution for distributional PCA.
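For the usual PCA on points, that equivalence is easy to verify numerically; a minimal check (toy data, my own illustration, not the paper's distributional setting):

```python
import numpy as np

# The point-PCA equivalence invoked above, checked numerically: the
# variance-maximizing direction (top eigenvector of the covariance) and
# the direction minimizing rank-1 reconstruction error (top right
# singular vector) coincide up to sign.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 5))
X -= X.mean(axis=0)

_, eigvecs = np.linalg.eigh(X.T @ X / (len(X) - 1))
v_var = eigvecs[:, -1]                   # maximizes projected variance

_, _, Vt = np.linalg.svd(X, full_matrices=False)
v_rec = Vt[0]                            # minimizes rank-1 reconstruction error

print(np.isclose(abs(v_var @ v_rec), 1.0))  # True: same axis
```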
arXiv Detail & Related papers (2023-06-23T14:00:14Z)
- Sparse PCA With Multiple Components [2.5382095320488673]
Sparse Principal Component Analysis (sPCA) is a technique for obtaining combinations of features that explain variance of high-dimensional datasets in an interpretable manner.
Most existing sPCA methods do not guarantee optimality, let alone provable optimality, of the resulting solution when we seek multiple sparse PCs.
We propose exact methods and rounding mechanisms that, together, obtain solutions with bound gaps on the order of 0%-15% for real-world datasets (see the sketch below).
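The rounding idea can be illustrated with a much simpler heuristic than the paper's certified mechanisms (a minimal sketch of my own, not the authors' method):

```python
import numpy as np

def round_to_sparse_pc(A, k):
    """Heuristic rounding for one sparse PC (illustrative only): take
    the dense leading right singular vector, keep its k largest-
    magnitude entries, renormalize, and report the variance the
    rounded loading explains."""
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    v = Vt[0]
    support = np.argsort(np.abs(v))[-k:]     # indices of the k largest entries
    z = np.zeros_like(v)
    z[support] = v[support]
    z /= np.linalg.norm(z)
    return z, np.linalg.norm(A @ z) ** 2     # sparse loading, explained variance

rng = np.random.default_rng(2)
A = rng.standard_normal((100, 20))
A -= A.mean(axis=0)
z, var = round_to_sparse_pc(A, k=5)
print(np.count_nonzero(z), var)              # 5 nonzeros, variance <= sigma_1^2
```

Comparing the rounded loading's variance against an upper bound such as $\sigma_1^2$ yields a bound gap in the spirit of the 0%-15% figures quoted above.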
arXiv Detail & Related papers (2022-09-29T13:57:18Z)
- Equivariant Disentangled Transformation for Domain Generalization under Combination Shift [91.38796390449504]
Combinations of domains and labels are not observed during training but appear in the test environment.
We provide a unique formulation of the combination shift problem based on the concepts of homomorphism, equivariance, and a refined definition of disentanglement.
arXiv Detail & Related papers (2022-08-03T12:31:31Z)
- Derivation of Learning Rules for Coupled Principal Component Analysis in a Lagrange-Newton Framework [0.0]
We describe a Lagrange-Newton framework for the derivation of learning rules with desirable convergence properties.
A Newton descent is applied to an extended variable vector which also includes Lagrange multipliers introduced with constraints.
The framework produces "coupled" PCA learning rules which simultaneously estimate an eigenvector and the corresponding eigenvalue in cross-coupled differential equations.
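A toy version of such cross-coupled dynamics (a simple illustrative system of my own, not the paper's Lagrange-Newton rule):

```python
import numpy as np

# Cross-coupled estimation of an eigenpair: the eigenvector estimate w
# and the eigenvalue estimate lam evolve together,
#   dw/dt   = C w - lam * w
#   dlam/dt = w.T C w - lam
# whose stable fixed point is the principal unit eigenpair of C.
rng = np.random.default_rng(3)
X = rng.standard_normal((500, 6)) * np.sqrt(np.arange(1.0, 7.0))
C = X.T @ X / len(X)                      # covariance, eigenvalues ~ 1..6

w = rng.standard_normal(6)
w /= np.linalg.norm(w)
lam = w @ C @ w
eta = 0.02                                # Euler step size
for _ in range(3000):
    Cw = C @ w
    w, lam = w + eta * (Cw - lam * w), lam + eta * (w @ Cw - lam)

print(np.isclose(lam, np.linalg.eigvalsh(C)[-1], rtol=1e-3))  # True
print(np.isclose(abs(w @ np.linalg.eigh(C)[1][:, -1]), 1.0))  # aligned, unit norm
```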
arXiv Detail & Related papers (2022-04-28T12:50:11Z)
- AgFlow: Fast Model Selection of Penalized PCA via Implicit Regularization Effects of Gradient Flow [64.81110234990888]
Principal component analysis (PCA) has been widely used as an effective technique for feature extraction and dimension reduction.
In the High Dimension Low Sample Size (HDLSS) setting, one may prefer modified principal components, with penalized loadings.
We propose Approximated Gradient Flow (AgFlow) as a fast model selection method for penalized PCA.
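Loosely, the idea is that early-stopped iterates of a gradient flow act like implicitly $\ell_2$-penalized solutions, so one flow yields a whole path of candidate models. A rough sketch of my own (not the AgFlow algorithm):

```python
import numpy as np

# Run gradient flow on the rank-1 PCA objective 0.5 * w.T C w and treat
# the early-stopped, normalized iterates w(t) as a path of candidate
# (implicitly regularized) loadings, selecting the stopping time on
# held-out data.
rng = np.random.default_rng(4)
X = rng.standard_normal((60, 30))        # HDLSS-flavoured: few samples
X -= X.mean(axis=0)
X_train, X_val = X[:40], X[40:]
C = X_train.T @ X_train / len(X_train)

w = rng.standard_normal(30) * 0.01       # start near zero
eta, path = 0.05, []
for t in range(200):
    w = w + eta * (C @ w)                # Euler step of the gradient flow
    path.append(w / np.linalg.norm(w))   # candidate loading at "time" t

def val_variance(v):                     # held-out variance explained
    return np.linalg.norm(X_val @ v) ** 2

best_t = max(range(len(path)), key=lambda t: val_variance(path[t]))
print(best_t, val_variance(path[best_t]))
```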
arXiv Detail & Related papers (2021-10-07T08:57:46Z)
- GroupifyVAE: from Group-based Definition to VAE-based Unsupervised Representation Disentanglement [91.9003001845855]
VAE-based unsupervised disentanglement cannot be achieved without introducing additional inductive biases.
We address VAE-based unsupervised disentanglement by leveraging the constraints derived from the Group Theory based definition as the non-probabilistic inductive bias.
We train 1800 models covering the most prominent VAE-based models on five datasets to verify the effectiveness of our method.
arXiv Detail & Related papers (2021-02-20T09:49:51Z)
- Understanding Implicit Regularization in Over-Parameterized Single Index Model [55.41685740015095]
We design regularization-free algorithms for the high-dimensional single index model.
We provide theoretical guarantees for the induced implicit regularization phenomenon.
arXiv Detail & Related papers (2020-07-16T13:27:47Z)
- Exponentially Weighted l_2 Regularization Strategy in Constructing Reinforced Second-order Fuzzy Rule-based Model [72.57056258027336]
In the conventional Takagi-Sugeno-Kang (TSK)-type fuzzy models, constant or linear functions are usually utilized as the consequent parts of the fuzzy rules.
We introduce an exponential weight approach inspired by the weight function theory encountered in harmonic analysis.
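One plausible reading of such a strategy (my own illustration, not the paper's exact construction) is a ridge fit of the rule consequents in which each coefficient's $\ell_2$ penalty is weighted exponentially, e.g. in the distance from a rule center:

```python
import numpy as np

def exp_weighted_ridge(Phi, y, d, alpha=1.0, tau=1.0):
    """Ridge regression with an exponentially weighted l2 penalty
    (illustrative): coefficient j is penalized by
    alpha * exp(tau * d[j]) * beta_j**2, so larger d[j] means stronger
    shrinkage.  Closed form: (Phi'Phi + alpha * D)^-1 Phi'y."""
    D = np.diag(alpha * np.exp(tau * np.asarray(d)))
    return np.linalg.solve(Phi.T @ Phi + D, Phi.T @ y)

rng = np.random.default_rng(5)
Phi = rng.standard_normal((100, 4))      # consequent-part design matrix
y = Phi @ np.array([2.0, -1.0, 0.5, 0.0]) + 0.1 * rng.standard_normal(100)
d = np.array([0.0, 0.5, 1.0, 2.0])       # e.g. distances from a rule center
print(exp_weighted_ridge(Phi, y, d))     # later coefficients shrunk harder
```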
arXiv Detail & Related papers (2020-07-02T15:42:15Z)
- Repulsive Mixture Models of Exponential Family PCA for Clustering [127.90219303669006]
The mixture extension of exponential family principal component analysis (EPCA) was designed to encode much more structural information about the data distribution than the traditional EPCA.
The traditional mixture of local EPCAs has the problem of model redundancy, i.e., overlaps among mixing components, which may cause ambiguity for data clustering.
In this paper, a repulsiveness-encouraging prior is introduced among mixing components and a diversified EPCA mixture (DEPCAM) model is developed in the Bayesian framework.
arXiv Detail & Related papers (2020-04-07T04:07:29Z)