Optimal vintage factor analysis with deflation varimax
- URL: http://arxiv.org/abs/2310.10545v1
- Date: Mon, 16 Oct 2023 16:14:43 GMT
- Title: Optimal vintage factor analysis with deflation varimax
- Authors: Xin Bing, Dian Jin and Yuqian Zhang
- Abstract summary: Vintage factor analysis aims to first find a low-dimensional representation of the original data, and then to seek a rotation such that the rotated low-dimensional representation is scientifically meaningful.
Perhaps the most widely used vintage factor analysis is Principal Component Analysis (PCA) followed by a varimax rotation.
In this paper, we propose a deflation varimax procedure that solves each row of the orthogonal rotation matrix sequentially.
- Score: 18.50195604586597
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Vintage factor analysis is one important type of factor analysis that aims to
first find a low-dimensional representation of the original data, and then to
seek a rotation such that the rotated low-dimensional representation is
scientifically meaningful. Perhaps the most widely used vintage factor analysis
is Principal Component Analysis (PCA) followed by the varimax rotation.
Despite its popularity, little theoretical guarantee can be provided, mainly
because the varimax rotation requires solving a non-convex optimization over
the set of orthogonal matrices.
In this paper, we propose a deflation varimax procedure that solves each row
of an orthogonal matrix sequentially. In addition to its net computational gain
and flexibility, we are able to fully establish theoretical guarantees for the
proposed procedure in a broad context.
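To make the deflation idea concrete, here is a minimal sketch in Python, assuming a projected-gradient update for each row (the paper's exact iteration, step size, and initialization may differ): each row of the rotation is obtained by ascending the single-column varimax criterion on the unit sphere while staying orthogonal to the rows already found.

```python
import numpy as np

def deflation_varimax(Y, k, step=0.5, n_iter=500, seed=0):
    """Sketch: recover a k x k rotation one row at a time by projected
    gradient ascent of the single-column varimax criterion
        f(u) = mean((Yu)^4) - mean((Yu)^2)^2
    on the unit sphere, deflating (orthogonalizing) against earlier rows."""
    n, _ = Y.shape
    rng = np.random.default_rng(seed)
    R = np.zeros((k, k))
    for q in range(k):
        u = rng.standard_normal(k)
        u -= R[:q].T @ (R[:q] @ u)          # start orthogonal to found rows
        u /= np.linalg.norm(u)
        for _ in range(n_iter):
            z = Y @ u
            g = 4.0 * Y.T @ (z**3 - np.mean(z**2) * z) / n  # gradient of f
            g -= R[:q].T @ (R[:q] @ g)      # deflation: stay orthogonal
            u_new = u + step * g            # ascend, then renormalize
            u_new /= np.linalg.norm(u_new)
            if np.linalg.norm(u_new - u) < 1e-9:
                u = u_new
                break
            u = u_new
        R[q] = u
    return R
```

Here `Y` plays the role of the n x k score matrix produced by the first (PCA) step; solving one unit vector at a time is what gives the procedure its computational gain over optimizing a full orthogonal matrix at once.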
Adopting this new varimax approach as the second step after PCA, we further
analyze this two-step procedure under a general class of factor models. Our
results show that it estimates the factor loading matrix at the optimal rate
when the signal-to-noise ratio (SNR) is moderate or large. In the low SNR
regime, we offer a possible improvement over using PCA and the deflation
procedure when the additive noise under the factor model is structured. The
modified procedure is shown to be optimal in all SNR regimes. Our theory is
valid for finite samples and allows the number of latent factors to grow with
the sample size, as well as the ambient dimension to grow with, or even
exceed, the sample size.
Extensive simulations and real data analysis further corroborate our
theoretical findings.
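As a concrete illustration of the two-step recipe, the sketch below runs PCA and then a varimax rotation on synthetic factor-model data. The classical varimax routine from statsmodels stands in for the deflation variant analyzed in the paper, and all dimensions and loadings are illustrative.

```python
import numpy as np
from sklearn.decomposition import PCA
from statsmodels.multivariate.factor_rotation import rotate_factors

rng = np.random.default_rng(0)
n, p, k = 1000, 50, 3

# synthetic factor model: X = F L^T + noise, with a sparse loading matrix
L = np.zeros((p, k))
L[:15, 0], L[15:30, 1], L[30:45, 2] = 2.0, 2.0, 2.0
F = rng.standard_normal((n, k))
X = F @ L.T + rng.standard_normal((n, p))

# step 1: PCA gives an (unrotated) k-dimensional loading estimate
pca = PCA(n_components=k).fit(X)
A = pca.components_.T * np.sqrt(pca.explained_variance_)   # p x k

# step 2: varimax rotation (the classical version; the paper's deflation
# varimax would replace this call and recover the rotation row by row)
L_hat, T = rotate_factors(A, 'varimax')
print(np.round(L_hat[:5], 2))   # rotated loadings are near-sparse again
```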
Related papers
- Probabilistic Unrolling: Scalable, Inverse-Free Maximum Likelihood
Estimation for Latent Gaussian Models [69.22568644711113]
We introduce probabilistic unrolling, a method that combines Monte Carlo sampling with iterative linear solvers to circumvent matrix inversions.
Our theoretical analyses reveal that unrolling and backpropagation through the iterations of the solver can accelerate gradient estimation for maximum likelihood estimation.
In experiments on simulated and real data, we demonstrate that probabilistic unrolling learns latent Gaussian models up to an order of magnitude faster than gradient EM, with minimal losses in model performance.
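The inverse-free ingredient can be illustrated in isolation: the sketch below combines Hutchinson-style Monte Carlo probes with conjugate-gradient solves to estimate the trace term tr(K^{-1} dK) that appears in Gaussian log-likelihood gradients, without ever forming an inverse. This is a generic illustration of the idea, not the paper's estimator.

```python
import numpy as np
from scipy.sparse.linalg import cg, LinearOperator

def trace_inv_times(K, dK, n_probes=64, seed=0):
    """Monte Carlo + CG estimate of tr(K^{-1} dK): draw Rademacher probes
    z, solve K x = z iteratively, and average the quadratic forms
    x^T dK z.  No inverse or factorization of K is ever formed."""
    n = K.shape[0]
    rng = np.random.default_rng(seed)
    Kop = LinearOperator((n, n), matvec=lambda v: K @ v)
    total = 0.0
    for _ in range(n_probes):
        z = rng.choice([-1.0, 1.0], size=n)   # Rademacher probe vector
        x, _ = cg(Kop, z)                     # iterative solve: K x = z
        total += x @ (dK @ z)                 # z^T K^{-1} dK z
    return total / n_probes

# sanity check against the exact trace on a small SPD matrix
rng = np.random.default_rng(1)
B = rng.standard_normal((50, 50))
K = B @ B.T + 50.0 * np.eye(50)
print(trace_inv_times(K, np.eye(50)), np.trace(np.linalg.inv(K)))
```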
arXiv Detail & Related papers (2023-06-05T21:08:34Z) - Regularization and Variance-Weighted Regression Achieves Minimax
Optimality in Linear MDPs: Theory and Practice [79.48432795639403]
Mirror descent value iteration (MDVI) is an abstraction of Kullback-Leibler (KL) and entropy-regularized reinforcement learning (RL).
We study MDVI with linear function approximation through the sample complexity required to identify an $\varepsilon$-optimal policy.
We present Variance-Weighted Least-Squares MDVI, the first theoretical algorithm that achieves nearly minimax optimal sample complexity for infinite-horizon linear MDPs.
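MDVI itself is an abstraction, but the entropy-regularized backbone it covers can be sketched in the tabular case: value iteration in which the hard max over actions is replaced by a temperature-tau log-sum-exp. The tiny random MDP below is purely illustrative and is not the paper's linear-function-approximation algorithm.

```python
import numpy as np
from scipy.special import logsumexp

def soft_value_iteration(P, r, gamma=0.9, tau=0.1, n_iter=500):
    """Tabular entropy-regularized value iteration: the hard max over
    actions becomes tau * logsumexp(q / tau).  P has shape (A, S, S)
    (one transition kernel per action), r has shape (S, A)."""
    v = np.zeros(r.shape[0])
    for _ in range(n_iter):
        q = r + gamma * np.einsum('ast,t->sa', P, v)  # Bellman backup
        v = tau * logsumexp(q / tau, axis=1)          # soft maximum
    pi = np.exp((q - v[:, None]) / tau)               # softmax policy
    return v, pi / pi.sum(axis=1, keepdims=True)

# tiny random MDP as a smoke test (shapes and values are illustrative)
rng = np.random.default_rng(0)
S, A = 5, 3
P = rng.dirichlet(np.ones(S), size=(A, S))            # rows sum to 1
r = rng.uniform(size=(S, A))
v, pi = soft_value_iteration(P, r)
print(v.round(3), pi.round(2), sep='\n')
```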
arXiv Detail & Related papers (2023-05-22T16:13:05Z) - Optimal Discriminant Analysis in High-Dimensional Latent Factor Models [1.4213973379473654]
In high-dimensional classification problems, a commonly used approach is to first project the high-dimensional features into a lower dimensional space.
We formulate a latent-variable model with a hidden low-dimensional structure to justify this two-step procedure.
We propose a computationally efficient classifier that takes certain principal components (PCs) of the observed features as projections.
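The two-step classifier is easy to mimic with standard tooling: project onto the leading principal components, then fit a linear discriminant in the projected space. The synthetic data and dimensions below are illustrative, and the pipeline is a generic stand-in for the paper's classifier.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split

# synthetic high-dimensional data with a hidden low-dimensional structure
rng = np.random.default_rng(0)
n, p, k = 600, 200, 5
W = rng.standard_normal((p, k))
Z = rng.standard_normal((n, k))
y = (Z[:, 0] > 0).astype(int)               # label depends on one factor
X = Z @ W.T + 0.5 * rng.standard_normal((n, p))

Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
clf = make_pipeline(PCA(n_components=k), LinearDiscriminantAnalysis())
clf.fit(Xtr, ytr)
print("test accuracy:", clf.score(Xte, yte))
```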
arXiv Detail & Related papers (2022-10-23T21:45:53Z) - Sparse high-dimensional linear regression with a partitioned empirical
Bayes ECM algorithm [62.997667081978825]
We propose a computationally efficient and powerful Bayesian approach for sparse high-dimensional linear regression.
Minimal prior assumptions on the parameters are imposed through the use of plug-in empirical Bayes estimates.
The proposed approach is implemented in the R package probe.
arXiv Detail & Related papers (2022-09-16T19:15:50Z) - Generative Principal Component Analysis [47.03792476688768]
We study the problem of principal component analysis with generative modeling assumptions.
The key assumption is that the underlying signal lies near the range of an $L$-Lipschitz continuous generative model with bounded $k$-dimensional inputs.
We propose a quadratic estimator, and show that it enjoys a statistical rate of order $\sqrt{\frac{k \log L}{m}}$, where $m$ is the number of samples.
arXiv Detail & Related papers (2022-03-18T01:48:16Z) - Optimizing Information-theoretical Generalization Bounds via Anisotropic
Noise in SGLD [73.55632827932101]
We optimize the information-theoretical generalization bound by manipulating the noise structure in SGLD.
We prove that, under a constraint guaranteeing low empirical risk, the optimal noise covariance is the square root of the expected gradient covariance.
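A sketch of what that prescription might look like in code, with constants and the empirical-risk constraint glossed over: estimate the minibatch-gradient covariance, take its matrix square root as the noise covariance, and inject correlated Gaussian noise into the SGLD step. The function and names below are illustrative, not the paper's algorithm.

```python
import numpy as np
from scipy.linalg import sqrtm

def sgld_step_anisotropic(theta, grad_fn, batches, eta=1e-3, rng=None):
    """One SGLD step with anisotropic injected noise.  Per the paper's
    prescription, the noise covariance is the (matrix) square root of the
    gradient covariance, estimated here from minibatch gradients."""
    rng = rng or np.random.default_rng()
    grads = np.stack([grad_fn(theta, b) for b in batches])  # (B, d)
    g_bar = grads.mean(axis=0)
    G = np.cov(grads, rowvar=False)     # empirical gradient covariance
    Sigma = np.real(sqrtm(G))           # noise covariance: Sigma = G^{1/2}
    C = np.real(sqrtm(Sigma))           # need Sigma^{1/2} to draw samples
    noise = C @ rng.standard_normal(theta.shape)
    return theta - eta * g_bar + np.sqrt(2 * eta) * noise

# toy usage: gradients of a quadratic loss, perturbed by the batch vector
d = 3
grad_fn = lambda th, b: th + 0.1 * b
batches = [np.random.default_rng(i).standard_normal(d) for i in range(32)]
theta = sgld_step_anisotropic(np.ones(d), grad_fn, batches)
```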
arXiv Detail & Related papers (2021-10-26T15:02:27Z) - Generalized Matrix Factorization: efficient algorithms for fitting
generalized linear latent variable models to large data arrays [62.997667081978825]
Generalized Linear Latent Variable Models (GLLVMs) generalize Gaussian factor models to non-Gaussian responses.
Current algorithms for estimating model parameters in GLLVMs require intensive computation and do not scale to large datasets.
We propose a new approach for fitting GLLVMs to high-dimensional datasets, based on approximating the model using penalized quasi-likelihood.
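The flavor of generalized matrix factorization can be sketched for Poisson responses with a log link: alternate penalized gradient updates on the score and loading matrices of a low-rank natural-parameter matrix. This is a stand-in illustration using plain penalized likelihood, not the paper's penalized quasi-likelihood estimator.

```python
import numpy as np

def fit_poisson_gmf(Y, k=2, lam=1.0, lr=1e-3, n_iter=2000, seed=0):
    """Generalized matrix factorization with a Poisson likelihood and log
    link: Y_ij ~ Poisson(exp((U V^T)_ij)).  Ridge-penalized gradient
    ascent on scores U and loadings V (an illustrative fitting scheme)."""
    n, p = Y.shape
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((n, k))
    V = 0.1 * rng.standard_normal((p, k))
    for _ in range(n_iter):
        M = np.exp(U @ V.T)           # mean matrix under the log link
        R = Y - M                     # score: gradient of loglik in U V^T
        U += lr * (R @ V - lam * U)   # penalized gradient steps
        V += lr * (R.T @ U - lam * V)
    return U, V

rng = np.random.default_rng(1)
U0 = rng.standard_normal((200, 2))
V0 = rng.standard_normal((30, 2))
Y = rng.poisson(np.exp(0.5 * U0 @ V0.T))
U, V = fit_poisson_gmf(Y)
print("rank-2 fit residual:", np.mean((Y - np.exp(U @ V.T))**2))
```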
arXiv Detail & Related papers (2020-10-06T04:28:19Z) - Exact and Approximation Algorithms for Sparse PCA [1.7640556247739623]
This paper proposes two exact mixed-integer SDPs (MISDPs) for sparse PCA (SPCA).
We then analyze the theoretical optimality gaps of their continuous relaxation values and prove that they are stronger than that of the state-of-the-art one.
Since off-the-shelf solvers, in general, have difficulty in solving MISDPs, we approximate SPCA with arbitrary accuracy by a mixed-integer linear program (MILP) of a similar size as MISDPs.
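Off-the-shelf MISDP solvers are scarce, but the exact problem is easy to state, and for tiny instances it can be solved by brute force over supports, which makes a useful correctness oracle: for a fixed support S of size s, the best s-sparse unit vector is the leading eigenvector of the corresponding principal submatrix. This enumeration baseline is a stand-in, not the paper's MISDP/MILP formulation.

```python
import numpy as np
from itertools import combinations

def sparse_pca_bruteforce(Sigma, s):
    """Exact s-sparse PCA by support enumeration (tiny p only):
    maximize x^T Sigma x subject to ||x|| = 1 and ||x||_0 <= s.  Restricted
    to a support S, the optimum is the top eigenpair of Sigma[S, S]."""
    p = Sigma.shape[0]
    best_val, best_x = -np.inf, None
    for S in combinations(range(p), s):
        idx = np.array(S)
        w, V = np.linalg.eigh(Sigma[np.ix_(idx, idx)])  # ascending order
        if w[-1] > best_val:
            best_val = w[-1]
            best_x = np.zeros(p)
            best_x[idx] = V[:, -1]
    return best_val, best_x

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 8))
val, x = sparse_pca_bruteforce(A.T @ A / 100, s=3)
print(val, np.round(x, 2))
```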
arXiv Detail & Related papers (2020-08-28T02:07:08Z) - Multiplicative noise and heavy tails in stochastic optimization [62.993432503309485]
Stochastic optimization is central to modern machine learning, but the precise role of the stochasticity in its success is still unclear.
We show that multiplicative noise, as it commonly arises due to variance in local rates of convergence, results in heavy-tailed stationary behaviour in the parameters.
A detailed analysis is conducted covering key factors, including step size, batch size, and data; the qualitative predictions are shown to hold on state-of-the-art neural network models.
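The core phenomenon is easy to reproduce with a scalar Kesten-type recurrence: multiplicative noise that occasionally expands produces a heavy-tailed stationary distribution even though every input is light-tailed. The parameter choices below are illustrative.

```python
import numpy as np

# Kesten recurrence x_{k+1} = a_k x_k + b_k with E[log a] < 0 but
# P(a > 1) > 0: the stationary law has a power-law (heavy) tail even
# though a and b are built from Gaussian inputs.
rng = np.random.default_rng(0)
n, burn = 200_000, 1_000
a = np.exp(0.5 * rng.standard_normal(n) - 0.2)   # multiplicative noise
b = rng.standard_normal(n)                        # additive noise
x = np.zeros(n)
for k in range(1, n):
    x[k] = a[k] * x[k - 1] + b[k]
x = np.abs(x[burn:])

# widening upper quantiles and an exploding kurtosis proxy signal the tail
print("quantiles (0.9, 0.99, 0.999):",
      np.round(np.quantile(x, [0.9, 0.99, 0.999]), 1))
print("kurtosis proxy:", np.mean(x**4) / np.mean(x**2)**2)
```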
arXiv Detail & Related papers (2020-06-11T09:58:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.