Approximation Algorithms for Sparse Principal Component Analysis
- URL: http://arxiv.org/abs/2006.12748v2
- Date: Fri, 4 Jun 2021 16:25:42 GMT
- Title: Approximation Algorithms for Sparse Principal Component Analysis
- Authors: Agniva Chowdhury, Petros Drineas, David P. Woodruff, Samson Zhou
- Abstract summary: Principal component analysis (PCA) is a widely used dimension reduction technique in machine learning and statistics.
Various approaches to obtain sparse principal direction loadings have been proposed, which are termed Sparse Principal Component Analysis.
We present thresholding as a provably accurate, polynomial time, approximation algorithm for the SPCA problem.
- Score: 57.5357874512594
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Principal component analysis (PCA) is a widely used dimension reduction
technique in machine learning and multivariate statistics. To improve the
interpretability of PCA, various approaches to obtain sparse principal
direction loadings have been proposed, which are termed Sparse Principal
Component Analysis (SPCA). In this paper, we present thresholding as a provably
accurate, polynomial time, approximation algorithm for the SPCA problem,
without imposing any restrictive assumptions on the input covariance matrix.
Our first thresholding algorithm using the Singular Value Decomposition is
conceptually simple; is faster than current state-of-the-art; and performs well
in practice. On the negative side, our (novel) theoretical bounds do not
accurately predict the strong practical performance of this approach. The
second algorithm solves a well-known semidefinite programming relaxation and
then uses a novel, two step, deterministic thresholding scheme to compute a
sparse principal vector. It works very well in practice and, remarkably, this
solid practical performance is accurately predicted by our theoretical bounds,
which bridge the theory-practice gap better than current state-of-the-art.
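As a rough illustration of the first approach described in the abstract, the sketch below computes the leading eigenvector of the covariance matrix and keeps only its k largest-magnitude loadings. This is a minimal, hypothetical example: the function name `svd_threshold_spca`, the toy data, and the simple magnitude-thresholding rule are assumptions for illustration, not the authors' exact algorithm or the scheme behind their theoretical bounds.
```python
import numpy as np

def svd_threshold_spca(cov, k):
    """Toy sketch of an SVD/eigenvector-thresholding heuristic for SPCA.

    cov : (n, n) symmetric positive semidefinite covariance matrix
    k   : target sparsity (number of nonzero loadings)

    Illustrative only; the paper's algorithm and its approximation
    guarantees rely on an analysis this simple variant does not reproduce.
    """
    # Leading eigenvector of the covariance matrix (the dense PCA direction).
    eigvals, eigvecs = np.linalg.eigh(cov)
    v = eigvecs[:, -1]

    # Keep the k largest-magnitude coordinates and zero out the rest.
    support = np.argsort(np.abs(v))[-k:]
    x = np.zeros_like(v)
    x[support] = v[support]

    # Renormalize so the sparse loading vector has unit norm.
    return x / np.linalg.norm(x)

# Toy usage: compare sparse explained variance x^T A x with the top eigenvalue.
rng = np.random.default_rng(0)
B = rng.standard_normal((200, 50))
A = B.T @ B / 200.0
x = svd_threshold_spca(A, k=10)
print(float(x @ A @ x), float(np.linalg.eigvalsh(A)[-1]))
```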
Related papers
- From Optimization to Control: Quasi Policy Iteration [3.4376560669160394]
We introduce a novel control algorithm, coined quasi-policy iteration (QPI).
QPI is based on a novel approximation of the "Hessian" matrix in the policy iteration algorithm by exploiting two linear structural constraints specific to MDPs.
It exhibits an empirical convergence behavior similar to policy iteration with a very low sensitivity to the discount factor.
arXiv Detail & Related papers (2023-11-18T21:00:14Z) - Asymptotically Unbiased Instance-wise Regularized Partial AUC
Optimization: Theory and Algorithm [101.44676036551537]
One-way Partial AUC (OPAUC) and Two-way Partial AUC (TPAUC) measure the average performance of a binary classifier.
Most of the existing methods could only optimize PAUC approximately, leading to inevitable biases that are not controllable.
We present a simpler reformulation of the PAUC problem via distributionally robust optimization (DRO).
arXiv Detail & Related papers (2022-10-08T08:26:22Z) - Making Linear MDPs Practical via Contrastive Representation Learning [101.75885788118131]
It is common to address the curse of dimensionality in Markov decision processes (MDPs) by exploiting low-rank representations.
We consider an alternative definition of linear MDPs that automatically ensures normalization while allowing efficient representation learning.
We demonstrate superior performance over existing state-of-the-art model-based and model-free algorithms on several benchmarks.
arXiv Detail & Related papers (2022-07-14T18:18:02Z) - Outlier-Robust Sparse Estimation via Non-Convex Optimization [73.18654719887205]
We explore the connection between outlier-robust high-dimensional statistics and non-convex optimization in the presence of sparsity constraints.
We develop novel and simple optimization formulations for these problems.
As a corollary, we obtain that any first-order method that efficiently converges to stationarity yields an efficient algorithm for these tasks.
arXiv Detail & Related papers (2021-09-23T17:38:24Z) - Dual Optimization for Kolmogorov Model Learning Using Enhanced Gradient
Descent [8.714458129632158]
Kolmogorov model (KM) is an interpretable and predictable representation approach to learning the underlying probabilistic structure of a set of random variables.
We propose a computationally scalable KM learning algorithm, based on the regularized dual optimization combined with enhanced gradient descent (GD) method.
It is shown that the accuracy of logical relation mining for interpretability by using the proposed KM learning algorithm exceeds $80\%$.
arXiv Detail & Related papers (2021-07-11T10:33:02Z) - Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality [131.45028999325797]
We develop a doubly robust off-policy AC (DR-Off-PAC) for discounted MDP.
DR-Off-PAC adopts a single timescale structure, in which both actor and critics are updated simultaneously with constant stepsize.
We study the finite-time convergence rate and characterize the sample complexity for DR-Off-PAC to attain an $\epsilon$-accurate optimal policy.
arXiv Detail & Related papers (2021-02-23T18:56:13Z) - Robust and Heavy-Tailed Mean Estimation Made Simple, via Regret
Minimization [21.64869023699999]
We study the problem of estimating the mean of a distribution in high dimensions when either the samples are adversarially corrupted or the distribution is heavy-tailed.
We provide a meta-problem and a duality theorem that lead to a new unified view on robust and heavy-tailed mean estimation in high dimensions.
arXiv Detail & Related papers (2020-07-31T04:18:32Z) - Fast Objective & Duality Gap Convergence for Non-Convex Strongly-Concave
Min-Max Problems with PL Condition [52.08417569774822]
This paper focuses on methods for solving smooth non-convex strongly-concave min-max problems, which have received increasing attention due to deep learning applications (e.g., deep AUC maximization).
arXiv Detail & Related papers (2020-06-12T00:32:21Z) - On the Convergence of the Dynamic Inner PCA Algorithm [5.9931120596636935]
DiPCA is a powerful method for the analysis of time-dependent data.
We show that DiPCA is a specialized variant of a coordinate decomposition algorithm.
We compare the performance of the decomposition strategies with that of an off-the-shelf solver.
arXiv Detail & Related papers (2020-03-12T17:50:34Z) - Adaptive Approximate Policy Iteration [22.915651391812187]
We present a learning scheme which enjoys a $\tilde{O}(T^{2/3})$ regret bound for undiscounted, continuing learning in uniformly ergodic MDPs.
This is an improvement over the best existing bound of $\tilde{O}(T^{3/4})$ for the average-reward case with function approximation.
arXiv Detail & Related papers (2020-02-08T02:27:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.