Related papers: FAST-PCA: A Fast and Exact Algorithm for Distributed Principal Component Analysis

URL: http://arxiv.org/abs/2108.12373v1
Date: Fri, 27 Aug 2021 16:10:59 GMT
Title: FAST-PCA: A Fast and Exact Algorithm for Distributed Principal Component Analysis
Authors: Arpita Gang and Waheed U. Bajwa
Abstract summary: Principal Component Analysis (PCA) is a fundamental data preprocessing tool in the world of machine learning. This paper proposes a distributed PCA algorithm called FAST-PCA (Fast and exAct diSTributed PCA).
Score: 12.91948651812873
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Principal Component Analysis (PCA) is a fundamental data preprocessing tool in the world of machine learning. While PCA is often reduced to dimension reduction, the purpose of PCA is actually two-fold: dimension reduction and feature learning. Furthermore, the enormity of the dimensions and sample sizes of modern-day datasets has rendered centralized PCA solutions unusable. In that vein, this paper reconsiders the problem of PCA when data samples are distributed across nodes in an arbitrarily connected network. While a few solutions for distributed PCA exist, they either overlook the feature learning part of the purpose, incur communication overhead that makes them inefficient, and/or lack exact convergence guarantees. To combat these issues, this paper proposes a distributed PCA algorithm called FAST-PCA (Fast and exAct diSTributed PCA). The proposed algorithm is efficient in terms of communication and can be proved to converge linearly and exactly to the principal components that lead to dimension reduction as well as uncorrelated features. Our claims are further supported by experimental results.
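
The two-fold purpose mentioned in the abstract can be illustrated with a small centralized sketch. This is not the FAST-PCA algorithm and involves no network of nodes; it only shows, under assumed synthetic data, why recovering the eigenvectors themselves matters. Any orthonormal basis of the principal subspace gives the same dimension reduction, but only the eigenvector basis yields uncorrelated features. The helper name pca_basis, the data sizes, and the rotation angle are hypothetical choices made for this illustration.

```python
import numpy as np

def pca_basis(samples, k):
    """Return the top-k covariance eigenvectors (as columns) and the centered data."""
    centered = samples - samples.mean(axis=0)
    cov = centered.T @ centered / (len(samples) - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)        # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]         # indices of the k largest
    return eigvecs[:, order], centered

# Hypothetical data: 500 samples in 5 dimensions with correlated coordinates.
rng = np.random.default_rng(0)
mixing = rng.normal(size=(5, 5))
samples = rng.normal(size=(500, 5)) @ mixing

k = 2
basis, centered = pca_basis(samples, k)

# Features from the eigenvector basis: covariance is (numerically) diagonal.
features = centered @ basis
cov_features = np.cov(features, rowvar=False)

# Same subspace, but an arbitrarily rotated orthonormal basis (assumed angle).
theta = np.pi / 6
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
rotated_basis = basis @ rotation
rotated_features = centered @ rotated_basis
cov_rotated = np.cov(rotated_features, rowvar=False)

off_diag_pca = abs(cov_features[0, 1])
off_diag_rot = abs(cov_rotated[0, 1])

print("off-diagonal covariance, eigenvector basis:", off_diag_pca)
print("off-diagonal covariance, rotated basis:    ", off_diag_rot)
```

Both bases span the same subspace, so they define the same projector and the same reconstruction error. Only the eigenvector basis decorrelates the resulting features, which is the "feature learning" half of the purpose described in the abstract.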

Related papers

Nearly-Linear Time and Streaming Algorithms for Outlier-Robust PCA [43.106438224356175]
We develop a nearly-linear time algorithm for robust PCA with near-optimal error guarantees. We also develop a single-pass streaming algorithm for robust PCA with memory usage nearly-linear in the dimension.
arXiv Detail & Related papers (2023-05-04T04:45:16Z)
Accelerating Wireless Federated Learning via Nesterov's Momentum and Distributed Principle Component Analysis [59.127630388320036]
A wireless federated learning system is investigated by allowing a server and workers to exchange uncoded information via wireless channels. Since the workers frequently upload local updates to the server via bandwidth-limited channels, the uplink transmission from the workers to the server becomes a communication bottleneck. A one-shot distributed principal component analysis (PCA) is leveraged to reduce the dimension of the uploaded data and relieve the communication bottleneck.
arXiv Detail & Related papers (2023-03-31T08:41:42Z)
Efficient fair PCA for fair representation learning [21.990310743597174]
We propose a conceptually simple approach that allows for an analytic solution similar to standard PCA and can be kernelized. Our methods have the same complexity as standard PCA, or kernel PCA, and run much faster than existing methods for fair PCA based on semidefinite programming or manifold optimization.
arXiv Detail & Related papers (2023-02-26T13:34:43Z)
An online algorithm for contrastive Principal Component Analysis [9.090031210111919]
We derive an online algorithm for cPCA* and show that it maps onto a neural network with local learning rules, so it can potentially be implemented in energy efficient neuromorphic hardware. We evaluate the performance of our online algorithm on real datasets and highlight the differences and similarities with the original formulation.
arXiv Detail & Related papers (2022-11-14T19:48:48Z)
Distributed Robust Principal Analysis [0.0]
We study the robust principal component analysis problem in a distributed setting. We propose the first distributed robust principal analysis algorithm based on consensus factorization, dubbed DCF-PCA.
arXiv Detail & Related papers (2022-07-24T05:45:07Z)
AgFlow: Fast Model Selection of Penalized PCA via Implicit Regularization Effects of Gradient Flow [64.81110234990888]
Principal component analysis (PCA) has been widely used as an effective technique for feature extraction and dimension reduction. In the High Dimension Low Sample Size (HDLSS) setting, one may prefer modified principal components, with penalized loadings. We propose Approximated Gradient Flow (AgFlow) as a fast model selection method for penalized PCA.
arXiv Detail & Related papers (2021-10-07T08:57:46Z)
Turning Channel Noise into an Accelerator for Over-the-Air Principal Component Analysis [65.31074639627226]
Principal component analysis (PCA) is a technique for extracting the linear structure of a dataset. We propose the deployment of PCA over a multi-access channel based on the algorithm of gradient descent. Over-the-air aggregation is adopted to reduce the multi-access latency, giving the name over-the-air PCA.
arXiv Detail & Related papers (2021-04-20T16:28:33Z)
Enhanced Principal Component Analysis under A Collaborative-Robust Framework [89.28334359066258]
We introduce a general collaborative-robust weight learning framework that combines weight learning and robust loss in a non-trivial way. Under the proposed framework, only a subset of well-fitting samples is activated, indicating greater importance during training, while the others, whose errors are large, are not ignored. In particular, the negative effects of inactivated samples are alleviated by the robust loss function.
arXiv Detail & Related papers (2021-03-22T15:17:37Z)
A Linearly Convergent Algorithm for Distributed Principal Component Analysis [12.91948651812873]
This paper introduces a feedforward neural network-based one time-scale distributed PCA algorithm termed Distributed Sanger's Algorithm (DSA). The proposed algorithm is shown to converge linearly to a neighborhood of the true solution.
arXiv Detail & Related papers (2021-01-05T00:51:14Z)
Approximation Algorithms for Sparse Principal Component Analysis [57.5357874512594]
Principal component analysis (PCA) is a widely used dimension reduction technique in machine learning and statistics. Various approaches to obtain sparse principal direction loadings have been proposed, which are termed Sparse Principal Component Analysis (SPCA). We present thresholding as a provably accurate, polynomial time, approximation algorithm for the SPCA problem.
arXiv Detail & Related papers (2020-06-23T04:25:36Z)

This list is automatically generated from the titles and abstracts of the papers on this site.