Distributed Robust Principal Analysis
- URL: http://arxiv.org/abs/2207.11669v1
- Date: Sun, 24 Jul 2022 05:45:07 GMT
- Title: Distributed Robust Principal Analysis
- Authors: Wenda Chu
- Abstract summary: We study the robust principal component analysis problem in a distributed setting.
We propose the first distributed robust principal analysis algorithm based on consensus factorization, dubbed DCF-PCA.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study the robust principal component analysis (RPCA) problem in a
distributed setting. The goal of RPCA is to find an underlying low-rank
estimation for a raw data matrix when the data matrix is subject to the
corruption of gross sparse errors. Previous studies have developed RPCA
algorithms that provide stable solutions with fast convergence. However, these
algorithms are typically hard to scale and cannot be implemented distributedly,
due to the use of either SVD or large matrix multiplication. In this paper, we
propose the first distributed robust principal analysis algorithm based on
consensus factorization, dubbed DCF-PCA. We prove the convergence of DCF-PCA
and evaluate DCF-PCA on various problem setting
Related papers
- Stability and Generalization of the Decentralized Stochastic Gradient
Descent Ascent Algorithm [80.94861441583275]
We investigate the complexity of the generalization bound of the decentralized gradient descent (D-SGDA) algorithm.
Our results analyze the impact of different top factors on the generalization of D-SGDA.
We also balance it with the generalization to obtain the optimal convex-concave setting.
arXiv Detail & Related papers (2023-10-31T11:27:01Z) - Exploring the Algorithm-Dependent Generalization of AUPRC Optimization
with List Stability [107.65337427333064]
optimization of the Area Under the Precision-Recall Curve (AUPRC) is a crucial problem for machine learning.
In this work, we present the first trial in the single-dependent generalization of AUPRC optimization.
Experiments on three image retrieval datasets on speak to the effectiveness and soundness of our framework.
arXiv Detail & Related papers (2022-09-27T09:06:37Z) - Domain Adaptation Principal Component Analysis: base linear method for
learning with out-of-distribution data [55.41644538483948]
Domain adaptation is a popular paradigm in modern machine learning.
We present a method called Domain Adaptation Principal Component Analysis (DAPCA)
DAPCA finds a linear reduced data representation useful for solving the domain adaptation task.
arXiv Detail & Related papers (2022-08-28T21:10:56Z) - Robust factored principal component analysis for matrix-valued outlier
accommodation and detection [4.228971753938522]
Factored PCA (FPCA) is a probabilistic extension of PCA for matrix data.
We propose a robust extension of FPCA (RFPCA) for matrix data.
RFPCA can adaptively down-weight outliers and yield robust estimates.
arXiv Detail & Related papers (2021-12-13T16:12:22Z) - FAST-PCA: A Fast and Exact Algorithm for Distributed Principal Component
Analysis [12.91948651812873]
Principal Component Analysis (PCA) is a fundamental data preprocessing tool in the world of machine learning.
This paper proposes a distributed PCA algorithm called FAST-PCA (Fast and exAct diSTributed PCA)
arXiv Detail & Related papers (2021-08-27T16:10:59Z) - Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality [131.45028999325797]
We develop a doubly robust off-policy AC (DR-Off-PAC) for discounted MDP.
DR-Off-PAC adopts a single timescale structure, in which both actor and critics are updated simultaneously with constant stepsize.
We study the finite-time convergence rate and characterize the sample complexity for DR-Off-PAC to attain an $epsilon$-accurate optimal policy.
arXiv Detail & Related papers (2021-02-23T18:56:13Z) - Federated Deep AUC Maximization for Heterogeneous Data with a Constant
Communication Complexity [77.78624443410216]
We propose improved FDAM algorithms for detecting heterogeneous chest data.
A result of this paper is that the communication of the proposed algorithm is strongly independent of the number of machines and also independent of the accuracy level.
Experiments have demonstrated the effectiveness of our FDAM algorithm on benchmark datasets and on medical chest Xray images from different organizations.
arXiv Detail & Related papers (2021-02-09T04:05:19Z) - A Linearly Convergent Algorithm for Distributed Principal Component
Analysis [12.91948651812873]
This paper introduces a feedforward neural network-based one time-scale distributed PCA algorithm termed Distributed Sanger's Algorithm (DSA)
The proposed algorithm is shown to converge linearly to a neighborhood of the true solution.
arXiv Detail & Related papers (2021-01-05T00:51:14Z) - Exact and Approximation Algorithms for Sparse PCA [1.7640556247739623]
This paper proposes two exact mixed-integer SDPs (MISDPs)
We then analyze the theoretical optimality gaps of their continuous relaxation values and prove that they are stronger than that of the state-of-art one.
Since off-the-shelf solvers, in general, have difficulty in solving MISDPs, we approximate SPCA with arbitrary accuracy by a mixed-integer linear program (MILP) of a similar size as MISDPs.
arXiv Detail & Related papers (2020-08-28T02:07:08Z) - Approximation Algorithms for Sparse Principal Component Analysis [57.5357874512594]
Principal component analysis (PCA) is a widely used dimension reduction technique in machine learning and statistics.
Various approaches to obtain sparse principal direction loadings have been proposed, which are termed Sparse Principal Component Analysis.
We present thresholding as a provably accurate, time, approximation algorithm for the SPCA problem.
arXiv Detail & Related papers (2020-06-23T04:25:36Z) - One-shot Distibuted Algorithm for PCA with RBF Kernels [23.266613551011638]
Our algorithm is inspired by the dual relationship between sample-distributed and feature-distributed scenario.
In theoretical part, we analyze the approximation error for both linear and RBF kernels.
The result suggests that when eigenvalues decay fast, the proposed algorithm gives high quality results with low communication cost.
arXiv Detail & Related papers (2020-05-06T09:07:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.