Federated PCA and Estimation for Spiked Covariance Matrices: Optimal Rates and Efficient Algorithm
- URL: http://arxiv.org/abs/2411.15660v1
- Date: Sat, 23 Nov 2024 21:57:50 GMT
- Title: Federated PCA and Estimation for Spiked Covariance Matrices: Optimal Rates and Efficient Algorithm
- Authors: Jingyang Li, T. Tony Cai, Dong Xia, Anru R. Zhang
- Abstract summary: Federated Learning (FL) has gained significant recent attention in machine learning for its enhanced privacy and data security.
This paper investigates federated PCA and estimation for spiked covariance matrices under distributed differential privacy constraints.
We establish minimax rates of convergence, with a key finding that the central server's optimal rate is the harmonic mean of the local clients' minimax rates.
- Score: 19.673557166734977
- License:
- Abstract: Federated Learning (FL) has gained significant recent attention in machine learning for its enhanced privacy and data security, making it indispensable in fields such as healthcare, finance, and personalized services. This paper investigates federated PCA and estimation for spiked covariance matrices under distributed differential privacy constraints. We establish minimax rates of convergence, with a key finding that the central server's optimal rate is the harmonic mean of the local clients' minimax rates. This guarantees consistent estimation at the central server as long as at least one local client provides consistent results. Notably, consistency is maintained even if some local estimators are inconsistent, provided there are enough clients. These findings highlight the robustness and scalability of FL for reliable statistical inference under privacy constraints. To establish minimax lower bounds, we derive a matrix version of van Trees' inequality, which is of independent interest. Furthermore, we propose an efficient algorithm that preserves differential privacy while achieving near-optimal rates at the central server, up to a logarithmic factor. We address significant technical challenges in analyzing this algorithm, which involves a three-layer spectral decomposition. Numerical performance of the proposed algorithm is investigated using both simulated and real data.
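The harmonic-mean claim in the abstract can be illustrated numerically. The sketch below is not the paper's algorithm: the per-client rates are hypothetical numbers, `server_rate` is a name chosen for illustration, and the ordinary harmonic mean is assumed (the abstract does not give the exact normalization). It relies only on the fact that the harmonic mean of m positive numbers is at most m times the smallest one, so a single accurate client keeps the combined rate small.

```python
from statistics import harmonic_mean


def server_rate(local_rates):
    """Harmonic mean of hypothetical per-client error rates (illustrative only)."""
    return harmonic_mean(local_rates)


# Hypothetical local error rates: two poor clients and one accurate client.
rates = [0.9, 0.8, 0.001]
combined = server_rate(rates)

# The harmonic mean never exceeds m * min(rates), so the accurate client dominates.
assert combined <= len(rates) * min(rates)
print(combined)
```

With these made-up inputs the combined value stays within a factor of three of the best client's rate, even though the other two clients are far less accurate.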
Related papers
- Linear-Time User-Level DP-SCO via Robust Statistics [55.350093142673316]
User-level differentially private stochastic convex optimization (DP-SCO) has garnered significant attention due to the importance of safeguarding user privacy in machine learning applications.
Current methods, such as those based on differentially private stochastic gradient descent (DP-SGD), often struggle with high noise accumulation and suboptimal utility.
We introduce a novel linear-time algorithm that leverages robust statistics, specifically the median and trimmed mean, to overcome these challenges.
arXiv Detail & Related papers (2025-02-13T02:05:45Z) - Asynchronous Federated Stochastic Optimization for Heterogeneous Objectives Under Arbitrary Delays [0.0]
Federated learning (FL) was recently proposed to securely train models with data held over multiple locations ("clients").
Two major challenges hindering the performance of FL algorithms are long training times caused by straggling clients, and a decline in model accuracy under non-iid local data distributions ("client drift").
We propose and analyze Asynchronous Exact Averaging (AREA), a new (sub)gradient algorithm that utilizes communication to speed up convergence and enhance scalability, and employs client memory to correct the client drift caused by variations in client update frequencies.
arXiv Detail & Related papers (2024-05-16T14:22:49Z) - Privacy-preserving Federated Primal-dual Learning for Non-convex and Non-smooth Problems with Model Sparsification [51.04894019092156]
Federated learning (FL) has been recognized as a rapidly growing area, where the model is trained over clients under the orchestration of a parameter server (PS).
In this paper, we propose a novel primal-dual algorithm with model sparsification for non-convex and non-smooth FL problems.
Its unique insightful properties and its analyses are also presented.
arXiv Detail & Related papers (2023-10-30T14:15:47Z) - On the Privacy-Robustness-Utility Trilemma in Distributed Learning [7.778461949427662]
We present the first tight analysis of the error incurred by any algorithm ensuring robustness against a fraction of adversarial machines.
Our analysis exhibits a fundamental trade-off between privacy, robustness, and utility.
arXiv Detail & Related papers (2023-02-09T17:24:18Z) - Differentially Private Federated Clustering over Non-IID Data [59.611244450530315]
The federated clustering (FedC) problem aims to accurately partition unlabeled data samples distributed over massive clients into finite clusters under the orchestration of a server.
We propose a novel FedC algorithm using differential privacy convergence technique, referred to as DP-Fed, in which partial participation and multiple clients are also considered.
Various attributes of the proposed DP-Fed are obtained through theoretical analyses of privacy protection, especially for the case of non-identically and independently distributed (non-i.i.d.) data.
arXiv Detail & Related papers (2023-01-03T05:38:43Z) - Differentially Private Decentralized Optimization with Relay Communication [1.2695958417031445]
We introduce a new measure: Privacy Leakage Frequency (PLF), which reveals the relationship between communication and privacy leakage of algorithms.
A novel differentially private decentralized primal-dual algorithm named DP-RECAL is proposed, which takes advantage of the operator splitting method and a relay communication mechanism to incur less PLF.
arXiv Detail & Related papers (2022-12-21T09:05:36Z) - FedSkip: Combatting Statistical Heterogeneity with Federated Skip Aggregation [95.85026305874824]
We introduce a data-driven approach called FedSkip to improve the client optima by periodically skipping federated averaging and scattering local models to the cross devices.
We conduct extensive experiments on a range of datasets to demonstrate that FedSkip achieves much higher accuracy, better aggregation efficiency and competitive communication efficiency.
arXiv Detail & Related papers (2022-12-14T13:57:01Z) - Decentralized Stochastic Optimization with Inherent Privacy Protection [103.62463469366557]
Decentralized optimization is the basic building block of modern collaborative machine learning, distributed estimation and control, and large-scale sensing.
Since the data involved are often sensitive, privacy protection has become an increasingly pressing need in the implementation of decentralized optimization algorithms.
arXiv Detail & Related papers (2022-05-08T14:38:23Z) - Stochastic Coded Federated Learning with Convergence and Privacy Guarantees [8.2189389638822]
Federated learning (FL) has attracted much attention as a privacy-preserving distributed machine learning framework.
This paper proposes a coded federated learning framework, namely stochastic coded federated learning (SCFL), to mitigate the straggler issue.
We characterize the privacy guarantee by the mutual information differential privacy (MI-DP) and analyze the convergence performance in federated learning.
arXiv Detail & Related papers (2022-01-25T04:43:29Z) - Local Learning Matters: Rethinking Data Heterogeneity in Federated Learning [61.488646649045215]
Federated learning (FL) is a promising strategy for performing privacy-preserving, distributed learning with a network of clients (i.e., edge devices).
arXiv Detail & Related papers (2021-11-28T19:03:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.