Private Heterogeneous Federated Learning Without a Trusted Server Revisited: Error-Optimal and Communication-Efficient Algorithms for Convex Losses
- URL: http://arxiv.org/abs/2407.09690v3
- Date: Fri, 6 Sep 2024 16:29:25 GMT
- Title: Private Heterogeneous Federated Learning Without a Trusted Server Revisited: Error-Optimal and Communication-Efficient Algorithms for Convex Losses
- Authors: Changyu Gao, Andrew Lowy, Xingyu Zhou, Stephen J. Wright,
- Abstract summary: Inter-Silo Record-Level Differential Privacy (ISRL-DP) prevents each silo's data from being leaked.
We provide novel ISRL-DP FL algorithms that achieve the optimal excess risk bounds in the presence of heterogeneous silo data.
- Score: 12.620782629498812
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We revisit the problem of federated learning (FL) with private data from people who do not trust the server or other silos/clients. In this context, every silo (e.g. hospital) has data from several people (e.g. patients) and needs to protect the privacy of each person's data (e.g. health records), even if the server and/or other silos try to uncover this data. Inter-Silo Record-Level Differential Privacy (ISRL-DP) prevents each silo's data from being leaked, by requiring that silo i's communications satisfy item-level differential privacy. Prior work arXiv:2106.09779 characterized the optimal excess risk bounds for ISRL-DP algorithms with homogeneous (i.i.d.) silo data and convex loss functions. However, two important questions were left open: (1) Can the same excess risk bounds be achieved with heterogeneous (non-i.i.d.) silo data? (2) Can the optimal risk bounds be achieved with fewer communication rounds? In this paper, we give positive answers to both questions. We provide novel ISRL-DP FL algorithms that achieve the optimal excess risk bounds in the presence of heterogeneous silo data. Moreover, our algorithms are more communication-efficient than the prior state-of-the-art. For smooth loss functions, our algorithm achieves the optimal excess risk bound and has communication complexity that matches the non-private lower bound. Additionally, our algorithms are more computationally efficient than the previous state-of-the-art.
Related papers
- Efficient Sparse Least Absolute Deviation Regression with Differential
Privacy [10.414082115202003]
We develop a fast privacy-preserving learning solution for a sparse robust regression problem.
Our algorithm achieves a fast estimation by reformulating the sparse LAD problem as a penalized least square estimation problem.
We show that our algorithm can achieve better privacy and statistical accuracy trade-off compared with the state-of-the-art privacy-preserving regression algorithms.
arXiv Detail & Related papers (2024-01-02T17:13:34Z) - Protect Your Score: Contact Tracing With Differential Privacy Guarantees [68.53998103087508]
We argue that privacy concerns currently hold deployment back.
We propose a contact tracing algorithm with differential privacy guarantees against this attack.
Especially for realistic test scenarios, we achieve a two to ten-fold reduction in the infection rate of the virus.
arXiv Detail & Related papers (2023-12-18T11:16:33Z) - On Differentially Private Federated Linear Contextual Bandits [9.51828574518325]
We consider cross-silo federated linear contextual bandit (LCB) problem under differential privacy.
We identify three issues in the state-of-the-art: (i) failure of claimed privacy protection and (ii) incorrect regret bound due to noise miscalculation.
We show that our algorithm can achieve nearly optimal'' regret without a trusted server.
arXiv Detail & Related papers (2023-02-27T16:47:49Z) - On Differential Privacy and Adaptive Data Analysis with Bounded Space [76.10334958368618]
We study the space complexity of the two related fields of differential privacy and adaptive data analysis.
We show that there exists a problem P that requires exponentially more space to be solved efficiently with differential privacy.
The line of work on adaptive data analysis focuses on understanding the number of samples needed for answering a sequence of adaptive queries.
arXiv Detail & Related papers (2023-02-11T14:45:31Z) - Smooth Anonymity for Sparse Graphs [69.1048938123063]
differential privacy has emerged as the gold standard of privacy, however, when it comes to sharing sparse datasets.
In this work, we consider a variation of $k$-anonymity, which we call smooth-$k$-anonymity, and design simple large-scale algorithms that efficiently provide smooth-$k$-anonymity.
arXiv Detail & Related papers (2022-07-13T17:09:25Z) - Federated Offline Reinforcement Learning [55.326673977320574]
We propose a multi-site Markov decision process model that allows for both homogeneous and heterogeneous effects across sites.
We design the first federated policy optimization algorithm for offline RL with sample complexity.
We give a theoretical guarantee for the proposed algorithm, where the suboptimality for the learned policies is comparable to the rate as if data is not distributed.
arXiv Detail & Related papers (2022-06-11T18:03:26Z) - Private Non-Convex Federated Learning Without a Trusted Server [7.971065005161566]
We propose novel algorithms for cross-silo learning (FL) with non-trusted loss functions and data from people who do not trust other silos.
Our algorithms attain the optimal strongly convex, homogeneous (i.i.d.) for ISRL-DP FL without assuming convexity or i.i.d. data.
Numerical experiments show that our algorithm has better accuracy than baselines for most privacy levels.
arXiv Detail & Related papers (2022-03-13T19:17:15Z) - Private Federated Learning Without a Trusted Server: Optimal Algorithms
for Convex Losses [9.416757363901295]
Inter-Silo Record-Level Differential Privacy (ISRL-DP)
Inter-Silo Record-Level Differential Privacy (ISRL-DP)
arXiv Detail & Related papers (2021-06-17T19:41:23Z) - Lossless Compression of Efficient Private Local Randomizers [55.657133416044104]
Locally Differentially Private (LDP) Reports are commonly used for collection of statistics and machine learning in the federated setting.
In many cases the best known LDP algorithms require sending prohibitively large messages from the client device to the server.
This has led to significant efforts on reducing the communication cost of LDP algorithms.
arXiv Detail & Related papers (2021-02-24T07:04:30Z) - Robust and Differentially Private Mean Estimation [40.323756738056616]
Differential privacy has emerged as a standard requirement in a variety of applications ranging from the U.S. Census to data collected in commercial devices.
An increasing number of such databases consist of data from multiple sources, not all of which can be trusted.
This leaves existing private analyses vulnerable to attacks by an adversary who injects corrupted data.
arXiv Detail & Related papers (2021-02-18T05:02:49Z) - Faster Secure Data Mining via Distributed Homomorphic Encryption [108.77460689459247]
Homomorphic Encryption (HE) is receiving more and more attention recently for its capability to do computations over the encrypted field.
We propose a novel general distributed HE-based data mining framework towards one step of solving the scaling problem.
We verify the efficiency and effectiveness of our new framework by testing over various data mining algorithms and benchmark data-sets.
arXiv Detail & Related papers (2020-06-17T18:14:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.