A Scalable Approach for Privacy-Preserving Collaborative Machine
Learning
- URL: http://arxiv.org/abs/2011.01963v1
- Date: Tue, 3 Nov 2020 19:09:55 GMT
- Title: A Scalable Approach for Privacy-Preserving Collaborative Machine
Learning
- Authors: Jinhyun So, Basak Guler, A. Salman Avestimehr
- Abstract summary: COPML is a fully decentralized training framework that achieves scalability and privacy protection simultaneously.
We provide the privacy analysis of COPML and prove its convergence.
We experimentally demonstrate that COPML can achieve significant speedup in training over the benchmark protocols.
- Score: 2.578242050187029
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider a collaborative learning scenario in which multiple data owners
wish to jointly train a logistic regression model, while keeping their
individual datasets private from the other parties. We propose COPML, a
fully decentralized training framework that achieves scalability and
privacy protection simultaneously. The key idea of COPML is to securely encode
the individual datasets to distribute the computation load effectively across
many parties and to perform the training computations as well as the model
updates in a distributed manner on the securely encoded data. We provide the
privacy analysis of COPML and prove its convergence. Furthermore, we
experimentally demonstrate that COPML can achieve significant speedup in
training over the benchmark protocols. Our protocol provides strong statistical
privacy guarantees against colluding parties (adversaries) with unbounded
computational power, while achieving up to $16\times$ speedup in the training
time over the benchmark protocols.
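COPML's secure encoding relies on computing over secret-shared, finite-field representations of the data. As background, here is a minimal sketch of Shamir-style secret sharing in Python, the kind of threshold primitive such coded-computation protocols build on; it is an illustration under simplified assumptions, not COPML itself, and all parameters (field size, thresholds) are hypothetical.

```python
# Minimal sketch of threshold secret sharing of the kind COPML-style
# protocols build on (illustrative only; not the authors' implementation).
import random

PRIME = 2**31 - 1  # hypothetical prime field size

def share_secret(value, num_parties, threshold):
    """Shamir-share an integer: any threshold+1 shares reconstruct it,
    while threshold or fewer shares reveal nothing."""
    coeffs = [value] + [random.randrange(PRIME) for _ in range(threshold)]
    shares = []
    for party in range(1, num_parties + 1):
        acc = 0
        for c in reversed(coeffs):          # Horner evaluation of the polynomial
            acc = (acc * party + c) % PRIME
        shares.append((party, acc))
    return shares

def reconstruct(shares):
    """Lagrange interpolation at x = 0 recovers the secret."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret

if __name__ == "__main__":
    shares = share_secret(42, num_parties=5, threshold=2)
    print(reconstruct(shares[:3]))  # any 3 of 5 shares -> 42
```

Any coalition of at most `threshold` parties sees only uniformly random shares, which mirrors the collusion resilience against computationally unbounded adversaries claimed above.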
Related papers
- Information-Theoretic Decentralized Secure Aggregation with Collusion Resilience [98.31540557973179]
We study the problem of decentralized secure aggregation (DSA) from an information-theoretic perspective. We characterize the optimal rate region, which specifies the minimum achievable communication and secret key rates for DSA. Our results establish the fundamental performance limits of DSA, providing insights for the design of provably secure and communication-efficient protocols.
arXiv Detail & Related papers (2025-08-01T12:51:37Z)
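The DSA setting above asks how little communication and shared key material suffice for an aggregator to learn only the sum of user inputs. A standard construction uses pairwise one-time-pad masks that cancel in the total; the toy sketch below (hypothetical names, a single seeded RNG standing in for pre-shared pairwise keys) illustrates the mechanism, not the paper's optimal scheme.

```python
# Toy sketch of key-based secure aggregation: pairwise one-time-pad masks
# cancel in the sum, so the aggregator sees only the total.
import random

MOD = 2**16

def masked_updates(updates, seed=0):
    """User i adds +r_ij for j>i and -r_ij for j<i; masks cancel in the sum."""
    n = len(updates)
    rng = random.Random(seed)  # stands in for pre-shared pairwise keys
    pair_key = {(i, j): rng.randrange(MOD)
                for i in range(n) for j in range(i + 1, n)}
    out = []
    for i, u in enumerate(updates):
        m = u
        for j in range(n):
            if j > i:
                m = (m + pair_key[(i, j)]) % MOD
            elif j < i:
                m = (m - pair_key[(j, i)]) % MOD
        out.append(m)
    return out

if __name__ == "__main__":
    masked = masked_updates([3, 5, 7])
    print(sum(masked) % MOD)  # 15: the sum is recovered, individual values hidden
```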
- PC-MoE: Memory-Efficient and Privacy-Preserving Collaborative Training for Mixture-of-Experts LLMs [56.04036826558497]
We introduce Privacy-preserving Collaborative Mixture-of-Experts (PC-MoE). By design, PC-MoE synergistically combines the strengths of distributed computation with strong confidentiality assurances. It almost matches (and sometimes exceeds) the performance and convergence rate of a fully centralized model and reduces peak GPU RAM usage by nearly 70%, while being fully robust against reconstruction attacks.
arXiv Detail & Related papers (2025-06-03T15:00:18Z)
- Collab: Controlled Decoding using Mixture of Agents for LLM Alignment [90.6117569025754]
Reinforcement learning from human feedback has emerged as an effective technique for aligning large language models.
Controlled Decoding provides a mechanism for aligning a model at inference time without retraining.
We propose a mixture of agent-based decoding strategies leveraging existing off-the-shelf aligned LLM policies.
arXiv Detail & Related papers (2025-03-27T17:34:25Z)
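To make the mixture-of-agents decoding above concrete: at each step, next-token distributions from several off-the-shelf policies can be mixed and a token sampled from the combination. The sketch below is a simplified stand-in with two hypothetical agents; the paper's actual agent-selection and weighting strategy may differ.

```python
# Toy sketch of mixture-of-agents controlled decoding: mix the agents'
# next-token distributions at each step and sample from the result.
import math
import random

def mix_policies(logit_fns, weights, context, vocab_size):
    """Return a categorical distribution mixing the agents' softmaxes."""
    mixed = [0.0] * vocab_size
    for fn, w in zip(logit_fns, weights):
        logits = fn(context)
        z = sum(math.exp(l) for l in logits)
        for t in range(vocab_size):
            mixed[t] += w * math.exp(logits[t]) / z
    return mixed

def decode(logit_fns, weights, vocab_size, steps, seed=0):
    rng, context = random.Random(seed), []
    for _ in range(steps):
        probs = mix_policies(logit_fns, weights, context, vocab_size)
        context.append(rng.choices(range(vocab_size), weights=probs)[0])
    return context

if __name__ == "__main__":
    # two hypothetical "aligned policies" over a 4-token vocabulary
    agent_a = lambda ctx: [2.0, 0.5, 0.1, 0.1]
    agent_b = lambda ctx: [0.1, 0.1, 0.5, 2.0]
    print(decode([agent_a, agent_b], weights=[0.6, 0.4], vocab_size=4, steps=5))
```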
- The Communication-Friendly Privacy-Preserving Machine Learning against Malicious Adversaries [14.232901861974819]
Privacy-preserving machine learning (PPML) is an innovative approach that allows for secure data analysis while safeguarding sensitive information.
We introduce an efficient protocol for secure linear function evaluation.
We extend the protocol to handle linear and non-linear layers, ensuring compatibility with a wide range of machine-learning models.
arXiv Detail & Related papers (2024-11-14T08:55:14Z)
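Secure linear function evaluation is often built from additive secret sharing: because the function is linear, each party can evaluate it entirely on its local shares. The sketch below shows that semi-honest baseline with hypothetical parameters; the paper targets the harder malicious-adversary setting.

```python
# Minimal sketch of secure linear function evaluation via additive secret
# sharing (semi-honest toy version, not the paper's malicious-secure protocol).
import random

MOD = 2**31 - 1

def additive_shares(x, n_parties):
    """Split x into random shares that sum to x mod MOD."""
    shares = [random.randrange(MOD) for _ in range(n_parties - 1)]
    shares.append((x - sum(shares)) % MOD)
    return shares

def linear_eval_on_shares(coeffs, shared_inputs):
    """Each party computes sum_k a_k * x_k on its own shares;
    linearity keeps the whole evaluation local."""
    n_parties = len(shared_inputs[0])
    return [sum(a * xs[p] for a, xs in zip(coeffs, shared_inputs)) % MOD
            for p in range(n_parties)]

if __name__ == "__main__":
    xs = [7, 11]                      # private inputs
    coeffs = [3, 2]                   # public linear function 3*x0 + 2*x1
    shared = [additive_shares(x, 3) for x in xs]
    result_shares = linear_eval_on_shares(coeffs, shared)
    print(sum(result_shares) % MOD)   # 43, without any party seeing x0 or x1
```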
- A Quantization-based Technique for Privacy Preserving Distributed Learning [2.2139875218234475]
We describe a novel, regulation-compliant data protection technique for the distributed training of Machine Learning models.
Our method protects both training data and ML model parameters by employing a protocol based on a quantized multi-hash data representation, Hash-Comb, combined with randomization.
arXiv Detail & Related papers (2024-06-26T14:54:12Z)
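Since the summary gives few details of Hash-Comb, the sketch below is a loose guess at what a quantized multi-hash representation can look like: quantize each value to a grid, then publish several salted digests of the quantized value. Every name and parameter here is an assumption.

```python
# Toy sketch of a quantized multi-hash representation: quantized values are
# exposed only through salted hashes, so equal-enough values can be compared
# without revealing the originals. Illustrative only; not Hash-Comb itself.
import hashlib

def quantize(value, step=0.05):
    """Round a float to a fixed grid so nearby values collide."""
    return round(value / step)

def multi_hash(q, salts):
    """Represent one quantized value by several salted SHA-256 digests."""
    return [hashlib.sha256(f"{salt}:{q}".encode()).hexdigest()[:16]
            for salt in salts]

if __name__ == "__main__":
    salts = ["s1", "s2", "s3"]  # hypothetical shared random salts
    w_a, w_b = 0.123, 0.121     # two nearby model weights
    print(multi_hash(quantize(w_a), salts) == multi_hash(quantize(w_b), salts))  # True
```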
- FewFedPIT: Towards Privacy-preserving and Few-shot Federated Instruction Tuning [54.26614091429253]
Federated instruction tuning (FedIT) is a promising solution that consolidates collaborative training across multiple data owners.
FedIT encounters limitations such as scarcity of instructional data and risk of exposure to training data extraction attacks.
We propose FewFedPIT, designed to simultaneously enhance privacy protection and model performance of federated few-shot learning.
arXiv Detail & Related papers (2024-03-10T08:41:22Z)
- Conformal Prediction for Federated Uncertainty Quantification Under Label Shift [57.54977668978613]
Federated Learning (FL) is a machine learning framework where many clients collaboratively train models.
We develop a new conformal prediction method based on quantile regression that takes privacy constraints into account.
arXiv Detail & Related papers (2023-06-08T11:54:58Z)
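For background, the sketch below shows the standard centralized recipe for conformalizing quantile-regression intervals: compute calibration scores, take their (1-alpha) quantile, and widen the predicted band accordingly. The paper's federated aggregation, label-shift correction, and privacy mechanisms are all omitted.

```python
# Minimal sketch of conformalized quantile regression (CQR-style):
# calibration residuals widen the quantile band to ~1-alpha coverage.
import math

def conformalize(lo_pred, hi_pred, y_cal, alpha=0.1):
    """Compute the CQR correction from calibration data."""
    scores = [max(l - y, y - h) for l, h, y in zip(lo_pred, hi_pred, y_cal)]
    scores.sort()
    k = math.ceil((len(scores) + 1) * (1 - alpha)) - 1
    return scores[min(k, len(scores) - 1)]

if __name__ == "__main__":
    # hypothetical quantile-regression outputs on a calibration set
    lo = [1.0, 2.0, 0.5, 3.0, 1.5]
    hi = [2.0, 3.5, 1.5, 4.5, 2.5]
    y = [1.8, 3.9, 1.0, 2.9, 2.6]
    q = conformalize(lo, hi, y, alpha=0.2)
    lo_new, hi_new = 1.2, 2.2        # test-point quantile predictions
    print((lo_new - q, hi_new + q))  # widened interval with ~80% coverage
```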
- Is Vertical Logistic Regression Privacy-Preserving? A Comprehensive Privacy Analysis and Beyond [57.10914865054868]
We consider vertical logistic regression (VLR) trained with mini-batch gradient descent.
We provide a comprehensive and rigorous privacy analysis of VLR in a class of open-source Federated Learning frameworks.
arXiv Detail & Related papers (2022-07-19T05:47:30Z)
- Byzantine-Robust Federated Learning with Optimal Statistical Rates and Privacy Guarantees [123.0401978870009]
We propose Byzantine-robust federated learning protocols with nearly optimal statistical rates.
We benchmark against competing protocols and show the empirical superiority of the proposed protocols.
Our protocols with bucketing can be naturally combined with privacy-guaranteeing procedures to introduce security against a semi-honest server.
arXiv Detail & Related papers (2022-05-24T04:03:07Z)
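The bucketing mentioned in the Byzantine-robust entry above can be illustrated as follows: client updates are randomly grouped and averaged within buckets, which dilutes Byzantine outliers before a robust aggregator combines the bucket means. The sketch below uses a coordinate-wise median as the robust step, one common choice rather than necessarily the paper's.

```python
# Toy sketch of bucketing for Byzantine-robust aggregation: random buckets
# are averaged first, then a coordinate-wise median combines the bucket means.
import random
import statistics

def bucketed_robust_mean(updates, bucket_size=2, seed=0):
    rng = random.Random(seed)
    rng.shuffle(updates)
    buckets = [updates[i:i + bucket_size]
               for i in range(0, len(updates), bucket_size)]
    bucket_means = [[sum(col) / len(col) for col in zip(*b)] for b in buckets]
    # the median over bucket means resists a minority of outliers
    return [statistics.median(col) for col in zip(*bucket_means)]

if __name__ == "__main__":
    honest = [[1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [1.0, 1.0], [1.05, 0.95]]
    byzantine = [[100.0, -100.0]]  # one malicious client
    print(bucketed_robust_mean(honest + byzantine))  # stays near [1.0, 1.0]
```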
- Stochastic Coded Federated Learning with Convergence and Privacy Guarantees [8.2189389638822]
Federated learning (FL) has attracted much attention as a privacy-preserving distributed machine learning framework.
This paper proposes a stochastic coded federated learning framework (SCFL) to mitigate the straggler issue.
We characterize the privacy guarantee via mutual information differential privacy (MI-DP) and analyze the convergence performance.
arXiv Detail & Related papers (2022-01-25T04:43:29Z)
- Scotch: An Efficient Secure Computation Framework for Secure Aggregation [0.0]
Federated learning enables multiple data owners to jointly train a machine learning model without revealing their private datasets.
A malicious aggregation server might use the model parameters to derive sensitive information about the training dataset used.
We propose Scotch, a decentralized m-party secure-computation framework for federated aggregation.
arXiv Detail & Related papers (2022-01-19T17:16:35Z)
- Cooperative Multi-Agent Actor-Critic for Privacy-Preserving Load Scheduling in a Residential Microgrid [71.17179010567123]
We propose a privacy-preserving multi-agent actor-critic framework where the decentralized actors are trained with distributed critics.
The proposed framework can preserve the privacy of the households while simultaneously learning the multi-agent credit assignment mechanism implicitly.
arXiv Detail & Related papers (2021-10-06T14:05:26Z)
- Weight Divergence Driven Divide-and-Conquer Approach for Optimal Federated Learning from non-IID Data [0.0]
Federated Learning allows models to be trained on data stored across distributed devices without centralizing the training data.
We propose a novel Divide-and-Conquer training methodology that enables the use of the popular FedAvg aggregation algorithm.
arXiv Detail & Related papers (2021-06-28T09:34:20Z)
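For reference, here is a minimal sketch of the FedAvg aggregation step that the divide-and-conquer methodology above builds on: a dataset-size-weighted average of client parameter vectors. The weight-divergence-driven grouping itself is omitted.

```python
# Minimal sketch of FedAvg aggregation: average client weights,
# weighted by the size of each client's local dataset.
def fedavg(client_weights, client_sizes):
    """Weighted average of per-client parameter vectors."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [sum(w[d] * s for w, s in zip(client_weights, client_sizes)) / total
            for d in range(dim)]

if __name__ == "__main__":
    # three hypothetical clients with different amounts of local data
    weights = [[0.2, 1.0], [0.4, 0.8], [0.3, 0.9]]
    sizes = [100, 300, 600]
    print(fedavg(weights, sizes))  # [0.32, 0.88]
```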
- User-Level Privacy-Preserving Federated Learning: Analysis and Performance Optimization [77.43075255745389]
Federated learning (FL) is capable of preserving the private data of mobile terminals (MTs) while training useful models from that data.
From an information-theoretic viewpoint, it is still possible for a curious server to infer private information from the shared models uploaded by MTs.
We propose a user-level differential privacy (UDP) algorithm by adding artificial noise to the shared models before uploading them to servers.
arXiv Detail & Related papers (2020-02-29T10:13:39Z)
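The UDP idea of perturbing models before upload resembles the Gaussian mechanism from differential privacy: clip each user's update to bound its sensitivity, then add noise scaled to that bound. The sketch below is a generic version; the paper's per-user (epsilon, delta) calibration of sigma is more involved, and the parameters here are hypothetical.

```python
# Toy sketch of user-level noise addition before upload: clip the update's
# L2 norm, then add Gaussian noise scaled to the clipping bound.
import math
import random

def privatize_update(update, clip_norm=1.0, sigma=0.5, seed=None):
    rng = random.Random(seed)
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [x * scale + rng.gauss(0.0, sigma * clip_norm) for x in update]

if __name__ == "__main__":
    raw = [0.6, -0.8, 2.0]  # a hypothetical local model update
    print(privatize_update(raw, clip_norm=1.0, sigma=0.5, seed=42))
```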
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.