Pencil: Private and Extensible Collaborative Learning without the Non-Colluding Assumption
- URL: http://arxiv.org/abs/2403.11166v1
- Date: Sun, 17 Mar 2024 10:26:41 GMT
- Title: Pencil: Private and Extensible Collaborative Learning without the Non-Colluding Assumption
- Authors: Xuanqi Liu, Zhuotao Liu, Qi Li, Ke Xu, Mingwei Xu
- Abstract summary: Pencil is the first private training framework for collaborative learning that simultaneously offers data privacy, model privacy, and extensibility to multiple data providers.
We introduce several novel cryptographic protocols to realize this design principle and conduct a rigorous security and privacy analysis.
Pencil achieves 10 ~ 260x higher throughput and 2 orders of magnitude less communication than prior art.
- Score: 24.339382371386876
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The escalating focus on data privacy poses significant challenges for collaborative neural network training, where data ownership and model training/deployment responsibilities reside with distinct entities. Our community has made substantial contributions to addressing this challenge, proposing various approaches such as federated learning (FL) and privacy-preserving machine learning based on cryptographic constructs like homomorphic encryption (HE) and secure multiparty computation (MPC). However, FL completely overlooks model privacy, and HE has limited extensibility (confined to only one data provider). While the state-of-the-art MPC frameworks provide reasonable throughput and simultaneously ensure model/data privacy, they rely on a critical non-colluding assumption on the computing servers, and relaxing this assumption is still an open problem. In this paper, we present Pencil, the first private training framework for collaborative learning that simultaneously offers data privacy, model privacy, and extensibility to multiple data providers, without relying on the non-colluding assumption. Our fundamental design principle is to construct the n-party collaborative training protocol based on an efficient two-party protocol, while ensuring that switching to different data providers during model training introduces no extra cost. We introduce several novel cryptographic protocols to realize this design principle and conduct a rigorous security and privacy analysis. Our comprehensive evaluations of Pencil demonstrate that (i) models trained in plaintext and models trained privately using Pencil exhibit nearly identical test accuracies; (ii) the training overhead of Pencil is greatly reduced: Pencil achieves 10 ~ 260x higher throughput and 2 orders of magnitude less communication than prior art; (iii) Pencil is resilient against both existing and adaptive (white-box) attacks.
Related papers
- Federated Face Forgery Detection Learning with Personalized Representation [63.90408023506508]
Deep generator technology can produce high-quality fake videos that are indistinguishable, posing a serious social threat.
Traditional forgery detection methods train directly on centralized data.
The paper proposes a novel federated face forgery detection learning with personalized representation.
arXiv Detail & Related papers (2024-06-17T02:20:30Z)
- FewFedPIT: Towards Privacy-preserving and Few-shot Federated Instruction Tuning [54.26614091429253]
Federated instruction tuning (FedIT) is a promising solution, by consolidating collaborative training across multiple data owners.
FedIT encounters limitations such as scarcity of instructional data and risk of exposure to training data extraction attacks.
We propose FewFedPIT, designed to simultaneously enhance privacy protection and model performance of federated few-shot learning.
arXiv Detail & Related papers (2024-03-10T08:41:22Z)
- Independent Distribution Regularization for Private Graph Embedding [55.24441467292359]
Graph embeddings are susceptible to attribute inference attacks, which allow attackers to infer private node attributes from the learned graph embeddings.
To address these concerns, privacy-preserving graph embedding methods have emerged.
We propose a novel approach called Private Variational Graph AutoEncoders (PVGAE) with the aid of independent distribution penalty as a regularization term.
arXiv Detail & Related papers (2023-08-16T13:32:43Z)
- When approximate design for fast homomorphic computation provides differential privacy guarantees [0.08399688944263842]
Differential privacy (DP) and cryptographic primitives are popular countermeasures against privacy attacks.
In this paper, we design SHIELD, a probabilistic approximation algorithm for the argmax operator.
Even if SHIELD could have other applications, we here focus on one setting and seamlessly integrate it in the SPEED collaborative training framework.
arXiv Detail & Related papers (2023-04-06T09:38:01Z)
- Personalizing Federated Learning with Over-the-Air Computations [84.8089761800994]
Federated edge learning is a promising technology to deploy intelligence at the edge of wireless networks in a privacy-preserving manner.
Under such a setting, multiple clients collaboratively train a global generic model under the coordination of an edge server.
This paper presents a distributed training paradigm that employs analog over-the-air computation to address the communication bottleneck.
arXiv Detail & Related papers (2023-02-24T08:41:19Z)
- Protecting Data from all Parties: Combining FHE and DP in Federated Learning [0.09176056742068812]
We propose a secure framework addressing an extended threat model with respect to privacy of the training data.
The proposed framework protects the privacy of the training data from all participants, namely the training data owners and an aggregating server.
By means of a novel quantization operator, we prove differential privacy guarantees in a context where the noise is quantified and bounded due to the use of homomorphic encryption.
arXiv Detail & Related papers (2022-05-09T14:33:44Z)
- SF-PATE: Scalable, Fair, and Private Aggregation of Teacher Ensembles [50.90773979394264]
This paper studies a model that protects the privacy of individuals' sensitive information while also allowing it to learn non-discriminatory predictors.
A key characteristic of the proposed model is that it enables the adoption of off-the-shelf, non-private fair models to create a privacy-preserving and fair model.
arXiv Detail & Related papers (2022-04-11T14:42:54Z)
- Efficient Differentially Private Secure Aggregation for Federated Learning via Hardness of Learning with Errors [1.4680035572775534]
Federated machine learning leverages edge computing to develop models from network user data.
Privacy in federated learning remains a major challenge.
Recent advances in secure aggregation using multiparty computation eliminate the need for a third party.
We present a new federated learning protocol that leverages a novel differentially private, malicious secure aggregation protocol.
arXiv Detail & Related papers (2021-12-13T18:31:08Z)
- On Deep Learning with Label Differential Privacy [54.45348348861426]
We study the multi-class classification setting where the labels are considered sensitive and ought to be protected.
We propose a new algorithm for training deep neural networks with label differential privacy, and run evaluations on several datasets.
arXiv Detail & Related papers (2021-02-11T15:09:06Z)
- Reliability Check via Weight Similarity in Privacy-Preserving Multi-Party Machine Learning [7.552100672006174]
We focus on addressing the concerns of data privacy, model privacy, and data quality associated with multi-party machine learning.
We present a scheme for privacy-preserving collaborative learning that checks the participants' data quality while guaranteeing data and model privacy.
arXiv Detail & Related papers (2021-01-14T08:55:42Z)
- SPEED: Secure, PrivatE, and Efficient Deep learning [2.283665431721732]
We introduce a deep learning framework able to deal with strong privacy constraints.
Based on collaborative learning, differential privacy, and homomorphic encryption, the proposed approach advances the state of the art.
arXiv Detail & Related papers (2020-06-16T19:31:52Z)