CaPC Learning: Confidential and Private Collaborative Learning
- URL: http://arxiv.org/abs/2102.05188v1
- Date: Tue, 9 Feb 2021 23:50:24 GMT
- Title: CaPC Learning: Confidential and Private Collaborative Learning
- Authors: Christopher A. Choquette-Choo, Natalie Dullerud, Adam Dziedzic,
Yunxiang Zhang, Somesh Jha, Nicolas Papernot, Xiao Wang
- Abstract summary: We introduce Confidential and Private Collaborative (CaPC) learning, the first method provably achieving both confidentiality and privacy in a collaborative setting.
We demonstrate how CaPC allows participants to collaborate without having to explicitly join their training sets or train a central model.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning benefits from large training datasets, which may not always
be possible to collect by any single entity, especially when using
privacy-sensitive data. In many contexts, such as healthcare and finance,
separate parties may wish to collaborate and learn from each other's data but
are prevented from doing so due to privacy regulations. Some regulations
prevent explicit sharing of data between parties by joining datasets in a
central location (confidentiality). Others also limit implicit sharing of data,
e.g., through model predictions (privacy). There is currently no method that
enables machine learning in such a setting, where both confidentiality and
privacy need to be preserved, to prevent both explicit and implicit sharing of
data. Federated learning only provides confidentiality, not privacy, since
gradients shared still contain private information. Differentially private
learning assumes unreasonably large datasets. Furthermore, both of these
learning paradigms produce a central model whose architecture was previously
agreed upon by all parties rather than enabling collaborative learning where
each party learns and improves their own local model. We introduce Confidential
and Private Collaborative (CaPC) learning, the first method provably achieving
both confidentiality and privacy in a collaborative setting. We leverage secure
multi-party computation (MPC), homomorphic encryption (HE), and other
techniques in combination with privately aggregated teacher models. We
demonstrate how CaPC allows participants to collaborate without having to
explicitly join their training sets or train a central model. Each party is
able to improve the accuracy and fairness of their model, even in settings
where each party has a model that performs well on their own dataset or when
datasets are not IID and model architectures are heterogeneous across parties.
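CaPC combines MPC and HE with privately aggregated teacher models in the spirit of PATE. As a plaintext illustration only (the real protocol runs under encryption; the function and parameter names here are hypothetical), the noisy-argmax vote aggregation at the core of such schemes can be sketched as:

```python
import numpy as np

def noisy_labeling(teacher_preds, num_classes, noise_scale, rng):
    """PATE-style noisy aggregation (plaintext sketch).

    teacher_preds: per-teacher predicted labels for one query.
    Returns the argmax of the vote histogram after adding Gaussian noise;
    the noise is what differential-privacy analyses of PATE account for.
    """
    votes = np.bincount(teacher_preds, minlength=num_classes).astype(float)
    votes += rng.normal(0.0, noise_scale, size=num_classes)  # DP noise
    return int(np.argmax(votes))

# Example: 7 of 10 hypothetical teacher models agree on class 2,
# so the noisy vote still returns 2 with overwhelming probability.
rng = np.random.default_rng(0)
preds = np.array([2, 2, 2, 2, 2, 2, 2, 0, 1, 1])
label = noisy_labeling(preds, num_classes=3, noise_scale=0.5, rng=rng)
```

In CaPC the querying party never sees the individual teacher votes; the aggregation above happens inside the secure computation, and only the noisy label is revealed.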
Related papers
- Towards Split Learning-based Privacy-Preserving Record Linkage [49.1574468325115]
Split Learning has been introduced to facilitate applications where user data privacy is a requirement.
In this paper, we investigate the potential of Split Learning for privacy-preserving record matching.
arXiv Detail & Related papers (2024-09-02T09:17:05Z)
- FewFedPIT: Towards Privacy-preserving and Few-shot Federated Instruction Tuning [54.26614091429253]
Federated instruction tuning (FedIT) is a promising solution, by consolidating collaborative training across multiple data owners.
FedIT encounters limitations such as scarcity of instructional data and risk of exposure to training data extraction attacks.
We propose FewFedPIT, designed to simultaneously enhance privacy protection and model performance of federated few-shot learning.
arXiv Detail & Related papers (2024-03-10T08:41:22Z)
- Privacy-Preserving Machine Learning for Collaborative Data Sharing via Auto-encoder Latent Space Embeddings [57.45332961252628]
Privacy-preserving machine learning in data-sharing processes is an ever-critical task.
This paper presents an innovative framework that uses Representation Learning via autoencoders to generate privacy-preserving embedded data.
arXiv Detail & Related papers (2022-11-10T17:36:58Z)
- Mixed Differential Privacy in Computer Vision [133.68363478737058]
AdaMix is an adaptive differentially private algorithm for training deep neural network classifiers using both private and public image data.
A few-shot or even zero-shot learning baseline that ignores private data can outperform fine-tuning on a large private dataset.
arXiv Detail & Related papers (2022-03-22T06:15:43Z)
- Personalized PATE: Differential Privacy for Machine Learning with Individual Privacy Guarantees [1.2691047660244335]
We propose three novel methods to support training an ML model with different personalized privacy guarantees within the training data.
Our experiments show that our personalized privacy methods yield higher accuracy models than the non-personalized baseline.
arXiv Detail & Related papers (2022-02-21T20:16:27Z)
- Personalization Improves Privacy-Accuracy Tradeoffs in Federated Optimization [57.98426940386627]
We show that coordinating local learning with private centralized learning yields a generically useful and improved tradeoff between accuracy and privacy.
We illustrate our theoretical results with experiments on synthetic and real-world datasets.
arXiv Detail & Related papers (2022-02-10T20:44:44Z)
- ABG: A Multi-Party Mixed Protocol Framework for Privacy-Preserving Cooperative Learning [13.212198032364363]
We propose a privacy-preserving multi-party cooperative learning system, which allows different data owners to cooperate in machine learning.
We also design specific privacy-preserving computation protocols for some typical machine learning methods such as logistic regression and neural networks.
The experiments indicate that ABG^n has excellent performance, especially in low-latency network environments.
arXiv Detail & Related papers (2022-02-07T03:57:57Z)
- Reliability Check via Weight Similarity in Privacy-Preserving Multi-Party Machine Learning [7.552100672006174]
We focus on addressing the concerns of data privacy, model privacy, and data quality associated with multi-party machine learning.
We present a scheme for privacy-preserving collaborative learning that checks the participants' data quality while guaranteeing data and model privacy.
arXiv Detail & Related papers (2021-01-14T08:55:42Z)
- Differentially Private Secure Multi-Party Computation for Federated Learning in Financial Applications [5.50791468454604]
Federated learning enables a population of clients, working with a trusted server, to collaboratively learn a shared machine learning model.
This reduces the risk of exposing sensitive data, but it is still possible to reverse engineer information about a client's private data set from communicated model parameters.
We present a privacy-preserving federated learning protocol to a non-specialist audience, demonstrate it using logistic regression on a real-world credit card fraud data set, and evaluate it using an open-source simulation platform.
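The secure-aggregation idea underlying such protocols can be illustrated with toy pairwise additive masking (a sketch only, not the paper's actual protocol; all names are hypothetical): each pair of clients shares a random mask that one adds and the other subtracts, so individual contributions look random to the server while the masks cancel in the sum.

```python
import random

def mask_updates(updates, modulus=1 << 16):
    """Toy pairwise additive masking for secure aggregation.

    For every client pair (i, j), a shared random mask is added to
    client i's update and subtracted from client j's, modulo `modulus`.
    Each masked update alone reveals nothing about the true value,
    but the masks cancel when the server sums all contributions.
    """
    n = len(updates)
    masked = list(updates)
    for i in range(n):
        for j in range(i + 1, n):
            m = random.randrange(modulus)
            masked[i] = (masked[i] + m) % modulus
            masked[j] = (masked[j] - m) % modulus
    return masked

# Three hypothetical clients with integer-encoded model updates.
updates = [3, 5, 7]
masked = mask_updates(updates)
# The server only sees the masked values; their sum equals sum(updates).
total = sum(masked) % (1 << 16)
```

Real protocols combine this with key agreement, dropout handling, and differential-privacy noise on top of the aggregate; the sketch shows only the cancellation property.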
arXiv Detail & Related papers (2020-10-12T17:16:27Z)
- TIPRDC: Task-Independent Privacy-Respecting Data Crowdsourcing Framework for Deep Learning with Anonymized Intermediate Representations [49.20701800683092]
We present TIPRDC, a task-independent privacy-respecting data crowdsourcing framework with anonymized intermediate representation.
The goal of this framework is to learn a feature extractor that hides private information in the intermediate representations while maximally retaining the original information embedded in the raw data, so that the data collector can accomplish unknown learning tasks.
arXiv Detail & Related papers (2020-05-23T06:21:26Z)
- The Cost of Privacy in Asynchronous Differentially-Private Machine Learning [17.707240607542236]
We develop differentially-private asynchronous algorithms for collaboratively training machine-learning models on multiple private datasets.
A central learner interacts with the private data owners one-on-one whenever they are available for communication.
We prove that we can forecast the performance of the proposed privacy-preserving asynchronous algorithms.
arXiv Detail & Related papers (2020-03-18T23:06:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.