NeuraCrypt: Hiding Private Health Data via Random Neural Networks for
Public Training
- URL: http://arxiv.org/abs/2106.02484v1
- Date: Fri, 4 Jun 2021 13:42:21 GMT
- Title: NeuraCrypt: Hiding Private Health Data via Random Neural Networks for
Public Training
- Authors: Adam Yala, Homa Esfahanizadeh, Rafael G. L. D'Oliveira, Ken R. Duffy,
Manya Ghobadi, Tommi S. Jaakkola, Vinod Vaikuntanathan, Regina Barzilay,
Muriel Medard
- Abstract summary: We propose NeuraCrypt, a private encoding scheme based on random deep neural networks.
NeuraCrypt encodes raw patient data using a randomly constructed neural network known only to the data-owner.
We show that NeuraCrypt achieves competitive accuracy to non-private baselines on a variety of x-ray tasks.
- Score: 64.54200987493573
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Balancing the needs of data privacy and predictive utility is a central
challenge for machine learning in healthcare. In particular, privacy concerns
have led to a dearth of public datasets, complicated the construction of
multi-hospital cohorts and limited the utilization of external machine learning
resources. To remedy this, new methods are required to enable data owners, such
as hospitals, to share their datasets publicly, while preserving both patient
privacy and modeling utility. We propose NeuraCrypt, a private encoding scheme
based on random deep neural networks. NeuraCrypt encodes raw patient data using
a randomly constructed neural network known only to the data-owner, and
publishes both the encoded data and associated labels publicly. From a
theoretical perspective, we demonstrate that sampling from a sufficiently rich
family of encoding functions offers a well-defined and meaningful notion of
privacy against a computationally unbounded adversary with full knowledge of
the underlying data-distribution. We propose to approximate this family of
encoding functions through random deep neural networks. Empirically, we
demonstrate the robustness of our encoding to a suite of adversarial attacks
and show that NeuraCrypt achieves competitive accuracy to non-private baselines
on a variety of x-ray tasks. Moreover, we demonstrate that multiple hospitals,
using independent private encoders, can collaborate to train improved x-ray
models. Finally, we release a challenge dataset to encourage the development of
new attacks on NeuraCrypt.
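The encoding idea is easy to sketch: a randomly constructed network, whose weights the data owner keeps secret, maps image patches to encoded vectors whose order is then shuffled. Below is a minimal illustrative sketch in numpy, not the paper's exact architecture (NeuraCrypt's encoder is convolutional; the dense layers, dimensions, and lazy position-embedding sizing here are assumptions for brevity):

```python
import numpy as np

class RandomPatchEncoder:
    """NeuraCrypt-style sketch: a secret, randomly constructed network
    maps image patches to encodings, and patch order is shuffled."""

    def __init__(self, patch_size=16, hidden_dim=256, depth=2, seed=0):
        rng = np.random.default_rng(seed)  # acts as the owner's secret key
        dims = [patch_size * patch_size] + [hidden_dim] * depth
        # Secret random weights -- never published.
        self.weights = [rng.standard_normal((a, b)) / np.sqrt(a)
                        for a, b in zip(dims, dims[1:])]
        self.patch_size = patch_size
        self.rng = rng
        self.pos_emb = None  # secret position embeddings, sized on first use

    def encode(self, image):
        p = self.patch_size
        h, w = image.shape
        patches = (image.reshape(h // p, p, w // p, p)
                        .transpose(0, 2, 1, 3)
                        .reshape(-1, p * p))
        z = patches
        for W in self.weights:
            z = np.maximum(z @ W, 0.0)            # random layer + ReLU
        if self.pos_emb is None:
            self.pos_emb = self.rng.standard_normal(z.shape)
        z = z + self.pos_emb                       # add secret position code
        return z[self.rng.permutation(len(z))]     # hide patch order

# The owner publishes only encoder.encode(x) and the label y.
encoder = RandomPatchEncoder(seed=42)
encoded = encoder.encode(np.random.rand(224, 224))  # stand-in for an x-ray
print(encoded.shape)  # (196, 256)
```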
Related papers
- A Unified View of Differentially Private Deep Generative Modeling [60.72161965018005]
Data with privacy concerns is subject to stringent regulations that frequently prohibit data access and data sharing.
Overcoming these obstacles is key for technological progress in many real-world application scenarios that involve privacy sensitive data.
Differentially private (DP) data publishing provides a compelling solution, where only a sanitized form of the data is publicly released.
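The canonical primitive behind DP publishing is noise calibrated to a query's sensitivity. As a minimal, self-contained illustration, here is the standard Laplace mechanism on a counting query (the generative-modeling methods surveyed in the paper build far richer sanitizers on the same definition):

```python
import numpy as np

def laplace_release(true_count, sensitivity=1.0, epsilon=0.5, rng=None):
    """epsilon-DP release of a counting query: add Laplace noise with
    scale sensitivity/epsilon (the classic Laplace mechanism)."""
    rng = rng or np.random.default_rng()
    return true_count + rng.laplace(scale=sensitivity / epsilon)

# e.g. publish a sanitized patient count instead of the exact one
print(laplace_release(true_count=1234))
```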
arXiv Detail & Related papers (2023-09-27T14:38:16Z)
- PEOPL: Characterizing Privately Encoded Open Datasets with Public Labels [59.66777287810985]
We introduce information-theoretic scores for privacy and utility, which quantify the average performance of an unfaithful user.
We then theoretically characterize primitives in building families of encoding schemes that motivate the use of random deep neural networks.
arXiv Detail & Related papers (2023-03-31T18:03:53Z)
- Secure & Private Federated Neuroimaging [17.946206585229675]
Federated Learning enables distributed training of neural network models over multiple data sources without sharing data.
Each site trains the neural network over its private data for some time, then shares the neural network parameters with a Federation Controller.
Our Federated Learning architecture, MetisFL, provides strong security and privacy.
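The share-then-aggregate loop described here is, at its core, federated averaging. A minimal sketch under toy assumptions (the `local_update` objective is a stand-in for each site's private training; MetisFL's security and access-control layers are not shown):

```python
import numpy as np

def local_update(weights, data, lr=0.01, steps=10):
    """Stand-in for a site's private training: a few gradient steps
    pulling the parameters toward the site's data mean."""
    w = weights.copy()
    for _ in range(steps):
        w -= lr * (w - data.mean(axis=0))
    return w

def federated_round(global_w, site_datasets):
    # Each site trains on its own private data; only parameters leave.
    local_ws = [local_update(global_w, d) for d in site_datasets]
    # The controller averages parameters, never seeing raw data.
    return np.mean(local_ws, axis=0)

sites = [np.random.randn(100, 8) + i for i in range(3)]  # 3 private sites
w = np.zeros(8)
for _ in range(20):
    w = federated_round(w, sites)
```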
arXiv Detail & Related papers (2022-05-11T03:36:04Z)
- Syfer: Neural Obfuscation for Private Data Release [58.490998583666276]
We develop Syfer, a neural obfuscation method to protect against re-identification attacks.
Syfer composes trained layers with random neural networks to encode the original data.
It maintains the ability to predict diagnoses from the encoded data.
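A hedged sketch of that composition pattern: shared, trained layers (learned to keep encodings predictive) interleaved with per-owner random layers acting as the secret key. The layer shapes, activation, and stand-in "trained" weights below are illustrative assumptions, not Syfer's actual architecture:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

class SyferLikeObfuscator:
    """Illustrative only: alternates shared trained layers with
    per-owner secret random layers; Syfer's real design differs."""

    def __init__(self, trained_weights, dim, seed):
        rng = np.random.default_rng(seed)  # the owner's secret seed
        self.layers = []
        for W_trained in trained_weights:
            self.layers.append(W_trained)  # shared, trained layer
            self.layers.append(            # secret, random layer
                rng.standard_normal((dim, dim)) / np.sqrt(dim))

    def encode(self, x):
        for W in self.layers:
            x = relu(x @ W)
        return x

dim = 64
trained = [np.random.randn(dim, dim) / np.sqrt(dim) for _ in range(2)]
owner_encoder = SyferLikeObfuscator(trained, dim, seed=42)
z = owner_encoder.encode(np.random.randn(10, dim))
```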
arXiv Detail & Related papers (2022-01-28T20:32:04Z)
- SoK: Privacy-preserving Deep Learning with Homomorphic Encryption [2.9069679115858755]
With homomorphic encryption (HE), computations can be performed on encrypted data without revealing its content.
We take an in-depth look at approaches that combine neural networks with HE for privacy preservation.
We find numerous challenges to HE based privacy-preserving deep learning such as computational overhead, usability, and limitations posed by the encryption schemes.
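The lattice-based schemes used with deep learning are too involved to inline, but the core property, computing on ciphertexts, can be shown with a toy additively homomorphic Paillier scheme (tiny primes for illustration only; this is not one of the HE schemes surveyed, and real deployments use moduli of 2048 bits or more):

```python
import math, random

def paillier_keygen(p=1789, q=1861):
    """Toy Paillier keypair over tiny primes (insecure, illustrative)."""
    n = p * q
    lam = math.lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because we fix the generator g = n + 1
    return (n,), (lam, mu, n)

def encrypt(pub, m):
    (n,) = pub
    n2 = n * n
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return pow(n + 1, m, n2) * pow(r, n, n2) % n2

def decrypt(priv, c):
    lam, mu, n = priv
    n2 = n * n
    L = (pow(c, lam, n2) - 1) // n
    return L * mu % n

pub, priv = paillier_keygen()
a, b = encrypt(pub, 20), encrypt(pub, 22)
# Multiplying ciphertexts adds the underlying plaintexts.
assert decrypt(priv, a * b % (pub[0] ** 2)) == 42
```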
arXiv Detail & Related papers (2021-12-23T22:03:27Z)
- NeuralDP Differentially private neural networks by design [61.675604648670095]
We propose NeuralDP, a technique for privatising activations of some layer within a neural network.
We experimentally demonstrate on two datasets that our method offers substantially improved privacy-utility trade-offs compared to DP-SGD.
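The underlying pattern is the Gaussian mechanism applied at a layer boundary: bound each example's influence by clipping its activation norm, then add calibrated noise. A minimal sketch (the clipping norm and noise multiplier are illustrative; NeuralDP's exact placement and privacy accounting differ):

```python
import numpy as np

def privatize_activations(acts, clip_norm=1.0, noise_mult=1.1, rng=None):
    """Clip each row (one example's activations) to L2 norm clip_norm,
    then add Gaussian noise scaled to that sensitivity."""
    rng = rng or np.random.default_rng()
    norms = np.linalg.norm(acts, axis=1, keepdims=True)
    clipped = acts * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    noise = rng.normal(scale=noise_mult * clip_norm, size=acts.shape)
    return clipped + noise

private = privatize_activations(np.random.randn(32, 128))  # batch of 32
```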
arXiv Detail & Related papers (2021-07-30T12:40:19Z)
- POSEIDON: Privacy-Preserving Federated Neural Network Learning [8.103262600715864]
POSEIDON is the first of its kind in the regime of privacy-preserving neural network training.
It employs multiparty lattice-based cryptography to preserve the confidentiality of the training data, the model, and the evaluation data.
It trains a 3-layer neural network on the MNIST dataset with 784 features and 60K samples distributed among 10 parties in less than 2 hours.
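POSEIDON's multiparty lattice-based cryptography is beyond a short sketch, but the guiding principle, that no single party ever holds the data in the clear, is captured by a simpler multiparty primitive, additive secret sharing (illustrative only, not POSEIDON's protocol):

```python
import random

PRIME = 2**61 - 1  # a Mersenne prime; shares live in this field

def share(secret, n_parties):
    """Split a value into n additive shares summing to it mod PRIME;
    any n-1 shares are uniformly random and reveal nothing."""
    shares = [random.randrange(PRIME) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % PRIME)
    return shares

def reconstruct(shares):
    return sum(shares) % PRIME

# Parties add their shares locally; only the sum is ever reconstructed.
xs, ys = share(123, 10), share(456, 10)
summed = [(a + b) % PRIME for a, b in zip(xs, ys)]
assert reconstruct(summed) == 579
```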
arXiv Detail & Related papers (2020-09-01T11:06:31Z)
- CryptoSPN: Privacy-preserving Sum-Product Network Inference [84.88362774693914]
We present CryptoSPN, a framework for privacy-preserving inference of sum-product networks (SPNs).
CryptoSPN achieves highly efficient and accurate inference in the order of seconds for medium-sized SPNs.
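What an SPN computes is an alternation of weighted sums and products over leaf likelihoods, which is why it maps well onto secure computation. A toy plaintext SPN for intuition (structure and parameters invented for illustration; CryptoSPN evaluates circuits like this over cryptographically protected inputs):

```python
# A tiny sum-product network over two binary variables X1, X2.
def leaf(p_true):
    """Bernoulli leaf: likelihood of the observed binary value."""
    return lambda x: p_true if x == 1 else 1.0 - p_true

x1a, x1b = leaf(0.9), leaf(0.2)   # two distributions over X1
x2a, x2b = leaf(0.3), leaf(0.8)   # two distributions over X2

def spn(x1, x2):
    # Product nodes combine independent scopes; the root sum mixes them.
    prod1 = x1a(x1) * x2a(x2)
    prod2 = x1b(x1) * x2b(x2)
    return 0.6 * prod1 + 0.4 * prod2   # mixture weights sum to 1

print(spn(1, 0))  # P(X1=1, X2=0) under the toy SPN -> 0.394
```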
arXiv Detail & Related papers (2020-02-03T14:49:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.