Syfer: Neural Obfuscation for Private Data Release
- URL: http://arxiv.org/abs/2201.12406v1
- Date: Fri, 28 Jan 2022 20:32:04 GMT
- Title: Syfer: Neural Obfuscation for Private Data Release
- Authors: Adam Yala, Victor Quach, Homa Esfahanizadeh, Rafael G. L. D'Oliveira,
Ken R. Duffy, Muriel Médard, Tommi S. Jaakkola, Regina Barzilay
- Abstract summary: We develop Syfer, a neural obfuscation method to protect against re-identification attacks.
Syfer composes trained layers with random neural networks to encode the original data.
It maintains the ability to predict diagnoses from the encoded data.
- Score: 58.490998583666276
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Balancing privacy and predictive utility remains a central challenge for
machine learning in healthcare. In this paper, we develop Syfer, a neural
obfuscation method to protect against re-identification attacks. Syfer composes
trained layers with random neural networks to encode the original data (e.g.
X-rays) while maintaining the ability to predict diagnoses from the encoded
data. The randomness in the encoder acts as the private key for the data owner.
We quantify privacy as the number of attacker guesses required to re-identify a
single image (guesswork). We propose a contrastive learning algorithm to
estimate guesswork. We show empirically that differentially private methods,
such as DP-Image, obtain privacy at a significant loss of utility. In contrast,
Syfer achieves strong privacy while preserving utility. For example, X-ray
classifiers built with DP-image, Syfer, and original data achieve average AUCs
of 0.53, 0.78, and 0.86, respectively.
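To make the encoding recipe concrete, below is a minimal PyTorch sketch in the spirit of Syfer: trained layers composed with randomly drawn layers whose weights act as the data owner's private key. Architecture, sizes, and names are illustrative assumptions, not the authors' released implementation.

    import torch
    import torch.nn as nn

    class KeyedObfuscator(nn.Module):
        """Toy encoder in the spirit of Syfer: trained layers composed with
        random layers whose weights act as the data owner's private key.
        Sizes, depth, and names are illustrative assumptions only."""

        def __init__(self, dim: int = 1024, key_seed: int = 0):
            super().__init__()
            # Trained (shared) layers, optimized once so that downstream
            # diagnosis prediction from the encodings remains possible.
            self.trained = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(),
                                         nn.Linear(dim, dim), nn.ReLU())
            # Random (private-key) layers: drawn from a secret seed, fixed,
            # and never shared. A different seed yields a different encoding.
            gen = torch.Generator().manual_seed(key_seed)
            self.register_buffer("key_a", torch.randn(dim, dim, generator=gen) / dim ** 0.5)
            self.register_buffer("key_b", torch.randn(dim, dim, generator=gen) / dim ** 0.5)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # Compose: random key layer -> trained layers -> random key layer.
            z = x @ self.key_a
            z = self.trained(z)
            return z @ self.key_b

    if __name__ == "__main__":
        x = torch.randn(4, 1024)                        # flattened image features (toy)
        owner_encoder = KeyedObfuscator(key_seed=1234)  # the seed is the private key
        z = owner_encoder(x)                            # encodings released for training
        # A diagnosis classifier is trained on (z, label); without the key, an
        # attacker must guess which encoding matches which raw image, and the
        # expected number of guesses (guesswork) is the privacy measure.
        print(z.shape)  # torch.Size([4, 1024])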
Related papers
- A Stochastic Optimization Framework for Private and Fair Learning From Decentralized Data [14.748203847227542]
We develop a novel algorithm for private and fair federated learning (FL).
Our algorithm satisfies inter-silo record-level differential privacy (ISRL-DP).
Experiments demonstrate the state-of-the-art fairness-accuracy tradeoffs of our algorithm across different privacy levels.
arXiv Detail & Related papers (2024-11-12T15:51:35Z)
- Analyzing Privacy Leakage in Machine Learning via Multiple Hypothesis Testing: A Lesson From Fano [83.5933307263932]
We study data reconstruction attacks for discrete data and analyze them under the framework of hypothesis testing.
We show that if the underlying private data takes values from a set of size $M$, then the target privacy parameter $\epsilon$ can be $O(\log M)$ before the adversary gains significant inferential power.
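For context, the $O(\log M)$ threshold matches the point where Fano's inequality stops giving a strong lower bound on the adversary's error; the block below is a standard textbook form of the inequality, an illustration rather than the paper's exact statement.

    % Fano's inequality for an adversary estimating X, uniform over M values,
    % from the released output Y (base-2 logarithms):
    \[
      P_e \;\ge\; \frac{H(X \mid Y) - 1}{\log_2 M}
          \;=\; 1 - \frac{I(X;Y) + 1}{\log_2 M}.
    \]
    % The adversary's error stays close to 1 until the leaked information
    % I(X;Y) -- which differential privacy controls through \epsilon --
    % approaches \log_2 M, i.e. the \epsilon = O(\log M) regime above.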
arXiv Detail & Related papers (2022-10-24T23:50:12Z)
- Privacy-Preserved Neural Graph Similarity Learning [99.78599103903777]
We propose a novel Privacy-Preserving neural Graph Matching network model, named PPGM, for graph similarity learning.
To prevent reconstruction attacks, the proposed model does not communicate node-level representations between devices.
To mitigate attacks on graph properties, obfuscated features that contain information from both vectors are communicated instead.
arXiv Detail & Related papers (2022-10-21T04:38:25Z)
- Hiding Images in Deep Probabilistic Models [58.23127414572098]
We describe a different computational framework to hide images in deep probabilistic models.
Specifically, we use a DNN to model the probability density of cover images, and hide a secret image in one particular location of the learned distribution.
We demonstrate the feasibility of our SinGAN approach in terms of extraction accuracy and model security.
arXiv Detail & Related papers (2022-10-05T13:33:25Z)
- Fine-Tuning with Differential Privacy Necessitates an Additional Hyperparameter Search [38.83524780461911]
We show how carefully selecting the layers being fine-tuned in the pretrained neural network allows us to establish new state-of-the-art tradeoffs between privacy and accuracy.
We achieve 77.9% accuracy for $(\varepsilon, \delta) = (2, 10^{-5})$ on CIFAR-100 for a model pretrained on ImageNet.
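A minimal sketch of the layer-selection idea, assuming a PyTorch model and the Opacus library for DP-SGD; which blocks to unfreeze is the extra hyperparameter the paper argues must be searched, and the specific modules and values below are illustrative assumptions.

    import torch
    import torch.nn as nn
    from torch.utils.data import DataLoader, TensorDataset
    from opacus import PrivacyEngine  # Opacus >= 1.0 assumed installed

    # Illustrative stand-in for a pretrained backbone plus a CIFAR-100 head.
    model = nn.Sequential(
        nn.Flatten(),
        nn.Linear(3 * 32 * 32, 512), nn.ReLU(),   # "early" pretrained block
        nn.Linear(512, 512), nn.ReLU(),           # "late" pretrained block
        nn.Linear(512, 100),                      # classification head
    )

    # Layer selection: freeze everything, then unfreeze only the late block
    # and the head. Which layers to unfreeze is the hyperparameter to search.
    for p in model.parameters():
        p.requires_grad = False
    for p in model[3:].parameters():
        p.requires_grad = True

    optimizer = torch.optim.SGD(
        [p for p in model.parameters() if p.requires_grad], lr=0.1)

    # Toy private data standing in for the real fine-tuning set.
    data = TensorDataset(torch.randn(256, 3, 32, 32), torch.randint(0, 100, (256,)))
    loader = DataLoader(data, batch_size=64)

    # DP-SGD: per-sample gradient clipping plus Gaussian noise on the update.
    engine = PrivacyEngine()
    model, optimizer, loader = engine.make_private(
        module=model, optimizer=optimizer, data_loader=loader,
        noise_multiplier=1.1, max_grad_norm=1.0,
    )

    loss_fn = nn.CrossEntropyLoss()
    for x, y in loader:
        optimizer.zero_grad()
        loss_fn(model(x), y).backward()
        optimizer.step()
    print("epsilon spent so far:", engine.get_epsilon(delta=1e-5))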
arXiv Detail & Related papers (2022-10-05T11:32:49Z)
- Smooth Anonymity for Sparse Graphs [69.1048938123063]
Differential privacy has emerged as the gold standard of privacy; however, its limitations become evident when it comes to sharing sparse datasets.
In this work, we consider a variation of $k$-anonymity, which we call smooth-$k$-anonymity, and design simple large-scale algorithms that efficiently provide smooth-$k$-anonymity.
arXiv Detail & Related papers (2022-07-13T17:09:25Z)
- Unintended memorisation of unique features in neural networks [15.174895411434026]
We show that unique features occurring only once in training data are memorised by discriminative multi-layer perceptrons and convolutional neural networks.
We develop a score estimating a model's sensitivity to a unique feature by comparing the KL divergences of the model's output distributions.
We find that typical strategies to prevent overfitting do not prevent unique feature memorisation.
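A rough sketch of how such a sensitivity score could be computed, assuming a trained classifier and two versions of an input (with and without the unique feature); this is a generic KL comparison, not the authors' exact scoring procedure.

    import torch
    import torch.nn.functional as F

    def unique_feature_sensitivity(model: torch.nn.Module,
                                   x_with: torch.Tensor,
                                   x_without: torch.Tensor) -> float:
        """Compare the model's output distributions for an input that contains
        the unique feature (e.g. a name visible in the image) and the same
        input with the feature removed; a large KL divergence suggests the
        feature influenced, and may have been memorised by, the model."""
        model.eval()
        with torch.no_grad():
            log_p_with = F.log_softmax(model(x_with), dim=-1)
            log_p_without = F.log_softmax(model(x_without), dim=-1)
        # KL(p_without || p_with), averaged over the batch.
        kl = F.kl_div(log_p_with, log_p_without, log_target=True,
                      reduction="batchmean")
        return kl.item()

    if __name__ == "__main__":
        dummy = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 10))
        x = torch.rand(1, 1, 28, 28)
        x_masked = x.clone()
        x_masked[..., :5, :] = 0.0   # crude stand-in for removing the feature
        print(unique_feature_sensitivity(dummy, x, x_masked))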
arXiv Detail & Related papers (2022-05-20T10:48:18Z)
- Measuring Unintended Memorisation of Unique Private Features in Neural Networks [15.174895411434026]
We show that neural networks unintentionally memorise unique features even when they occur only once in training data.
An example of a unique feature is a person's name that is accidentally present on a training image.
arXiv Detail & Related papers (2022-02-16T14:39:05Z)
- NeuraCrypt: Hiding Private Health Data via Random Neural Networks for Public Training [64.54200987493573]
We propose NeuraCrypt, a private encoding scheme based on random deep neural networks.
NeuraCrypt encodes raw patient data using a randomly constructed neural network known only to the data-owner.
We show that NeuraCrypt achieves competitive accuracy to non-private baselines on a variety of x-ray tasks.
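A minimal sketch of a patch-wise random encoding with a secret permutation, in the spirit of NeuraCrypt; the actual scheme uses a randomly constructed neural network known only to the data owner, and the simplified projection and shapes below are assumptions for illustration.

    import torch

    def neuracrypt_like_encode(images: torch.Tensor, key_seed: int,
                               patch: int = 16, out_dim: int = 256) -> torch.Tensor:
        """Sketch: map each image patch through a random projection known only
        to the data owner, then scramble patch order with a secret permutation.
        Simplified illustration, not the paper's architecture."""
        b, c, h, w = images.shape
        # Split into non-overlapping patches and flatten each one.
        patches = images.unfold(2, patch, patch).unfold(3, patch, patch)
        patches = patches.permute(0, 2, 3, 1, 4, 5).reshape(b, -1, c * patch * patch)
        n = patches.shape[1]
        gen = torch.Generator().manual_seed(key_seed)          # private key
        proj = torch.randn(c * patch * patch, out_dim, generator=gen) \
            / (c * patch * patch) ** 0.5
        perm = torch.randperm(n, generator=gen)                # secret patch shuffle
        return (patches @ proj)[:, perm, :]

    if __name__ == "__main__":
        xray = torch.rand(2, 1, 224, 224)        # toy stand-in for chest X-rays
        z = neuracrypt_like_encode(xray, key_seed=7)
        print(z.shape)                           # torch.Size([2, 196, 256])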
arXiv Detail & Related papers (2021-06-04T13:42:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.