Uncertainty-Autoencoder-Based Privacy and Utility Preserving Data Type
Conscious Transformation
- URL: http://arxiv.org/abs/2205.01950v1
- Date: Wed, 4 May 2022 08:40:15 GMT
- Title: Uncertainty-Autoencoder-Based Privacy and Utility Preserving Data Type
Conscious Transformation
- Authors: Bishwas Mandal, George Amariucai, Shuangqing Wei
- Abstract summary: We propose an adversarial learning framework that deals with the privacy-utility tradeoff problem under two conditions.
Under data-type aware conditions, the privacy mechanism provides a one-hot encoding of categorical features, representing exactly one class.
Under data-type ignorant conditions, the categorical variables are represented by a collection of scores, one for each class.
- Score: 3.7315964084413173
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose an adversarial learning framework that deals with the
privacy-utility tradeoff problem under two types of conditions: data-type
ignorant, and data-type aware. Under data-type aware conditions, the privacy
mechanism provides a one-hot encoding of categorical features, representing
exactly one class, while under data-type ignorant conditions the categorical
variables are represented by a collection of scores, one for each class. We use
a neural network architecture consisting of a generator and a discriminator,
where the generator consists of an encoder-decoder pair, and the discriminator
consists of an adversary and a utility provider. Unlike previous research
considering this kind of architecture, which leverages autoencoders (AEs)
without introducing any randomness, or variational autoencoders (VAEs) based on
learning latent representations which are then forced into a Gaussian
assumption, our proposed technique introduces randomness and removes the
Gaussian assumption restriction on the latent variables, only focusing on the
end-to-end stochastic mapping of the input to privatized data. We test our
framework on different datasets: MNIST, FashionMNIST, UCI Adult, and US Census
Demographic Data, providing a wide range of possible private and utility
attributes. We use multiple adversaries simultaneously to test our privacy
mechanism -- some trained from the ground truth data and some trained from the
perturbed data generated by our privacy mechanism. Through comparative
analysis, our results demonstrate better privacy and utility guarantees than
the existing works under similar, data-type ignorant conditions, even when the
latter are considered under their original restrictive single-adversary model.
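Since the abstract describes the architecture only at a high level, here is a minimal PyTorch sketch of how such a generator (encoder-decoder pair) and discriminator (adversary plus utility provider) could be wired together. This is not the authors' implementation: the layer sizes, the uniform noise injection, the straight-through one-hot trick, and the loss weight lam are illustrative assumptions. Only the overall structure follows the abstract: a stochastic encoder with no Gaussian prior on the latent code, a one-hot output under data-type aware conditions, a score-per-class output under data-type ignorant conditions, and two discriminator heads.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Stochastic encoder: noise is concatenated to the input, and no Gaussian
    prior is imposed on the latent code."""
    def __init__(self, x_dim, noise_dim, z_dim):
        super().__init__()
        self.noise_dim = noise_dim
        self.net = nn.Sequential(
            nn.Linear(x_dim + noise_dim, 128), nn.ReLU(),
            nn.Linear(128, z_dim),
        )

    def forward(self, x):
        eps = torch.rand(x.size(0), self.noise_dim, device=x.device)  # injected randomness
        return self.net(torch.cat([x, eps], dim=1))


class Decoder(nn.Module):
    """Maps the latent code to a privatized record. For simplicity the whole
    output is treated as a single categorical feature."""
    def __init__(self, z_dim, x_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim, 128), nn.ReLU(),
            nn.Linear(128, x_dim),
        )

    def forward(self, z, data_type_aware: bool):
        logits = self.net(z)
        soft = torch.softmax(logits, dim=1)  # data-type ignorant: one score per class
        if not data_type_aware:
            return soft
        # data-type aware: exactly one class (hard one-hot, straight-through gradient)
        hard = torch.zeros_like(soft).scatter_(1, soft.argmax(dim=1, keepdim=True), 1.0)
        return hard + soft - soft.detach()


class Classifier(nn.Module):
    """Shared shape for both discriminator heads: the adversary predicts the
    private attribute, the utility provider predicts the utility attribute."""
    def __init__(self, x_dim, n_classes):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(x_dim, 128), nn.ReLU(), nn.Linear(128, n_classes))

    def forward(self, x_priv):
        return self.net(x_priv)


def generator_loss(x, s_private, u_utility, enc, dec, adversary, utility, lam=1.0, aware=True):
    """Hypothetical generator objective: reward utility prediction, penalize the
    adversary's success on the private attribute."""
    x_priv = dec(enc(x), data_type_aware=aware)
    ce = nn.CrossEntropyLoss()
    return ce(utility(x_priv), u_utility) - lam * ce(adversary(x_priv), s_private)
```

In a full training loop the adversary and utility networks would be updated on their own classification losses (some adversaries trained on ground-truth data, others on the privatized outputs, as the abstract describes), alternating with updates to the encoder-decoder pair.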
Related papers
- Private prediction for large-scale synthetic text generation [28.488459921169905]
We present an approach for generating differentially private synthetic text using large language models (LLMs).
In the private prediction framework, we only require the output synthetic data to satisfy differential privacy guarantees.
arXiv Detail & Related papers (2024-07-16T18:28:40Z) - Footprints of Data in a Classifier Model: The Privacy Issues and Their Mitigation through Data Obfuscation [0.9208007322096533]
The embedding of footprints of the training data in a prediction model is one such facet.
The difference in performance between test and training data enables passive identification of the data on which the model was trained.
This research focuses on addressing the vulnerability arising from the data footprints.
arXiv Detail & Related papers (2024-07-02T13:56:37Z) - Independent Distribution Regularization for Private Graph Embedding [55.24441467292359]
Graph embeddings are susceptible to attribute inference attacks, which allow attackers to infer private node attributes from the learned graph embeddings.
To address these concerns, privacy-preserving graph embedding methods have emerged.
We propose a novel approach called Private Variational Graph AutoEncoders (PVGAE) with the aid of independent distribution penalty as a regularization term.
arXiv Detail & Related papers (2023-08-16T13:32:43Z) - Differentially Private Linear Regression with Linked Data [3.9325957466009203]
Differential privacy, a mathematical notion from computer science, is a rising tool offering robust privacy guarantees.
Recent work focuses on developing differentially private versions of individual statistical and machine learning tasks.
We present two differentially private algorithms for linear regression with linked data.
arXiv Detail & Related papers (2023-08-01T21:00:19Z) - Approximate, Adapt, Anonymize (3A): a Framework for Privacy Preserving
Training Data Release for Machine Learning [3.29354893777827]
We introduce a data release framework, 3A (Approximate, Adapt, Anonymize), to maximize data utility for machine learning.
We present experimental evidence showing minimal discrepancy between performance metrics of models trained on real versus privatized datasets.
arXiv Detail & Related papers (2023-07-04T18:37:11Z) - PEOPL: Characterizing Privately Encoded Open Datasets with Public Labels [59.66777287810985]
We introduce information-theoretic scores for privacy and utility, which quantify the average performance of an unfaithful user.
We then theoretically characterize primitives in building families of encoding schemes that motivate the use of random deep neural networks.
arXiv Detail & Related papers (2023-03-31T18:03:53Z) - Private Set Generation with Discriminative Information [63.851085173614]
Differentially private data generation is a promising solution to the data privacy challenge.
Existing private generative models struggle with the utility of their synthetic samples.
We introduce a simple yet effective method that greatly improves the sample utility of state-of-the-art approaches.
arXiv Detail & Related papers (2022-11-07T10:02:55Z) - Multi-class Classifier based Failure Prediction with Artificial and
Anonymous Training for Data Privacy [0.0]
A neural network based multi-class classifier is developed for failure prediction.
The proposed mechanism completely decouples the data set used for training process from the actual data which is kept private.
Results show high accuracy in failure prediction under different parameter configurations.
arXiv Detail & Related papers (2022-09-06T07:53:33Z) - Representative & Fair Synthetic Data [68.8204255655161]
We present a framework to incorporate fairness constraints into the self-supervised learning process.
We generate a representative as well as fair version of the UCI Adult census data set.
We consider representative & fair synthetic data a promising future building block to teach algorithms not on historic worlds, but rather on the worlds that we strive to live in.
arXiv Detail & Related papers (2021-04-07T09:19:46Z) - On Deep Learning with Label Differential Privacy [54.45348348861426]
We study the multi-class classification setting where the labels are considered sensitive and ought to be protected.
We propose a new algorithm for training deep neural networks with label differential privacy, and run evaluations on several datasets.
arXiv Detail & Related papers (2021-02-11T15:09:06Z) - Robustness Threats of Differential Privacy [70.818129585404]
We experimentally demonstrate that networks trained with differential privacy may, in some settings, be even more vulnerable than their non-private counterparts.
We study how the main ingredients of differentially private neural network training, such as gradient clipping and noise addition, affect the robustness of the model (a minimal sketch of these two ingredients follows this list).
arXiv Detail & Related papers (2020-12-14T18:59:24Z)