Correlated Differential Privacy: Feature Selection in Machine Learning
- URL: http://arxiv.org/abs/2010.03094v1
- Date: Wed, 7 Oct 2020 00:33:24 GMT
- Title: Correlated Differential Privacy: Feature Selection in Machine Learning
- Authors: Tao Zhang, Tianqing Zhu, Ping Xiong, Huan Huo, Zahir Tari, Wanlei Zhou
- Abstract summary: The proposed scheme involves five steps with the goals of managing the extent of data correlation, preserving privacy, and supporting accuracy in the prediction results.
Experiments show that the proposed scheme produces better prediction results in machine learning tasks and lower mean squared errors for data queries than existing schemes.
- Score: 13.477069421691562
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Privacy preservation in machine learning is a crucial issue in industrial
informatics, since the data used for training in industry usually contain
sensitive information. Existing differentially private machine learning
algorithms have not considered the impact of data correlation, which may lead
to more privacy leakage than expected in industrial applications. For example,
data collected for traffic monitoring may contain correlated records due to
temporal correlation or user correlation. To fill this gap, we propose a
correlation reduction scheme with differentially private feature selection that
addresses the privacy loss incurred when data in machine learning tasks are
correlated. The key to the proposed scheme is to describe the data correlation
and select features that lead to less data correlation across the whole
dataset. The proposed scheme involves five steps with the goals of managing the
extent of data correlation, preserving privacy, and supporting accuracy in the
prediction results. In this way, the impact of data correlation is mitigated by
the proposed feature selection scheme, and privacy is guaranteed even when the
data are correlated. The proposed method can be widely applied in machine
learning algorithms that provide services in industrial settings. Experiments
show that the proposed scheme produces better prediction results in machine
learning tasks and lower mean squared errors for data queries than existing
schemes.
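The abstract does not spell out the five steps, so the following is only a minimal sketch of the general idea behind differentially private, correlation-aware feature selection: score each feature by its average absolute correlation with the other features, then sample low-correlation features with the exponential mechanism. The function name, the uniform budget split, and the unit sensitivity are illustrative assumptions, not the authors' scheme.

```python
import numpy as np

def dp_select_features(X, epsilon, k, sensitivity=1.0, rng=None):
    """Toy correlation-aware DP feature selection (illustrative only).

    Utility of a feature = negative mean |Pearson correlation| with the
    other features, so weakly correlated features are preferred. k
    features are drawn with the exponential mechanism, splitting the
    privacy budget uniformly across rounds.
    """
    rng = np.random.default_rng() if rng is None else rng
    corr = np.abs(np.corrcoef(X, rowvar=False))  # feature-feature |corr|
    np.fill_diagonal(corr, 0.0)
    utility = -corr.mean(axis=1)
    remaining = list(range(X.shape[1]))
    eps_round = epsilon / k  # naive sequential-composition split
    chosen = []
    for _ in range(k):
        u = utility[remaining]
        # Exponential mechanism: Pr[i] proportional to
        # exp(eps * u(i) / (2 * sensitivity)); shift by max for stability.
        w = np.exp(eps_round * (u - u.max()) / (2.0 * sensitivity))
        p = w / w.sum()
        i = rng.choice(len(remaining), p=p)
        chosen.append(remaining.pop(i))
    return chosen

# Example: privately pick 3 of 10 synthetic features with epsilon = 1.0.
X = np.random.default_rng(0).normal(size=(200, 10))
print(dp_select_features(X, epsilon=1.0, k=3))
```

For a real guarantee, the utility's sensitivity would have to be bounded against the addition or removal of a single record rather than fixed at 1.0.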
Related papers
- Causal Discovery Under Local Privacy [5.7835417207193585]
Local differential privacy is a variant that allows data providers to apply the privatization mechanism themselves on their data individually.
We consider various well-known locally differentially private mechanisms and compare the trade-off between the privacy they provide and the accuracy of the causal structure produced by causal learning algorithms (a minimal local-DP sketch follows this entry).
arXiv Detail & Related papers (2023-11-07T14:44:27Z)
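As a companion to this entry, here is a minimal sketch of the local model it describes, using the classic Laplace mechanism: each provider clips and perturbs its own value before release, so no raw record ever leaves the provider. The function name and bounds are hypothetical, not taken from the paper.

```python
import numpy as np

def local_laplace(value, epsilon, lower=0.0, upper=1.0, rng=None):
    """Locally differentially private release of one bounded value.
    The provider itself clips to [lower, upper] and adds Laplace noise
    calibrated to the range, so the collector never sees raw data."""
    rng = np.random.default_rng() if rng is None else rng
    clipped = min(max(value, lower), upper)
    return clipped + rng.laplace(scale=(upper - lower) / epsilon)

# Each user privatizes locally; the untrusted collector just averages.
rng = np.random.default_rng(1)
true_values = rng.uniform(size=1000)
reports = [local_laplace(v, epsilon=0.5, rng=rng) for v in true_values]
print(np.mean(reports))  # noisy but unbiased estimate of the true mean
```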
- Differentially Private Linear Regression with Linked Data [3.9325957466009203]
Differential privacy, a mathematical notion from computer science, is a rising tool offering robust privacy guarantees.
Recent work focuses on developing differentially private versions of individual statistical and machine learning tasks.
We present two differentially private algorithms for linear regression with linked data.
arXiv Detail & Related papers (2023-08-01T21:00:19Z)
- Privacy-Preserving Graph Machine Learning from Data to Computation: A Survey [67.7834898542701]
We focus on reviewing privacy-preserving techniques of graph machine learning.
We first review methods for generating privacy-preserving graph data.
Then we describe methods for transmitting privacy-preserved information.
arXiv Detail & Related papers (2023-07-10T04:30:23Z)
- Approximate, Adapt, Anonymize (3A): a Framework for Privacy Preserving Training Data Release for Machine Learning [3.29354893777827]
We introduce a data release framework, 3A (Approximate, Adapt, Anonymize), to maximize data utility for machine learning.
We present experimental evidence showing minimal discrepancy between performance metrics of models trained on real versus privatized datasets.
arXiv Detail & Related papers (2023-07-04T18:37:11Z)
- Striving for data-model efficiency: Identifying data externalities on group performance [75.17591306911015]
Building trustworthy, effective, and responsible machine learning systems hinges on understanding how differences in training data and modeling decisions interact to impact predictive performance.
We focus on a particular type of data-model inefficiency, in which adding training data from some sources can actually lower performance evaluated on key sub-groups of the population.
Our results indicate that data-efficiency is a key component of both accurate and trustworthy machine learning.
arXiv Detail & Related papers (2022-11-11T16:48:27Z)
- Privacy-Preserving Machine Learning for Collaborative Data Sharing via Auto-encoder Latent Space Embeddings [57.45332961252628]
Privacy-preserving machine learning in data-sharing processes is an increasingly critical task.
This paper presents an innovative framework that uses Representation Learning via autoencoders to generate privacy-preserving embedded data.
arXiv Detail & Related papers (2022-11-10T17:36:58Z)
- Mixed Differential Privacy in Computer Vision [133.68363478737058]
AdaMix is an adaptive differentially private algorithm for training deep neural network classifiers using both private and public image data.
A few-shot or even zero-shot learning baseline that ignores private data can outperform fine-tuning on a large private dataset.
arXiv Detail & Related papers (2022-03-22T06:15:43Z)
- On Deep Learning with Label Differential Privacy [54.45348348861426]
We study the multi-class classification setting where the labels are considered sensitive and ought to be protected.
We propose a new algorithm for training deep neural networks with label differential privacy, and run evaluations on several datasets (a baseline sketch follows this entry).
arXiv Detail & Related papers (2021-02-11T15:09:06Z)
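The abstract above does not detail the proposed algorithm, but a standard baseline for label differential privacy, k-ary randomized response applied to the labels only, is easy to sketch. The code below is that baseline, not the paper's method.

```python
import numpy as np

def rr_label(y, num_classes, epsilon, rng=None):
    """k-ary randomized response on a label: keep the true class with
    probability e^eps / (e^eps + k - 1), otherwise output a uniformly
    random other class. Only labels are protected; features are not."""
    rng = np.random.default_rng() if rng is None else rng
    keep = np.exp(epsilon) / (np.exp(epsilon) + num_classes - 1)
    if rng.random() < keep:
        return y
    others = [c for c in range(num_classes) if c != y]
    return int(rng.choice(others))

# Privatize labels once, then train any model on the noisy labels.
rng = np.random.default_rng(2)
labels = rng.integers(0, 10, size=5)
print([rr_label(int(y), num_classes=10, epsilon=2.0, rng=rng) for y in labels])
```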
- Graph-Homomorphic Perturbations for Private Decentralized Learning [64.26238893241322]
The local exchange of estimates allows private data to be inferred.
Perturbations chosen independently at every agent result in a significant performance loss.
We propose an alternative scheme, which constructs perturbations according to a particular nullspace condition, allowing them to be invisible to the network centroid (a toy sketch of this cancellation idea follows this entry).
arXiv Detail & Related papers (2020-10-23T10:35:35Z)
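The cancellation idea in the entry above can be illustrated with a toy construction (not the paper's graph-homomorphic scheme): draw noise per agent, then subtract the mean so the perturbations sum to zero and leave the network-average estimate untouched.

```python
import numpy as np

def zero_sum_perturbations(num_agents, dim, scale=1.0, rng=None):
    """Toy 'invisible to the centroid' noise: per-agent perturbations
    projected onto the nullspace of the averaging operator, i.e. they
    sum to zero across agents, so the network average is unchanged."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.laplace(scale=scale, size=(num_agents, dim))
    return noise - noise.mean(axis=0, keepdims=True)

# Each agent adds its row to the estimate it shares with its neighbors.
perturbs = zero_sum_perturbations(num_agents=5, dim=3)
print(np.allclose(perturbs.sum(axis=0), 0.0))  # True: centroid unaffected
```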
- A Critical Overview of Privacy-Preserving Approaches for Collaborative Forecasting [0.0]
Cooperation between different data owners may lead to an improvement in forecast quality.
Due to competitive business factors and personal data protection concerns, these data owners might be unwilling to share their data.
This paper analyses the state-of-the-art and unveils several shortcomings of existing methods in guaranteeing data privacy.
arXiv Detail & Related papers (2020-04-20T20:21:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.