A Survey on Differential Privacy with Machine Learning and Future
Outlook
- URL: http://arxiv.org/abs/2211.10708v1
- Date: Sat, 19 Nov 2022 14:20:53 GMT
- Title: A Survey on Differential Privacy with Machine Learning and Future
Outlook
- Authors: Samah Baraheem and Zhongmei Yao
- Abstract summary: Differential privacy is used to protect machine learning models from attacks and vulnerabilities.
This survey paper presents different differentially private machine learning algorithms categorized into two main categories.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Nowadays, machine learning models and applications have become increasingly
pervasive. With this rapid increase in the development and employment of
machine learning models, a concern regarding privacy has arisen. Thus, there is
a legitimate need to protect data from leakage and from attacks. One of the
strongest and most prevalent privacy models that can be used to protect machine
learning models from attacks and vulnerabilities is differential privacy (DP).
DP is a strict and rigorous definition of privacy that guarantees an adversary
cannot reliably determine whether a specific participant is included in the
dataset. It works by injecting noise into the data, whether into the inputs,
the outputs, the ground-truth labels, the objective functions, or even the
gradients, in order to mitigate the privacy risk and protect the data. To this
end, this survey paper presents different differentially private machine
learning algorithms, categorized into two main categories (traditional machine
learning models vs. deep learning models).
Moreover, future research directions for differential privacy with machine
learning algorithms are outlined.
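The noise-injection idea above (noise added to inputs, outputs, labels, objective functions, or gradients) is easiest to see in the gradient case. The sketch below is a minimal DP-SGD-style update in Python/NumPy; it is illustrative only and not taken from the surveyed paper, and the function name, clipping norm, noise multiplier, and learning rate are assumptions chosen for demonstration.

    import numpy as np

    def dp_sgd_step(params, per_example_grads, clip_norm=1.0,
                    noise_multiplier=1.1, lr=0.1, rng=None):
        """One illustrative differentially private update: clip each example's
        gradient to bound its sensitivity, average, add Gaussian noise, step."""
        rng = rng or np.random.default_rng(0)
        clipped = [g * min(1.0, clip_norm / (np.linalg.norm(g) + 1e-12))
                   for g in per_example_grads]
        mean_grad = np.mean(clipped, axis=0)
        # Noise scale follows the common DP-SGD recipe: sigma * C / batch size.
        noise = rng.normal(0.0, noise_multiplier * clip_norm / len(clipped),
                           size=mean_grad.shape)
        return params - lr * (mean_grad + noise)

    # Example: four per-example gradients for a 3-dimensional parameter vector.
    params = np.zeros(3)
    grads = [np.array([0.5, -1.0, 2.0]), np.array([0.1, 0.2, 0.3]),
             np.array([-0.4, 0.0, 1.5]), np.array([1.0, 1.0, -1.0])]
    params = dp_sgd_step(params, grads)

The actual (epsilon, delta) guarantee depends on accounting for the noise across all training steps (e.g., with a moments accountant), which this sketch omits.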
Related papers
- Game-Theoretic Machine Unlearning: Mitigating Extra Privacy Leakage [12.737028324709609]
Recent legislation obligates organizations to remove requested data and its influence from a trained model.
We propose a game-theoretic machine unlearning algorithm that simulates the competitive relationship between unlearning performance and privacy protection.
arXiv Detail & Related papers (2024-11-06T13:47:04Z) - FT-PrivacyScore: Personalized Privacy Scoring Service for Machine Learning Participation [4.772368796656325]
In practice, controlled data access remains a mainstream method for protecting data privacy in many industrial and research environments.
We developed the demo prototype FT-PrivacyScore to show that it's possible to efficiently and quantitatively estimate the privacy risk of participating in a model fine-tuning task.
arXiv Detail & Related papers (2024-10-30T02:41:26Z) - Masked Differential Privacy [64.32494202656801]
We propose an effective approach called masked differential privacy (DP), which allows for controlling sensitive regions where differential privacy is applied.
Our method operates selectively on data and allows for defining non-sensitive spatio-temporal regions without DP application, or for combining differential privacy with other privacy techniques within data samples.
arXiv Detail & Related papers (2024-10-22T15:22:53Z) - Privacy Preserving Large Language Models: ChatGPT Case Study Based Vision and Framework [6.828884629694705]
This article proposes the conceptual model called PrivChatGPT, a privacy-generative model for LLMs.
PrivChatGPT consists of two main components, i.e., preserving user privacy during data curation/pre-processing (together with preserving private context), and a private training process for large-scale data.
arXiv Detail & Related papers (2023-10-19T06:55:13Z) - PrivacyMind: Large Language Models Can Be Contextual Privacy Protection Learners [81.571305826793]
We introduce Contextual Privacy Protection Language Models (PrivacyMind).
Our work offers a theoretical analysis for model design and benchmarks various techniques.
In particular, instruction tuning with both positive and negative examples stands out as a promising method.
arXiv Detail & Related papers (2023-10-03T22:37:01Z) - Privacy Side Channels in Machine Learning Systems [87.53240071195168]
We introduce privacy side channels: attacks that exploit system-level components to extract private information.
For example, we show that deduplicating training data before applying differentially-private training creates a side-channel that completely invalidates any provable privacy guarantees.
We further show that systems which block language models from regenerating training data can be exploited to exfiltrate private keys contained in the training set.
arXiv Detail & Related papers (2023-09-11T16:49:05Z) - Approximate, Adapt, Anonymize (3A): a Framework for Privacy Preserving
Training Data Release for Machine Learning [3.29354893777827]
We introduce a data release framework, 3A (Approximate, Adapt, Anonymize), to maximize data utility for machine learning.
We present experimental evidence showing minimal discrepancy between performance metrics of models trained on real versus privatized datasets.
arXiv Detail & Related papers (2023-07-04T18:37:11Z) - Position: Considerations for Differentially Private Learning with Large-Scale Public Pretraining [75.25943383604266]
We question whether the use of large Web-scraped datasets should be viewed as differential-privacy-preserving.
We caution that publicizing these models pretrained on Web data as "private" could lead to harm and erode the public's trust in differential privacy as a meaningful definition of privacy.
We conclude by discussing potential paths forward for the field of private learning, as public pretraining becomes more popular and powerful.
arXiv Detail & Related papers (2022-12-13T10:41:12Z) - Statistical Privacy Guarantees of Machine Learning Preprocessing
Techniques [1.198727138090351]
We adapt a privacy violation detection framework based on statistical methods to measure privacy levels of machine learning pipelines.
We apply the newly created framework to show that resampling techniques used when dealing with imbalanced datasets cause the resultant model to leak more privacy.
arXiv Detail & Related papers (2021-09-06T14:08:47Z) - More Than Privacy: Applying Differential Privacy in Key Areas of
Artificial Intelligence [62.3133247463974]
We show that differential privacy can do more than just privacy preservation in AI.
It can also be used to improve security, stabilize learning, build fair models, and impose composition in selected areas of AI.
arXiv Detail & Related papers (2020-08-05T03:07:36Z) - Differentially Private Deep Learning with Smooth Sensitivity [144.31324628007403]
We study privacy concerns through the lens of differential privacy.
In this framework, privacy guarantees are generally obtained by perturbing models in such a way that specifics of data used to train the model are made ambiguous.
One of the most important techniques used in previous works involves an ensemble of teacher models, which return information to a student based on a noisy voting procedure.
In this work, we propose a novel voting mechanism with smooth sensitivity, called Immutable Noisy ArgMax, which, under certain conditions, can tolerate very large random noise from the teachers without affecting the useful information transferred to the student (see the sketch below).
arXiv Detail & Related papers (2020-03-01T15:38:00Z)
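For the noisy voting procedure described in the previous entry, the following is a generic PATE-style sketch: Laplace noise is added to the teachers' per-class vote counts before taking the argmax. It is not the Immutable Noisy ArgMax mechanism from that paper; the function name, epsilon, and noise scale are illustrative assumptions.

    import numpy as np

    def noisy_argmax_vote(teacher_votes, num_classes, epsilon=1.0, rng=None):
        """Aggregate one query's teacher predictions by adding Laplace noise
        to the per-class vote counts and returning the noisy winner."""
        rng = rng or np.random.default_rng(0)
        counts = np.bincount(np.asarray(teacher_votes),
                             minlength=num_classes).astype(float)
        counts += rng.laplace(loc=0.0, scale=2.0 / epsilon, size=num_classes)
        return int(np.argmax(counts))

    # Example: ten teachers voting over three classes for one student query.
    votes = [0, 2, 2, 1, 2, 2, 0, 2, 1, 2]
    label_for_student = noisy_argmax_vote(votes, num_classes=3)

The student model then trains only on these noisy labels, so its privacy cost is governed by how many queries are answered and how much noise each answer carries.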