Towards a Data Privacy-Predictive Performance Trade-off
- URL: http://arxiv.org/abs/2201.05226v1
- Date: Thu, 13 Jan 2022 21:48:51 GMT
- Title: Towards a Data Privacy-Predictive Performance Trade-off
- Authors: Tânia Carvalho, Nuno Moniz, Pedro Faria and Luís Antunes
- Abstract summary: We evaluate the existence of a trade-off between data privacy and predictive performance in classification tasks.
Unlike previous literature, we confirm that the higher the level of privacy, the higher the impact on predictive performance.
- Score: 2.580765958706854
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Machine learning is increasingly used in the most diverse applications and domains, whether in healthcare to predict pathologies or in the financial sector to detect fraud. One of the linchpins of efficiency and accuracy in machine learning is data utility. However, when data contains personal information, full access may be restricted by laws and regulations aiming to protect individuals' privacy. Therefore, data owners must ensure that any data they share guarantees such privacy. Removing or transforming private information (de-identification) is among the most common techniques. Intuitively, one can anticipate that reducing detail or distorting information would result in losses in model predictive performance. However, previous work on classification tasks using de-identified data generally demonstrates that predictive performance can be preserved in specific applications. In this paper, we evaluate the existence of a trade-off between data privacy and predictive performance in classification tasks. We leverage a large set of privacy-preserving techniques and learning algorithms to assess re-identification ability and the impact of transformed variants on predictive performance. Unlike previous literature, we confirm that the higher the level of privacy (the lower the re-identification risk), the higher the impact on predictive performance, pointing towards clear evidence of a trade-off.
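To make the experiment concrete, below is a minimal, hypothetical sketch of the kind of assessment the abstract describes: coarsen quasi-identifiers at increasing levels of generalization, score re-identification risk from equivalence-class sizes, and measure classifier accuracy on each transformed variant. The synthetic data, the generalization scheme, and the risk score are illustrative assumptions, not the paper's actual pipeline.

```python
# Minimal sketch (not the paper's pipeline): generalize quasi-identifiers,
# estimate re-identification risk from equivalence-class sizes, and compare
# classifier accuracy on the original vs. de-identified variants.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "age": rng.integers(18, 90, 1000),           # quasi-identifier
    "zip": rng.integers(10000, 10100, 1000),     # quasi-identifier
    "income": rng.normal(50_000, 15_000, 1000),  # ordinary feature
})
df["label"] = (df["income"] + 500 * (df["age"] > 40) > 55_000).astype(int)

def generalize(d: pd.DataFrame, age_bin: int, zip_digits: int) -> pd.DataFrame:
    """Coarsen quasi-identifiers: bin ages, zero out trailing ZIP digits."""
    out = d.copy()
    out["age"] = (out["age"] // age_bin) * age_bin
    out["zip"] = (out["zip"] // 10**zip_digits) * 10**zip_digits
    return out

def reident_risk(d: pd.DataFrame, qis=("age", "zip")) -> float:
    """Mean 1/|equivalence class| over records (k-anonymity-style risk)."""
    sizes = d.groupby(list(qis))["label"].transform("size")
    return float((1.0 / sizes).mean())

# Increasing generalization should lower risk; the question is what it costs.
for age_bin, zip_digits in [(1, 0), (5, 1), (10, 2), (20, 3)]:
    var = generalize(df, age_bin, zip_digits)
    X, y = var.drop(columns="label"), var["label"]
    Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
    model = RandomForestClassifier(random_state=0).fit(Xtr, ytr)
    acc = accuracy_score(yte, model.predict(Xte))
    print(f"risk={reident_risk(var):.3f}  accuracy={acc:.3f}")
```

If the paper's thesis holds, the printed risk and accuracy should fall together as the quasi-identifiers are coarsened.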
Related papers
- Collection, usage and privacy of mobility data in the enterprise and public administrations [55.2480439325792]
Security measures such as anonymization are needed to protect individuals' privacy.
Within our study, we conducted expert interviews to gain insights into practices in the field.
We survey privacy-enhancing methods in use, which generally do not comply with state-of-the-art standards of differential privacy.
arXiv Detail & Related papers (2024-07-04T08:29:27Z) - Certificates of Differential Privacy and Unlearning for Gradient-Based Training [35.18220120875603]
We propose a new verification-centric approach to privacy and unlearning guarantees.
We validate the effectiveness of our approach on tasks from financial services, medical imaging, and natural language processing.
arXiv Detail & Related papers (2024-06-19T10:47:00Z) - A Summary of Privacy-Preserving Data Publishing in the Local Setting [0.6749750044497732]
Statistical Disclosure Control aims to minimize the risk of exposing confidential information by de-identifying it.
We outline the current privacy-preserving techniques employed in microdata de-identification, delve into privacy measures tailored for various disclosure scenarios, and assess metrics for information loss and predictive performance.
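As background for the local setting (our illustration, not necessarily the survey's example), randomized response is the canonical local mechanism: each respondent perturbs their own binary answer before it ever reaches the curator, yielding eps-local differential privacy while still allowing unbiased aggregate estimates.

```python
# Randomized response sketch (illustrative background, not from the survey):
# each respondent reports truthfully with probability e^eps / (e^eps + 1),
# which satisfies eps-local DP; the curator de-biases the aggregate mean.
import numpy as np

def randomized_response(bits: np.ndarray, eps: float, rng) -> np.ndarray:
    p_truth = np.exp(eps) / (np.exp(eps) + 1)
    flip = rng.random(bits.shape) > p_truth
    return np.where(flip, 1 - bits, bits)

def debias_mean(reports: np.ndarray, eps: float) -> float:
    p = np.exp(eps) / (np.exp(eps) + 1)
    # E[report] = (2p - 1) * mu + (1 - p)  =>  solve for the true mean mu
    return float((reports.mean() - (1 - p)) / (2 * p - 1))

rng = np.random.default_rng(1)
truth = (rng.random(100_000) < 0.3).astype(int)  # 30% hold the attribute
reports = randomized_response(truth, eps=1.0, rng=rng)
print(debias_mean(reports, eps=1.0))  # close to 0.3 despite per-record noise
```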
arXiv Detail & Related papers (2023-12-19T04:23:23Z) - $\alpha$-Mutual Information: A Tunable Privacy Measure for Privacy Protection in Data Sharing [4.475091558538915]
This paper adopts Arimoto's $\alpha$-Mutual Information as a tunable privacy measure.
We formulate a general distortion-based mechanism that manipulates the original data to offer privacy protection.
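For reference, a standard formulation of Arimoto's $\alpha$-mutual information (notation may differ from the paper's):

```latex
% Renyi entropy and Arimoto conditional entropy of order alpha
H_\alpha(X) = \frac{1}{1-\alpha} \log \sum_x P_X(x)^\alpha, \qquad
H_\alpha^{\mathrm{A}}(X \mid Y) = \frac{\alpha}{1-\alpha}
  \log \sum_y \Big( \sum_x P_{XY}(x,y)^\alpha \Big)^{1/\alpha}
% Arimoto alpha-mutual information
I_\alpha^{\mathrm{A}}(X;Y) = H_\alpha(X) - H_\alpha^{\mathrm{A}}(X \mid Y)
```

Tuning $\alpha$ moves the measure between Shannon mutual information ($\alpha \to 1$) and maximal leakage ($\alpha \to \infty$), which is what makes it usable as an adjustable privacy knob.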
arXiv Detail & Related papers (2023-10-27T16:26:14Z) - Tight Auditing of Differentially Private Machine Learning [77.38590306275877]
For private machine learning, existing auditing mechanisms give tight privacy estimates only under implausible worst-case assumptions.
We design an improved auditing scheme that yields tight privacy estimates for natural (not adversarially crafted) datasets.
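For background, audits of this kind typically rest on the hypothesis-testing view of differential privacy; whether this paper uses exactly this bound is our assumption, but the standard relation for any membership test against an $(\varepsilon, \delta)$-DP mechanism is:

```latex
% Attack TPR/FPR against an (eps, delta)-DP mechanism are constrained by
\mathrm{TPR} \le e^{\varepsilon} \, \mathrm{FPR} + \delta
\quad \Longrightarrow \quad
\varepsilon \ge \log \frac{\mathrm{TPR} - \delta}{\mathrm{FPR}}
```

so observed attack performance translates directly into an empirical lower bound on $\varepsilon$.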
arXiv Detail & Related papers (2023-02-15T21:40:33Z) - Position: Considerations for Differentially Private Learning with Large-Scale Public Pretraining [75.25943383604266]
We question whether the use of large Web-scraped datasets should be viewed as differential-privacy-preserving.
We caution that publicizing these models pretrained on Web data as "private" could lead to harm and erode the public's trust in differential privacy as a meaningful definition of privacy.
We conclude by discussing potential paths forward for the field of private learning, as public pretraining becomes more popular and powerful.
arXiv Detail & Related papers (2022-12-13T10:41:12Z) - A General Framework for Auditing Differentially Private Machine Learning [27.99806936918949]
We present a framework to statistically audit the privacy guarantee conferred by a differentially private machine learner in practice.
Our work develops a general methodology to empirically evaluate the privacy of differentially private machine learning implementations.
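A minimal sketch of how such an empirical audit can be organized, assuming a membership-inference distinguisher and Clopper-Pearson confidence intervals; the counts are hypothetical and this is not the paper's exact statistical machinery:

```python
# Sketch of a statistical DP audit (illustrative, not the paper's procedure):
# bound the attack's TPR/FPR with Clopper-Pearson intervals, then convert to
# an empirical lower bound on epsilon via TPR <= e^eps * FPR + delta.
import numpy as np
from scipy.stats import beta

def clopper_pearson(successes: int, trials: int, alpha: float = 0.05):
    lo = beta.ppf(alpha / 2, successes, trials - successes + 1) if successes else 0.0
    hi = beta.ppf(1 - alpha / 2, successes + 1, trials - successes) \
        if successes < trials else 1.0
    return lo, hi

def empirical_eps_lower_bound(tp: int, fp: int, trials: int,
                              delta: float = 1e-5) -> float:
    tpr_lo, _ = clopper_pearson(tp, trials)  # conservatively low TPR
    _, fpr_hi = clopper_pearson(fp, trials)  # conservatively high FPR
    if fpr_hi <= 0.0 or tpr_lo <= delta:
        return 0.0  # attack too weak to certify any leakage
    return float(np.log((tpr_lo - delta) / fpr_hi))

# Hypothetical outcome: attack flags the target in 900/1000 member runs
# and 100/1000 non-member runs.
print(empirical_eps_lower_bound(tp=900, fp=100, trials=1000))
```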
arXiv Detail & Related papers (2022-10-16T21:34:18Z) - No Free Lunch in "Privacy for Free: How does Dataset Condensation Help Privacy" [75.98836424725437]
New methods designed to preserve data privacy require careful scrutiny.
Failure to preserve privacy is hard to detect, and yet can lead to catastrophic results when a system implementing a "privacy-preserving" method is attacked.
arXiv Detail & Related papers (2022-09-29T17:50:23Z) - When Fairness Meets Privacy: Fair Classification with Semi-Private Sensitive Attributes [18.221858247218726]
We study a novel and practical problem of fair classification in a semi-private setting.
Most of the sensitive attributes are private, and only a small number of clean ones are available.
We propose a novel framework FairSP that can achieve Fair prediction under the Semi-Private setting.
arXiv Detail & Related papers (2022-07-18T01:10:25Z) - SF-PATE: Scalable, Fair, and Private Aggregation of Teacher Ensembles [50.90773979394264]
This paper studies a model that protects the privacy of individuals' sensitive information while also allowing it to learn non-discriminatory predictors.
A key characteristic of the proposed model is to enable the adoption of off-the-shelf, non-private fair models to create a privacy-preserving and fair model.
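SF-PATE builds on PATE, whose core step is noisy aggregation of teacher votes; a minimal sketch of that underlying mechanism follows (the scalability and fairness additions that distinguish SF-PATE are not shown, and the noise scale here is an arbitrary choice):

```python
# PATE-style noisy aggregation sketch (the base mechanism SF-PATE extends).
import numpy as np

def noisy_aggregate(teacher_votes: np.ndarray, n_classes: int,
                    gamma: float, rng) -> int:
    """Per-query label: add Laplace(1/gamma) noise to each class's vote
    count and return the noisy plurality winner."""
    counts = np.bincount(teacher_votes, minlength=n_classes).astype(float)
    counts += rng.laplace(scale=1.0 / gamma, size=n_classes)
    return int(np.argmax(counts))

rng = np.random.default_rng(0)
votes = np.array([0, 0, 1, 0, 2, 0, 1, 0, 0, 0])  # 10 teachers, 3 classes
print(noisy_aggregate(votes, n_classes=3, gamma=0.5, rng=rng))
```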
arXiv Detail & Related papers (2022-04-11T14:42:54Z) - Differentially Private and Fair Deep Learning: A Lagrangian Dual Approach [54.32266555843765]
This paper studies a model that protects the privacy of individuals' sensitive information while also allowing it to learn non-discriminatory predictors.
The method relies on the notion of differential privacy and the use of Lagrangian duality to design neural networks that can accommodate fairness constraints.
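A minimal sketch of the generic primal-dual pattern the abstract describes, assuming a logistic model and a demographic-parity gap as the fairness constraint; the paper's actual networks, privacy noise, and fairness measures differ:

```python
# Lagrangian-dual training sketch (illustrative): descend on the task loss
# plus lambda * constraint violation, and ascend on the multiplier lambda.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(X, y, group, steps=2000, lr=0.1, dual_lr=0.05, tol=0.02):
    w, lam = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        p = sigmoid(X @ w)
        grad_task = X.T @ (p - y) / len(y)  # logistic-loss gradient
        gap = p[group == 1].mean() - p[group == 0].mean()  # parity gap
        s = p * (1 - p)  # sigmoid derivative, used via the chain rule
        dgap = (X[group == 1].T @ s[group == 1] / (group == 1).sum()
                - X[group == 0].T @ s[group == 0] / (group == 0).sum())
        w -= lr * (grad_task + lam * np.sign(gap) * dgap)  # primal descent
        lam = max(0.0, lam + dual_lr * (abs(gap) - tol))   # dual ascent
    return w, lam

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
g = (rng.random(500) < 0.5).astype(int)
y = ((X[:, 0] + 0.5 * g + rng.normal(scale=0.5, size=500)) > 0).astype(int)
w, lam = train(X, y, g)
print(w, lam)
```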
arXiv Detail & Related papers (2020-09-26T10:50:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.