I Prefer not to Say: Protecting User Consent in Models with Optional
Personal Data
- URL: http://arxiv.org/abs/2210.13954v5
- Date: Fri, 2 Feb 2024 13:56:21 GMT
- Title: I Prefer not to Say: Protecting User Consent in Models with Optional
Personal Data
- Authors: Tobias Leemann, Martin Pawelczyk, Christian Thomas Eberle, Gjergji
Kasneci
- Abstract summary: We show that the decision not to share data can be considered information in itself that should be protected to respect users' privacy.
We formalize protection requirements for models which only use the information for which active user consent was obtained.
- Score: 20.238432971718524
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We examine machine learning models in a setup where individuals have the
choice to share optional personal information with a decision-making system, as
seen in modern insurance pricing models. Some users consent to their data being
used whereas others object and keep their data undisclosed. In this work, we
show that the decision not to share data can be considered information in
itself that should be protected to respect users' privacy. This observation
raises the overlooked problem of how to ensure that users who protect their
personal data do not suffer any disadvantages as a result. To address this
problem, we formalize protection requirements for models which only use the
information for which active user consent was obtained. This excludes implicit
information contained in the decision to share data or not. We offer the first
solution to this problem by proposing the notion of Protected User Consent
(PUC), which we prove to be loss-optimal under our protection requirement. We
observe that privacy and performance are not fundamentally at odds with each
other and that it is possible for a decision maker to benefit from additional
data while respecting users' consent. To learn PUC-compliant models, we devise
a model-agnostic data augmentation strategy with finite sample convergence
guarantees. Finally, we analyze the implications of PUC on challenging real
datasets, tasks, and models.
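To make the idea of consent protection more concrete, below is a minimal, hypothetical sketch of a data augmentation in this spirit; it is not necessarily the PUC construction devised in the paper. The function name `augment_for_consent_protection` and the optional columns `income` and `health_score` are illustrative assumptions: each record whose optional features were disclosed is duplicated with those features masked, so that the "withheld" pattern co-occurs with the full label distribution and a downstream model cannot read extra signal into the mere decision not to share.
```python
import numpy as np
import pandas as pd

def augment_for_consent_protection(df: pd.DataFrame, optional_cols: list[str]) -> pd.DataFrame:
    """Hypothetical sketch (not necessarily the paper's PUC augmentation):
    duplicate every consenting record with its optional features masked,
    so that non-disclosure itself carries no information about the label."""
    disclosed = df.dropna(subset=optional_cols)    # records with active consent
    masked = disclosed.copy()
    masked[optional_cols] = np.nan                 # treat them as if they had withheld
    return pd.concat([df, masked], ignore_index=True)

# Illustrative usage with assumed optional fields:
# train_aug = augment_for_consent_protection(train, ["income", "health_score"])
```
A model trained on such an augmented table sees withheld records drawn from the whole population, which is one simple way to avoid disadvantaging users who protect their data; the paper's actual mechanism additionally comes with loss-optimality and finite-sample convergence guarantees.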
Related papers
- Mind the Privacy Unit! User-Level Differential Privacy for Language Model Fine-Tuning [62.224804688233]
Differential privacy (DP) offers a promising solution by ensuring models are 'almost indistinguishable' with or without any particular privacy unit.
We study user-level DP, motivated by applications where it is necessary to ensure uniform privacy protection across users.
arXiv Detail & Related papers (2024-06-20T13:54:32Z)
- $\alpha$-Mutual Information: A Tunable Privacy Measure for Privacy Protection in Data Sharing [4.475091558538915]
This paper adopts Arimoto's $\alpha$-Mutual Information as a tunable privacy measure.
We formulate a general distortion-based mechanism that manipulates the original data to offer privacy protection.
arXiv Detail & Related papers (2023-10-27T16:26:14Z)
- Protecting User Privacy in Online Settings via Supervised Learning [69.38374877559423]
We design an intelligent approach to online privacy protection that leverages supervised learning.
By detecting and blocking data collection that might infringe on a user's privacy, we can restore a degree of digital privacy to the user.
arXiv Detail & Related papers (2023-04-06T05:20:16Z)
- Membership Inference Attacks against Synthetic Data through Overfitting Detection [84.02632160692995]
We argue for a realistic MIA setting that assumes the attacker has some knowledge of the underlying data distribution.
We propose DOMIAS, a density-based MIA model that aims to infer membership by targeting local overfitting of the generative model (a minimal density-ratio sketch of this idea appears after this list).
arXiv Detail & Related papers (2023-02-24T11:27:39Z)
- Certified Data Removal in Sum-Product Networks [78.27542864367821]
Deleting the collected data is often insufficient to guarantee data privacy.
UnlearnSPN is an algorithm that removes the influence of single data points from a trained sum-product network.
arXiv Detail & Related papers (2022-10-04T08:22:37Z)
- Group privacy for personalized federated learning [4.30484058393522]
Federated learning is a type of collaborative machine learning, where participating clients process their data locally, sharing only updates to the collaborative model.
We propose a method to provide group privacy guarantees exploiting some key properties of $d$-privacy.
arXiv Detail & Related papers (2022-06-07T15:43:45Z)
- Personalized PATE: Differential Privacy for Machine Learning with Individual Privacy Guarantees [1.2691047660244335]
We propose three novel methods to support training an ML model with different personalized privacy guarantees within the training data.
Our experiments show that our personalized privacy methods yield higher accuracy models than the non-personalized baseline.
arXiv Detail & Related papers (2022-02-21T20:16:27Z)
- Causally Constrained Data Synthesis for Private Data Release [36.80484740314504]
Using synthetic data that reflects certain statistical properties of the original data can preserve the privacy of that data.
Prior works utilize differentially private data release mechanisms to provide formal privacy guarantees.
We propose incorporating causal information into the training process to favorably modify the aforementioned trade-off.
arXiv Detail & Related papers (2021-05-27T13:46:57Z)
- Preventing Unauthorized Use of Proprietary Data: Poisoning for Secure Dataset Release [52.504589728136615]
We develop a data poisoning method by which publicly released data can be minimally modified to prevent others from training models on it.
We demonstrate the success of our approach on ImageNet classification and on facial recognition.
arXiv Detail & Related papers (2021-02-16T19:12:34Z)
- PCAL: A Privacy-preserving Intelligent Credit Risk Modeling Framework Based on Adversarial Learning [111.19576084222345]
This paper proposes a framework of Privacy-preserving Credit risk modeling based on Adversarial Learning (PCAL).
PCAL aims to mask the private information inside the original dataset, while maintaining the important utility information for the target prediction task performance.
Results indicate that PCAL can learn an effective, privacy-free representation from user data, providing a solid foundation towards privacy-preserving machine learning for credit risk analysis.
arXiv Detail & Related papers (2020-10-06T07:04:59Z)
- Practical Privacy Preserving POI Recommendation [26.096197310800328]
We propose a novel Privacy preserving POI Recommendation (PriRec) framework.
PriRec keeps users' private raw data and models in users' own hands, and protects user privacy to a large extent.
We apply PriRec in real-world datasets, and comprehensive experiments demonstrate that, compared with FM, PriRec achieves comparable or even better recommendation accuracy.
arXiv Detail & Related papers (2020-03-05T06:06:40Z)
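For the density-based membership inference idea referenced in the DOMIAS entry above, the following is a minimal illustration of a density-ratio membership score, not the exact estimator from that paper: a query point is scored by how much denser the synthetic data is around it than a reference sample of the population, since locally overfit regions of the generative model betray training members. The kernel-density estimates and the helper name `density_ratio_membership_score` are illustrative choices.
```python
from sklearn.neighbors import KernelDensity

def density_ratio_membership_score(X_synthetic, X_reference, X_query, bandwidth=0.5):
    """Illustrative density-ratio membership score (a simplification of the
    DOMIAS idea, not its exact estimator): high scores indicate that the
    generative model is locally overfit around the query point."""
    kde_syn = KernelDensity(bandwidth=bandwidth).fit(X_synthetic)  # density of generated data
    kde_ref = KernelDensity(bandwidth=bandwidth).fit(X_reference)  # density of the population
    # Log ratio log p_syn(x) - log p_ref(x); > 0 means the synthetic data is
    # denser at x than the reference distribution.
    return kde_syn.score_samples(X_query) - kde_ref.score_samples(X_query)

# Usage sketch: flag candidates whose score exceeds a chosen threshold as likely members.
# scores = density_ratio_membership_score(X_syn, X_ref, X_candidates)
```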
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences arising from its use.