Attribute Privacy: Framework and Mechanisms
- URL: http://arxiv.org/abs/2009.04013v2
- Date: Tue, 11 May 2021 23:23:04 GMT
- Title: Attribute Privacy: Framework and Mechanisms
- Authors: Wanrong Zhang, Olga Ohrimenko, Rachel Cummings
- Abstract summary: We initiate the study of attribute privacy, where a data owner is concerned about revealing sensitive properties of a whole dataset during analysis.
We propose definitions to capture attribute privacy in two relevant cases where global attributes may need to be protected.
We provide two efficient mechanisms and one inefficient mechanism that satisfy attribute privacy for these settings.
- Score: 26.233612860653025
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Ensuring the privacy of training data is a growing concern since many machine
learning models are trained on confidential and potentially sensitive data.
Much attention has been devoted to methods for protecting individual privacy
during analyses of large datasets. However, in many settings, global properties
of the dataset may also be sensitive (e.g., mortality rate in a hospital rather
than presence of a particular patient in the dataset). In this work, we depart
from individual privacy to initiate the study of attribute privacy, where a
data owner is concerned about revealing sensitive properties of a whole dataset
during analysis. We propose definitions to capture \emph{attribute privacy} in
two relevant cases where global attributes may need to be protected: (1)
properties of a specific dataset and (2) parameters of the underlying
distribution from which the dataset is sampled. We also provide two efficient
mechanisms and one inefficient mechanism that satisfy attribute privacy for
these settings. We base our results on a novel use of the Pufferfish framework
to account for correlations across attributes in the data, thus addressing "the
challenging problem of developing Pufferfish instantiations and algorithms for
general aggregate secrets" that was left open by \cite{kifer2014pufferfish}.
Related papers
- Enhancing User-Centric Privacy Protection: An Interactive Framework through Diffusion Models and Machine Unlearning [54.30994558765057]
The study presents a comprehensive privacy protection framework that safeguards image data privacy during both data sharing and model publication.
We propose an interactive image privacy protection framework that utilizes generative machine learning models to modify image information at the attribute level.
Within this framework, we instantiate two modules: a differential privacy diffusion model for protecting attribute information in images and a feature unlearning algorithm for efficient updates of the trained model on the revised image dataset.
arXiv Detail & Related papers (2024-09-05T07:55:55Z) - MaSS: Multi-attribute Selective Suppression for Utility-preserving Data Transformation from an Information-theoretic Perspective [10.009178591853058]
We propose a formal information-theoretic definition for this utility-preserving privacy protection problem.
We design a data-driven learnable data transformation framework that is capable of suppressing sensitive attributes from target datasets.
Results demonstrate the effectiveness and generalizability of our method under various configurations.
arXiv Detail & Related papers (2024-05-23T18:35:46Z) - Synergizing Privacy and Utility in Data Analytics Through Advanced Information Theorization [2.28438857884398]
We introduce three sophisticated algorithms: a Noise-Infusion Technique tailored for high-dimensional image data, a Variational Autoencoder (VAE) for robust feature extraction, and an Expectation Maximization (EM) approach optimized for structured data privacy.
Our methods significantly reduce mutual information between sensitive attributes and transformed data, thereby enhancing privacy.
The research contributes to the field by providing a flexible and effective strategy for deploying privacy-preserving algorithms across various data types.
arXiv Detail & Related papers (2024-04-24T22:58:42Z) - Privacy-Optimized Randomized Response for Sharing Multi-Attribute Data [1.1510009152620668]
We propose a privacy-optimized randomized response that guarantees the strongest privacy in sharing multi-attribute data.
We also present an efficient algorithm for constructing a near-optimal attribute mechanism.
Our methods provide significantly stronger privacy guarantees for the entire dataset than the existing method.
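The entry above does not spell out its mechanism, but classic (Warner) randomized response is the standard building block for privatizing a single binary attribute under local differential privacy. The sketch below, with hypothetical function names, shows the per-bit report and the debiasing step used to recover the population mean.

```python
import math
import random

def randomized_response(bit: int, epsilon: float) -> int:
    # Warner's randomized response: report the true bit with probability
    # e^eps / (e^eps + 1), otherwise flip it; satisfies eps-local DP.
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return bit if random.random() < p_truth else 1 - bit

def debias_mean(reported_mean: float, epsilon: float) -> float:
    # Invert the known flipping probability to obtain an unbiased estimate
    # of the true mean of the underlying bits.
    p = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return (reported_mean + p - 1.0) / (2.0 * p - 1.0)
```

The paper's contribution is choosing the flipping distribution jointly across multiple attributes so that the whole record, not each attribute in isolation, gets the strongest guarantee; the single-bit version here only conveys the basic idea.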
arXiv Detail & Related papers (2024-02-12T11:34:42Z) - PrivacyMind: Large Language Models Can Be Contextual Privacy Protection Learners [81.571305826793]
We introduce Contextual Privacy Protection Language Models (PrivacyMind)
Our work offers a theoretical analysis for model design and benchmarks various techniques.
In particular, instruction tuning with both positive and negative examples stands out as a promising method.
arXiv Detail & Related papers (2023-10-03T22:37:01Z) - How Do Input Attributes Impact the Privacy Loss in Differential Privacy? [55.492422758737575]
We study the connection between the per-subject norm in DP neural networks and individual privacy loss.
We introduce a novel metric termed the Privacy Loss-Input Susceptibility (PLIS) which allows one to apportion the subject's privacy loss to their input attributes.
arXiv Detail & Related papers (2022-11-18T11:39:03Z) - Algorithms with More Granular Differential Privacy Guarantees [65.3684804101664]
We consider partial differential privacy (DP), which allows quantifying the privacy guarantee on a per-attribute basis.
In this work, we study several basic data analysis and learning tasks, and design algorithms whose per-attribute privacy parameter is smaller than the best possible privacy parameter for the entire record of a person.
arXiv Detail & Related papers (2022-09-08T22:43:50Z) - DP2-Pub: Differentially Private High-Dimensional Data Publication with
Invariant Post Randomization [58.155151571362914]
We propose a differentially private high-dimensional data publication mechanism (DP2-Pub) that runs in two phases.
Splitting attributes into several low-dimensional clusters with high intra-cluster cohesion and low inter-cluster coupling helps obtain a reasonable privacy budget.
We also extend our DP2-Pub mechanism to the scenario with a semi-honest server which satisfies local differential privacy.
arXiv Detail & Related papers (2022-08-24T17:52:43Z) - Protecting Global Properties of Datasets with Distribution Privacy
Mechanisms [8.19841678851784]
We show how a distribution privacy framework can be applied to formalize such data confidentiality.
We then empirically evaluate the privacy-utility tradeoffs of these mechanisms and apply them against a practical property inference attack.
arXiv Detail & Related papers (2022-07-18T03:54:38Z) - Selecting the suitable resampling strategy for imbalanced data
classification regarding dataset properties [62.997667081978825]
In many application domains such as medicine, information retrieval, cybersecurity, social media, etc., datasets used for inducing classification models often have an unequal distribution of the instances of each class.
This situation, known as imbalanced data classification, causes low predictive performance for the minority class examples.
Oversampling and undersampling techniques are well-known strategies to deal with this problem by balancing the number of examples of each class.
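A minimal sketch of the oversampling strategy mentioned above, using plain random duplication of minority-class examples (real toolkits also offer synthetic interpolation, e.g. SMOTE, and the complementary undersampling of the majority class):

```python
import random
from collections import Counter

def random_oversample(X, y):
    # Duplicate randomly chosen examples of each smaller class until every
    # class matches the size of the largest class (plain random
    # oversampling; undersampling would instead drop majority examples).
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        idx = [i for i, lab in enumerate(y) if lab == label]
        for _ in range(target - n):
            j = random.choice(idx)
            X_out.append(X[j])
            y_out.append(y[j])
    return X_out, y_out
```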
arXiv Detail & Related papers (2021-12-15T18:56:39Z) - Privacy-Preserving Public Release of Datasets for Support Vector Machine
Classification [14.095523601311374]
We consider the problem of publicly releasing a dataset for support vector machine classification while not infringing on the privacy of data subjects.
The dataset is systematically obfuscated using an additive noise for privacy protection.
Conditions are established for ensuring that the classifier extracted from the original dataset and the obfuscated one are close to each other.
arXiv Detail & Related papers (2019-12-29T03:32:38Z)
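The additive-noise obfuscation described in the entry above can be sketched as follows. Gaussian noise with a single global scale is an assumption made here for illustration; the entry does not specify the noise distribution, and the paper's contribution is the conditions on that noise under which an SVM trained on the released data stays close to one trained on the original.

```python
import random

def obfuscate(dataset, scale):
    # Release a noisy copy of the dataset: add i.i.d. Gaussian noise with
    # standard deviation `scale` to every feature value. The Gaussian
    # choice and the single global scale are illustrative assumptions.
    return [[x + random.gauss(0.0, scale) for x in row] for row in dataset]
```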
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.