Utility-aware Privacy-preserving Data Releasing
- URL: http://arxiv.org/abs/2005.04369v1
- Date: Sat, 9 May 2020 05:32:46 GMT
- Title: Utility-aware Privacy-preserving Data Releasing
- Authors: Di Zhuang and J. Morris Chang
- Abstract summary: We propose a two-step perturbation-based privacy-preserving data releasing framework.
First, certain predefined privacy and utility problems are learned from the public domain data.
We then leverage the learned knowledge to precisely perturb the data owners' data into privatized data.
- Score: 7.462336024223669
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the big data era, more and more cloud-based data-driven applications are
developed that leverage individual data to provide valuable services (the
utilities). On the other hand, since the same set of individual data could be
used to infer certain sensitive information about the individual, such
applications open new channels for snooping on the individual's privacy. It is
therefore of great importance to develop techniques that enable data owners to
release privatized data that can still be utilized for a certain intended
purpose. Existing data releasing approaches, however, are either
privacy-emphasized (no consideration of utility) or utility-driven (no
guarantees on privacy). In this work, we propose a two-step perturbation-based
utility-aware privacy-preserving data releasing framework. First, certain
predefined privacy and utility problems are learned from public domain data
(background knowledge). Our approach then leverages the learned knowledge to
precisely perturb the data owners' data into privatized data that can be
successfully utilized for the intended purpose (learning to succeed), without
jeopardizing the predefined privacy (training to fail). Extensive experiments
on the Human Activity Recognition, Census Income and Bank Marketing datasets
demonstrate the effectiveness and practicality of our framework.
Related papers
- FT-PrivacyScore: Personalized Privacy Scoring Service for Machine Learning Participation [4.772368796656325]
In practice, controlled data access remains a mainstream method for protecting data privacy in many industrial and research environments.
We developed the demo prototype FT-PrivacyScore to show that it's possible to efficiently and quantitatively estimate the privacy risk of participating in a model fine-tuning task.
arXiv Detail & Related papers (2024-10-30T02:41:26Z) - Collection, usage and privacy of mobility data in the enterprise and public administrations [55.2480439325792]
Security measures such as anonymization are needed to protect individuals' privacy.
Within our study, we conducted expert interviews to gain insights into practices in the field.
We survey privacy-enhancing methods in use, which generally do not comply with state-of-the-art standards of differential privacy.
arXiv Detail & Related papers (2024-07-04T08:29:27Z) - A Summary of Privacy-Preserving Data Publishing in the Local Setting [0.6749750044497732]
Statistical Disclosure Control aims to minimize the risk of exposing confidential information by de-identifying it.
We outline the current privacy-preserving techniques employed in microdata de-identification, delve into privacy measures tailored for various disclosure scenarios, and assess metrics for information loss and predictive performance.
arXiv Detail & Related papers (2023-12-19T04:23:23Z) - PrivacyMind: Large Language Models Can Be Contextual Privacy Protection Learners [81.571305826793]
We introduce Contextual Privacy Protection Language Models (PrivacyMind)
Our work offers a theoretical analysis for model design and benchmarks various techniques.
In particular, instruction tuning with both positive and negative examples stands out as a promising method.
arXiv Detail & Related papers (2023-10-03T22:37:01Z) - A Unified View of Differentially Private Deep Generative Modeling [60.72161965018005]
Data with privacy concerns comes with stringent regulations that frequently prohibit data access and data sharing.
Overcoming these obstacles is key for technological progress in many real-world application scenarios that involve privacy sensitive data.
Differentially private (DP) data publishing provides a compelling solution, where only a sanitized form of the data is publicly released.
arXiv Detail & Related papers (2023-09-27T14:38:16Z) - Protecting User Privacy in Online Settings via Supervised Learning [69.38374877559423]
We design an intelligent approach to online privacy protection that leverages supervised learning.
By detecting and blocking data collection that might infringe on a user's privacy, we can restore a degree of digital privacy to the user.
arXiv Detail & Related papers (2023-04-06T05:20:16Z) - Position: Considerations for Differentially Private Learning with Large-Scale Public Pretraining [75.25943383604266]
We question whether the use of large Web-scraped datasets should be viewed as differential-privacy-preserving.
We caution that publicizing these models pretrained on Web data as "private" could lead to harm and erode the public's trust in differential privacy as a meaningful definition of privacy.
We conclude by discussing potential paths forward for the field of private learning, as public pretraining becomes more popular and powerful.
arXiv Detail & Related papers (2022-12-13T10:41:12Z) - Towards a Data Privacy-Predictive Performance Trade-off [2.580765958706854]
We evaluate the existence of a trade-off between data privacy and predictive performance in classification tasks.
Unlike previous literature, we confirm that the higher the level of privacy, the higher the impact on predictive performance.
arXiv Detail & Related papers (2022-01-13T21:48:51Z) - Adversarial representation learning for synthetic replacement of private
attributes [0.7619404259039281]
We propose a novel approach for data privatization, which involves two steps: in the first step, it removes the sensitive information, and in the second step, it replaces this information with an independent random sample.
Our method builds on adversarial representation learning which ensures strong privacy by training the model to fool an increasingly strong adversary.
arXiv Detail & Related papers (2020-06-14T22:07:19Z) - Beyond privacy regulations: an ethical approach to data usage in
transportation [64.86110095869176]
We describe how Federated Machine Learning can be applied to the transportation sector.
We see Federated Learning as a method that enables us to process privacy-sensitive data while respecting customers' privacy.
arXiv Detail & Related papers (2020-04-01T15:10:12Z) - Privacy-Preserving Boosting in the Local Setting [17.375582978294105]
In machine learning, boosting is one of the most popular methods for combining multiple base learners into a superior one.
In the big data era, the data held by individuals and entities, such as personal images, browsing history and census information, are more likely to contain sensitive information.
Local Differential Privacy is proposed as an effective privacy protection approach, which offers a strong guarantee to the data owners.
arXiv Detail & Related papers (2020-02-06T04:48:51Z)
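Several of the papers above rely on Local Differential Privacy, in which each data owner randomizes their own record before release. The classic mechanism for a single sensitive bit is randomized response; the sketch below (a textbook illustration, not tied to any specific paper listed here) shows both the per-owner randomization and how an aggregator debiases the noisy reports.

```python
import math
import random

def randomized_response(bit, epsilon):
    """Report the true bit with probability e^eps / (e^eps + 1),
    otherwise flip it. Satisfies eps-local differential privacy."""
    p = math.exp(epsilon) / (math.exp(epsilon) + 1)
    return bit if random.random() < p else 1 - bit

def debias(reports, epsilon):
    """Unbiased estimate of the true population mean from noisy reports:
    observed = t*(2p-1) + (1-p), solved for t."""
    p = math.exp(epsilon) / (math.exp(epsilon) + 1)
    observed = sum(reports) / len(reports)
    return (observed + p - 1) / (2 * p - 1)

random.seed(0)
truth = [1] * 300 + [0] * 700          # 30% of owners hold the sensitive bit
reports = [randomized_response(b, 1.0) for b in truth]
estimate = debias(reports, 1.0)
print(round(estimate, 2))              # close to 0.3
```

No individual report reveals the owner's true bit with certainty, yet the aggregate statistic remains recoverable, which is the "strong guarantee to the data owners" the boosting paper's blurb refers to.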
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.