Guided Diffusion Model for Sensor Data Obfuscation
- URL: http://arxiv.org/abs/2412.14499v1
- Date: Thu, 19 Dec 2024 03:47:12 GMT
- Title: Guided Diffusion Model for Sensor Data Obfuscation
- Authors: Xin Yang, Omid Ardakanian
- Abstract summary: PrivDiffuser is a novel data obfuscation technique based on a denoising diffusion model.
We show that PrivDiffuser yields a better privacy-utility trade-off than the state-of-the-art obfuscation model.
- Score: 4.91258288207688
- Abstract: Sensor data collected by Internet of Things (IoT) devices carries detailed information about individuals in their vicinity. Sharing this data with a semi-trusted service provider may compromise the individuals' privacy, as sensitive information can be extracted by powerful machine learning models. Data obfuscation empowered by generative models is a promising approach to generate synthetic sensor data such that the useful information contained in the original data is preserved and the sensitive information is obscured. This newly generated data will then be shared with the service provider instead of the original sensor data. In this work, we propose PrivDiffuser, a novel data obfuscation technique based on a denoising diffusion model that attains a superior trade-off between data utility and privacy through effective guidance techniques. Specifically, we extract latent representations that contain information about public and private attributes from sensor data to guide the diffusion model, and impose mutual information-based regularization when learning the latent representations to alleviate the entanglement of public and private attributes, thereby increasing the effectiveness of guidance. Evaluation on three real-world datasets containing different sensing modalities reveals that PrivDiffuser yields a better privacy-utility trade-off than the state-of-the-art obfuscation model, decreasing the utility loss by up to $1.81\%$ and the privacy loss by up to $3.42\%$. Moreover, we show that users with diverse privacy needs can use PrivDiffuser to protect their privacy without having to retrain the model.
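The abstract pins down the key mechanics: a conditional denoising diffusion model is steered by latent codes for the public and private attributes, and a mutual information-based regularizer keeps those codes disentangled so the guidance stays effective. The sketch below illustrates that recipe; every module name, the correlation-based stand-in for the MI penalty, and the guidance weights are assumptions for illustration, not the paper's actual implementation.

```python
# Illustrative sketch of MI-regularized latent guidance in the spirit of
# PrivDiffuser; all names and the simplified MI penalty are hypothetical.
import torch
import torch.nn as nn

class AttributeEncoder(nn.Module):
    """Maps a sensor window to separate public/private latent codes."""
    def __init__(self, d_in=128, d_z=32):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(d_in, 256), nn.ReLU())
        self.pub_head = nn.Linear(256, d_z)   # latent for the public attribute
        self.priv_head = nn.Linear(256, d_z)  # latent for the private attribute

    def forward(self, x):
        h = self.backbone(x)
        return self.pub_head(h), self.priv_head(h)

def mi_penalty(z_pub, z_priv):
    """Crude stand-in for an MI regularizer: penalize cross-correlation
    between the two latent blocks (a real system would use a variational
    MI bound such as CLUB or MINE)."""
    z1 = z_pub - z_pub.mean(dim=0)
    z2 = z_priv - z_priv.mean(dim=0)
    return (z1.T @ z2 / z1.shape[0]).pow(2).mean()

def guided_eps(denoiser, x_t, t, z_pub, z_priv, w_pub=2.0, w_priv=2.0):
    """One guided noise estimate: push the reverse process toward the
    public attribute and away from the private one."""
    eps_uncond = denoiser(x_t, t, None, None)
    eps_pub = denoiser(x_t, t, z_pub, None)
    eps_priv = denoiser(x_t, t, None, z_priv)
    return eps_uncond + w_pub * (eps_pub - eps_uncond) - w_priv * (eps_priv - eps_uncond)
```

In a full pipeline, `mi_penalty` would be added to the encoder's training loss and `guided_eps` would replace the plain noise estimate inside a standard DDPM/DDIM sampling loop.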
Related papers
- FT-PrivacyScore: Personalized Privacy Scoring Service for Machine Learning Participation [4.772368796656325]
In practice, controlled data access remains a mainstream method for protecting data privacy in many industrial and research environments.
We developed a demo prototype, FT-PrivacyScore, to show that it is possible to efficiently and quantitatively estimate the privacy risk of participating in a model fine-tuning task.
arXiv Detail & Related papers (2024-10-30T02:41:26Z)
- Activity Recognition on Avatar-Anonymized Datasets with Masked Differential Privacy [64.32494202656801]
Privacy-preserving computer vision is an important emerging problem in machine learning and artificial intelligence.
We present an anonymization pipeline that replaces sensitive human subjects in video datasets with synthetic avatars within context.
We also propose MaskDP to protect non-anonymized but privacy-sensitive background information.
arXiv Detail & Related papers (2024-10-22T15:22:53Z)
- $\alpha$-Mutual Information: A Tunable Privacy Measure for Privacy Protection in Data Sharing [4.475091558538915]
This paper adopts Arimoto's $\alpha$-Mutual Information as a tunable privacy measure.
We formulate a general distortion-based mechanism that manipulates the original data to offer privacy protection.
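For reference, Arimoto's $\alpha$-mutual information is $I_\alpha(X;Y) = H_\alpha(X) - H_\alpha^A(X|Y)$, where $H_\alpha$ is the Rényi entropy and $H_\alpha^A$ is Arimoto's conditional entropy; it recovers Shannon mutual information as $\alpha \to 1$, and larger $\alpha$ weighs the adversary's most likely guesses more heavily. A small numerical sketch (the example distribution is illustrative, not from the paper):

```python
# Arimoto alpha-mutual information for a discrete joint pmf p(x, y),
# with rows indexing the private X and columns the released Y.
import numpy as np

def renyi_entropy(p, alpha):
    return np.log(np.sum(p ** alpha)) / (1.0 - alpha)

def arimoto_conditional_entropy(pxy, alpha):
    # H_alpha^A(X|Y) = alpha/(1-alpha) * log sum_y (sum_x p(x,y)^alpha)^(1/alpha)
    inner = np.sum(pxy ** alpha, axis=0) ** (1.0 / alpha)
    return (alpha / (1.0 - alpha)) * np.log(np.sum(inner))

def arimoto_mi(pxy, alpha):
    return renyi_entropy(pxy.sum(axis=1), alpha) - arimoto_conditional_entropy(pxy, alpha)

pxy = np.array([[0.4, 0.1],
                [0.1, 0.4]])
for a in (0.5, 2.0, 10.0):
    print(f"alpha={a}: I_alpha = {arimoto_mi(pxy, a):.4f}")
```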
arXiv Detail & Related papers (2023-10-27T16:26:14Z)
- A Unified View of Differentially Private Deep Generative Modeling [60.72161965018005]
Data with privacy concerns comes with stringent regulations that frequently prohibit data access and data sharing.
Overcoming these obstacles is key for technological progress in many real-world application scenarios that involve privacy sensitive data.
Differentially private (DP) data publishing provides a compelling solution, where only a sanitized form of the data is publicly released.
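As a textbook illustration of the sanitization primitive this summary refers to (not this paper's own contribution), the Laplace mechanism releases a statistic with noise calibrated to its sensitivity:

```python
# Classic Laplace mechanism: epsilon-DP release of a numeric query.
import numpy as np

def laplace_mechanism(true_value, sensitivity, epsilon, rng=None):
    """Add Laplace noise with scale sensitivity/epsilon to true_value."""
    rng = rng or np.random.default_rng()
    return true_value + rng.laplace(0.0, sensitivity / epsilon)

# A counting query has L1 sensitivity 1: changing one record moves the count by at most 1.
noisy_count = laplace_mechanism(1234, sensitivity=1.0, epsilon=0.5)
```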
arXiv Detail & Related papers (2023-09-27T14:38:16Z)
- TeD-SPAD: Temporal Distinctiveness for Self-supervised Privacy-preservation for Video Anomaly Detection [59.04634695294402]
Video anomaly detection (VAD) without human monitoring is a complex computer vision task.
Privacy leakage in VAD allows models to pick up and amplify unnecessary biases related to people's personal information.
We propose TeD-SPAD, a privacy-aware video anomaly detection framework that destroys visual private information in a self-supervised manner.
arXiv Detail & Related papers (2023-08-21T22:42:55Z)
- Position: Considerations for Differentially Private Learning with Large-Scale Public Pretraining [75.25943383604266]
We question whether the use of large Web-scraped datasets should be viewed as differential-privacy-preserving.
We caution that publicizing these models pretrained on Web data as "private" could lead to harm and erode the public's trust in differential privacy as a meaningful definition of privacy.
We conclude by discussing potential paths forward for the field of private learning, as public pretraining becomes more popular and powerful.
arXiv Detail & Related papers (2022-12-13T10:41:12Z)
- ConfounderGAN: Protecting Image Data Privacy with Causal Confounder [85.6757153033139]
We propose ConfounderGAN, a generative adversarial network (GAN) that can make personal image data unlearnable to protect the data privacy of its owners.
Experiments are conducted on six image classification datasets, consisting of three natural object datasets and three medical datasets.
arXiv Detail & Related papers (2022-12-04T08:49:14Z)
- How Do Input Attributes Impact the Privacy Loss in Differential Privacy? [55.492422758737575]
We study the connection between the per-subject norm in DP neural networks and individual privacy loss.
We introduce a novel metric termed the Privacy Loss-Input Susceptibility (PLIS), which allows one to apportion a subject's privacy loss to their input attributes.
arXiv Detail & Related papers (2022-11-18T11:39:03Z)
- Blinder: End-to-end Privacy Protection in Sensing Systems via Personalized Federated Learning [5.803565897482636]
We propose a sensor data anonymization model that is trained on decentralized data and strikes a desirable trade-off between data utility and privacy.
Our anonymization model, dubbed Blinder, is based on a variational autoencoder and one or multiple discriminator networks trained in an adversarial fashion.
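The summary pins down Blinder's training signal: a variational autoencoder keeps the sensor data reconstructable while one or more discriminators try to recover the private attribute, and the anonymizer is trained to defeat them. A minimal sketch of the anonymizer's side of that objective, with all module names and loss weights hypothetical:

```python
# Sketch of a VAE-plus-adversary anonymization loss; `vae` returns
# (reconstruction, mu, logvar) and `discriminator` predicts the private
# attribute from the latent code. Names and weights are assumptions.
import torch
import torch.nn.functional as F

def anonymizer_loss(vae, discriminator, x, private_label, beta=1.0, lam=1.0):
    x_hat, mu, logvar = vae(x)
    recon = F.mse_loss(x_hat, x)                 # keep the data useful
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    # Adversarial term: reward the anonymizer when the discriminator
    # fails to infer the private attribute from the latent code.
    adv = -F.cross_entropy(discriminator(mu), private_label)
    return recon + beta * kl + lam * adv
```

In alternation, the discriminator would be trained to minimize the same cross-entropy, so the anonymizer faces a progressively stronger adversary.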
arXiv Detail & Related papers (2022-09-24T16:22:43Z)
- Privacy-Utility Trades in Crowdsourced Signal Map Obfuscation [20.58763760239068]
Crowdsourced cellular signal strength measurements can be used to generate signal maps that improve network performance.
We consider obfuscating such data before the data leaves the mobile device.
Our evaluation results, based on multiple, diverse, real-world signal map datasets, demonstrate the feasibility of concurrently achieving adequate privacy and utility.
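The summary does not name the obfuscation schemes evaluated, so the following is a purely illustrative on-device noising sketch rather than the paper's mechanism:

```python
# Hypothetical on-device perturbation of a signal-map report before upload.
import numpy as np

def obfuscate_report(lat, lon, rss_dbm, sigma_loc=0.001, sigma_rss=2.0, rng=None):
    """Add Gaussian noise to location and signal strength on the device."""
    rng = rng or np.random.default_rng()
    return (lat + rng.normal(0.0, sigma_loc),
            lon + rng.normal(0.0, sigma_loc),
            rss_dbm + rng.normal(0.0, sigma_rss))
```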
arXiv Detail & Related papers (2022-01-13T03:46:22Z)
- Adversarial representation learning for synthetic replacement of private attributes [0.7619404259039281]
We propose a novel two-step approach to data privatization: first, it removes the sensitive information; second, it replaces this information with an independent random sample.
Our method builds on adversarial representation learning which ensures strong privacy by training the model to fool an increasingly strong adversary.
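A minimal sketch of the two-step replacement idea, with hypothetical `filter_net` and `generator_net` modules; training both adversarially against an increasingly strong attribute classifier is what provides the privacy guarantee described above:

```python
# Step 1: strip the sensitive attribute from the representation.
# Step 2: re-synthesize the record with an independently sampled attribute.
import torch

def privatize(filter_net, generator_net, x, num_classes):
    z = filter_net(x)                                      # attribute-free representation
    fake = torch.randint(0, num_classes, (x.shape[0],))    # independent random attribute
    return generator_net(z, fake)                          # synthetic replacement
```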
arXiv Detail & Related papers (2020-06-14T22:07:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.