Related papers: Anonymizing Sensor Data on the Edge: A Representation Learning and Transformation Approach

URL: http://arxiv.org/abs/2011.08315v3
Date: Fri, 27 Aug 2021 21:11:42 GMT
Title: Anonymizing Sensor Data on the Edge: A Representation Learning and Transformation Approach
Authors: Omid Hajihassani, Omid Ardakanian, Hamzeh Khazaei
Abstract summary: In this paper, we aim to examine the tradeoff between utility and privacy loss by learning low-dimensional representations that are useful for data obfuscation. We propose deterministic and probabilistic transformations in the latent space of a variational autoencoder to synthesize time series data. We show that it can anonymize data in real time on resource-constrained edge devices.
Score: 4.920145245773581
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The abundance of data collected by sensors in Internet of Things (IoT) devices, and the success of deep neural networks in uncovering hidden patterns in time series data have led to mounting privacy concerns. This is because private and sensitive information can be potentially learned from sensor data by applications that have access to this data. In this paper, we aim to examine the tradeoff between utility and privacy loss by learning low-dimensional representations that are useful for data obfuscation. We propose deterministic and probabilistic transformations in the latent space of a variational autoencoder to synthesize time series data such that intrusive inferences are prevented while desired inferences can still be made with sufficient accuracy. In the deterministic case, we use a linear transformation to move the representation of input data in the latent space such that the reconstructed data is likely to have the same public attribute but a different private attribute than the original input data. In the probabilistic case, we apply the linear transformation to the latent representation of input data with some probability. We compare our technique with autoencoder-based anonymization techniques and additionally show that it can anonymize data in real time on resource-constrained edge devices.
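A minimal sketch of the deterministic and probabilistic latent-space transformations described in the abstract, under stated assumptions. The abstract does not say how the linear transformation is constructed, so this example assumes a translation by the offset between the latent centroids of two private-attribute classes. Strictly speaking that is an affine map, used here as a stand-in. The function names, the toy latent vectors, and the probability `p` are all hypothetical. In a full pipeline the latent code would come from the VAE encoder and the shifted code would be passed to the decoder, and both steps are omitted here.

```python
import random


def centroid(vectors):
    """Component-wise mean of a list of equal-length latent vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def shift_latent(z, source_centroid, target_centroid):
    """Deterministic case: translate z by the offset between two class centroids."""
    return [zi + (t - s) for zi, s, t in zip(z, source_centroid, target_centroid)]


def maybe_shift_latent(z, source_centroid, target_centroid, p, rng):
    """Probabilistic case: apply the same shift with probability p, else keep z."""
    if rng.random() < p:
        return shift_latent(z, source_centroid, target_centroid)
    return list(z)


# Toy 2-D latent codes grouped by a hypothetical private attribute.
# The values are made up for illustration, not taken from the paper.
latents_private_a = [[0.0, 1.0], [2.0, 3.0]]
latents_private_b = [[10.0, 1.0], [12.0, 3.0]]
mu_a = centroid(latents_private_a)
mu_b = centroid(latents_private_b)

z = [0.5, 2.5]  # latent code of an input whose private attribute is "a"
z_det = shift_latent(z, mu_a, mu_b)
z_prob = maybe_shift_latent(z, mu_a, mu_b, p=0.5, rng=random.Random(0))
```

In this toy setup the two class centroids differ only along the first latent dimension, so the shift leaves the second dimension untouched. That loosely mirrors the stated goal of changing the private attribute while keeping the public one, although nothing in the sketch guarantees it for a real model.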

Related papers

Feature Shift Localization Network [51.33484517421393]
We introduce a neural network that can localize feature shifts in large and high-dimensional datasets in a fast and accurate manner. The network, trained with a large number of datasets, learns to extract the statistical properties of the datasets and can localize feature shifts without the need for re-training.
arXiv Detail & Related papers (2025-06-10T15:27:32Z)
Guided Diffusion Model for Sensor Data Obfuscation [4.91258288207688]
PrivDiffuser is a novel data obfuscation technique based on a denoising diffusion model. We show that PrivDiffuser yields a better privacy-utility trade-off than the state-of-the-art obfuscation model.
arXiv Detail & Related papers (2024-12-19T03:47:12Z)
A Unified View of Differentially Private Deep Generative Modeling [60.72161965018005]
Data with privacy concerns comes with stringent regulations that frequently prohibit data access and data sharing. Overcoming these obstacles is key for technological progress in many real-world application scenarios that involve privacy-sensitive data. Differentially private (DP) data publishing provides a compelling solution, where only a sanitized form of the data is publicly released.
arXiv Detail & Related papers (2023-09-27T14:38:16Z)
Differentially Private Neural Tangent Kernels for Privacy-Preserving Data Generation [32.83436754714798]
This work considers using the features of neural tangent kernels (NTKs), more precisely empirical NTKs (e-NTKs). We find that, perhaps surprisingly, the expressiveness of the untrained e-NTK features is comparable to that of the features taken from pre-trained perceptual features using public data.
arXiv Detail & Related papers (2023-03-03T03:00:49Z)
DynImp: Dynamic Imputation for Wearable Sensing Data Through Sensory and Temporal Relatedness [78.98998551326812]
We argue that traditional methods have rarely made use of both the time-series dynamics of the data and the relatedness of features from different sensors. We propose a model, termed DynImp, to handle missingness at different time points using nearest neighbors along the feature axis. We show that the method can exploit the multi-modality features from related sensors and also learn from historical time-series dynamics to reconstruct the data under extreme missingness.
arXiv Detail & Related papers (2022-09-26T21:59:14Z)
Uncertainty-Autoencoder-Based Privacy and Utility Preserving Data Type Conscious Transformation [3.7315964084413173]
We propose an adversarial learning framework that deals with the privacy-utility tradeoff problem under two conditions. Under data-type ignorant conditions, the privacy mechanism provides a one-hot encoding of categorical features, representing exactly one class. Under data-type aware conditions, the categorical variables are represented by a collection of scores, one for each class.
arXiv Detail & Related papers (2022-05-04T08:40:15Z)
Attentive Prototypes for Source-free Unsupervised Domain Adaptive 3D Object Detection [85.11649974840758]
3D object detection networks tend to be biased towards the data they are trained on. We propose a single-frame approach for source-free, unsupervised domain adaptation of lidar-based 3D object detectors.
arXiv Detail & Related papers (2021-11-30T18:42:42Z)
Sensitivity analysis in differentially private machine learning using hybrid automatic differentiation [54.88777449903538]
We introduce a novel hybrid automatic differentiation (AD) system for sensitivity analysis. This enables modelling the sensitivity of arbitrary differentiable function compositions, such as the training of neural networks on private data. Our approach can enable principled reasoning about privacy loss in the setting of data processing.
arXiv Detail & Related papers (2021-07-09T07:19:23Z)
Graph-Homomorphic Perturbations for Private Decentralized Learning [64.26238893241322]
The local exchange of estimates generated from private data can allow inference of the data itself. Perturbations are typically chosen independently at every agent, resulting in a significant performance loss. We propose an alternative scheme, which constructs perturbations according to a particular nullspace condition, allowing them to be invisible.
arXiv Detail & Related papers (2020-10-23T10:35:35Z)
Representation Learning for Sequence Data with Deep Autoencoding Predictive Components [96.42805872177067]
We propose a self-supervised representation learning method for sequence data, based on the intuition that useful representations of sequence data should exhibit a simple structure in the latent space. We encourage this latent structure by maximizing an estimate of predictive information of latent feature sequences, which is the mutual information between past and future windows at each time step. We demonstrate that our method recovers the latent space of noisy dynamical systems, extracts predictive features for forecasting tasks, and improves automatic speech recognition when used to pretrain the encoder on large amounts of unlabeled data.
arXiv Detail & Related papers (2020-10-07T03:34:01Z)
Privacy Enhancing Machine Learning via Removal of Unwanted Dependencies [21.97951347784442]
This paper studies new variants of supervised and adversarial learning methods, which remove the sensitive information in the data before they are sent out for a particular application. The explored methods optimize privacy preserving feature mappings and predictive models simultaneously in an end-to-end fashion. Experimental results on mobile sensing and face datasets demonstrate that our models can successfully maintain the utility performances of predictive models while causing sensitive predictions to perform poorly.
arXiv Detail & Related papers (2020-07-30T19:55:10Z)
Privacy-Preserving Distributed Learning in the Analog Domain [23.67685616088422]
We consider the problem of distributed learning over data while keeping it private from the computational servers. We propose a novel algorithm to solve the problem when data is in the analog domain. We show how the proposed framework can be adopted to do computation tasks when data is represented using floating-point numbers.
arXiv Detail & Related papers (2020-07-17T07:56:39Z)
PrivGen: Preserving Privacy of Sequences Through Data Generation [14.579475552088688]
Sequential data can serve as a basis for research that will lead to improved processes. Access and use of such data is usually limited or not permitted at all due to concerns about violating user privacy. We propose PrivGen, an innovative method for generating data that maintains patterns and characteristics of the source data.
arXiv Detail & Related papers (2020-02-23T05:43:15Z)

This list is automatically generated from the titles and abstracts of the papers on this site.