Re-pseudonymization Strategies for Smart Meter Data Are Not Robust to Deep Learning Profiling Attacks
- URL: http://arxiv.org/abs/2404.03948v1
- Date: Fri, 5 Apr 2024 08:27:36 GMT
- Title: Re-pseudonymization Strategies for Smart Meter Data Are Not Robust to Deep Learning Profiling Attacks
- Authors: Ana-Maria Cretu, Miruna Rusu, Yves-Alexandre de Montjoye
- Abstract summary: We propose the first deep learning-based profiling attack against re-pseudonymized smart meter data.
Our results suggest that even frequent re-pseudonymization strategies can be reversed.
- Score: 9.061271587514215
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Smart meters, devices measuring the electricity and gas consumption of a household, are currently being deployed at a fast rate throughout the world. The data they collect are extremely useful, including in the fight against climate change. However, these data and the information that can be inferred from them are highly sensitive. Re-pseudonymization, i.e., the frequent replacement of random identifiers over time, is widely used to share smart meter data while mitigating the risk of re-identification. We here show how, in spite of re-pseudonymization, households' consumption records can be pieced together with high accuracy in large-scale datasets. We propose the first deep learning-based profiling attack against re-pseudonymized smart meter data. Our attack combines neural network embeddings, which are used to extract features from weekly consumption records and are tailored to the smart meter identification task, with a nearest neighbor classifier. We evaluate six neural network architectures as the embedding model. Our results suggest that the Transformer and CNN-LSTM architectures vastly outperform previous methods as well as other architectures, successfully identifying the correct household 73.4% of the time among 5139 households based on electricity and gas consumption records (54.5% for electricity only). We further show that the features extracted by the embedding model maintain their effectiveness when transferred to a set of users disjoint from the one used to train the model. Finally, we extensively evaluate the robustness of our results. Taken together, our results strongly suggest that even frequent re-pseudonymization strategies can be reversed, strongly limiting their ability to prevent re-identification in practice.
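The attack pipeline described in the abstract (embed each weekly consumption record, then match records with a nearest neighbor classifier) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `embed` function is a simple normalization stand-in and not the paper's trained Transformer or CNN-LSTM, the data are synthetic, and the 336 readings per week (half-hourly sampling) and the function names are hypothetical.

```python
import numpy as np


def embed(weeks: np.ndarray) -> np.ndarray:
    """Stand-in embedding (hypothetical, NOT the paper's trained network).

    Centers each weekly record and scales it to unit length, so that a dot
    product between two embeddings equals their correlation.
    """
    centered = weeks - weeks.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(centered, axis=1, keepdims=True)
    return centered / np.maximum(norms, 1e-12)


def nearest_neighbor_match(reference: np.ndarray, target: np.ndarray) -> np.ndarray:
    """For each target record, return the index of the most similar reference record."""
    similarities = embed(target) @ embed(reference).T
    return similarities.argmax(axis=1)


def make_synthetic_weeks(n_households=50, n_readings=336, noise=0.1, seed=0):
    """Synthetic data (assumption): each household has a fixed weekly profile,
    observed twice with independent Gaussian noise, once per pseudonym period."""
    rng = np.random.default_rng(seed)
    profiles = rng.random((n_households, n_readings))
    week_a = profiles + noise * rng.standard_normal(profiles.shape)
    week_b = profiles + noise * rng.standard_normal(profiles.shape)
    return week_a, week_b


def attack_accuracy(week_a: np.ndarray, week_b: np.ndarray) -> float:
    """Fraction of week_b records linked back to the same household in week_a."""
    predicted = nearest_neighbor_match(week_a, week_b)
    return float((predicted == np.arange(len(week_b))).mean())


if __name__ == "__main__":
    week_a, week_b = make_synthetic_weeks()
    print("linking accuracy on synthetic data:", attack_accuracy(week_a, week_b))
```

The synthetic households are deliberately easy to tell apart, so the accuracy this sketch reports says nothing about real smart meter data; the figures quoted in the abstract come from a learned embedding evaluated on thousands of real households.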
Related papers
- An Investigation on Machine Learning Predictive Accuracy Improvement and Uncertainty Reduction using VAE-based Data Augmentation [2.517043342442487]
Deep generative learning uses certain ML models to learn the underlying distribution of existing data and generate synthetic samples that resemble the real data.
In this study, our objective is to evaluate the effectiveness of data augmentation using variational autoencoder (VAE)-based deep generative models.
We investigated whether the data augmentation leads to improved accuracy in the predictions of a deep neural network (DNN) model trained using the augmented data.
arXiv Detail & Related papers (2024-10-24T18:15:48Z)
- Adaptive Sampling for Deep Learning via Efficient Nonparametric Proxies [35.29595714883275]
We develop an efficient sketch-based approximation to the Nadaraya-Watson estimator.
Our sampling algorithm outperforms the baseline in terms of wall-clock time and accuracy on four datasets.
arXiv Detail & Related papers (2023-11-22T18:40:18Z)
- Secure short-term load forecasting for smart grids with transformer-based federated learning [0.0]
Electricity load forecasting is an essential task within smart grids to assist demand and supply balance.
Fine-grained load profiles can expose users' electricity consumption behaviors, which raises privacy and security concerns.
This paper presents a novel transformer-based deep learning approach with federated learning for short-term electricity load prediction.
arXiv Detail & Related papers (2023-10-26T15:27:55Z)
- A Geometrical Approach to Evaluate the Adversarial Robustness of Deep Neural Networks [52.09243852066406]
Adversarial Converging Time Score (ACTS) measures the converging time as an adversarial robustness metric.
We validate the effectiveness and generalization of the proposed ACTS metric against different adversarial attacks on the large-scale ImageNet dataset.
arXiv Detail & Related papers (2023-10-10T09:39:38Z)
- Hybrid Transformer-RNN Architecture for Household Occupancy Detection Using Low-Resolution Smart Meter Data [8.486902848941872]
Digitalization of the energy system provides smart meter data that can be used for occupancy detection in a non-intrusive manner.
Deep learning techniques make it possible to infer occupancy from low-resolution smart meter data.
Our work aims to develop a privacy-aware and effective model for residential occupancy detection.
arXiv Detail & Related papers (2023-08-27T14:13:29Z)
- Autoregressive Perturbations for Data Poisoning [54.205200221427994]
Data scraping from social media has led to growing concerns regarding unauthorized use of data.
Data poisoning attacks have been proposed as a bulwark against scraping.
We introduce autoregressive (AR) poisoning, a method that can generate poisoned data without access to the broader dataset.
arXiv Detail & Related papers (2022-06-08T06:24:51Z)
- Convolutional generative adversarial imputation networks for spatio-temporal missing data in storm surge simulations [86.5302150777089]
Generative Adversarial Imputation Nets (GAIN) and GAN-based techniques have attracted attention as unsupervised machine learning methods.
We name our proposed method Convolutional Generative Adversarial Imputation Nets (Conv-GAIN).
arXiv Detail & Related papers (2021-11-03T03:50:48Z)
- SignalNet: A Low Resolution Sinusoid Decomposition and Estimation Network [79.04274563889548]
We propose SignalNet, a neural network architecture that detects the number of sinusoids and estimates their parameters from quantized in-phase and quadrature samples.
We introduce a worst-case learning threshold for comparing the results of our network relative to the underlying data distributions.
In simulation, we find that our algorithm is always able to surpass the threshold for three-bit data but often cannot exceed the threshold for one-bit data.
arXiv Detail & Related papers (2021-06-10T04:21:20Z)
- TELESTO: A Graph Neural Network Model for Anomaly Classification in Cloud Services [77.454688257702]
Machine learning (ML) and artificial intelligence (AI) are applied to IT system operation and maintenance.
One direction aims at the recognition of re-occurring anomaly types to enable remediation automation.
We propose a method that is invariant to dimensionality changes of given data.
arXiv Detail & Related papers (2021-02-25T14:24:49Z)
- Active machine learning for spatio-temporal predictions using feature embedding [0.537133760455631]
Active learning could contribute to solving environmental problems through improved spatio-temporal predictions.
Here, we propose a novel batch active learning (AL) method that fills this gap.
We encode and cluster features of candidate data points, and query the best data based on the distance of embedded features to their cluster centers.
arXiv Detail & Related papers (2020-12-08T12:55:29Z)
- Bridging Mode Connectivity in Loss Landscapes and Adversarial Robustness [97.67477497115163]
We use mode connectivity to study the adversarial robustness of deep neural networks.
Our experiments cover various types of adversarial attacks applied to different network architectures and datasets.
Our results suggest that mode connectivity offers a holistic tool and practical means for evaluating and improving adversarial robustness.
arXiv Detail & Related papers (2020-04-30T19:12:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.