Statistical Feature-based Personal Information Detection in Mobile
Network Traffic
- URL: http://arxiv.org/abs/2112.12346v1
- Date: Thu, 23 Dec 2021 04:01:16 GMT
- Title: Statistical Feature-based Personal Information Detection in Mobile
Network Traffic
- Authors: Shuang Zhao, Shuhui Chen, Ziling Wei
- Abstract summary: Statistical features of personal information are designed to depict the occurrence patterns of personal information in network traffic.
A detector is trained with machine learning algorithms to discover potential personal information exhibiting similar patterns.
To the best of our knowledge, this is the first work to detect personal information based on statistical features.
- Score: 13.568975395946433
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the popularity of smartphones, mobile applications (apps) have become part of people's daily lives. Although apps provide rich functionalities, they also access a large amount of personal information, raising privacy concerns. To understand what personal information apps collect, many solutions have been proposed to detect privacy leaks in apps. Recently, traffic monitoring-based privacy leak detection has shown promising performance and strong scalability. However, it still has some shortcomings. First, it struggles to detect the leakage of obfuscated personal information. Second, it cannot discover privacy leaks of undefined types. To address these problems, this paper proposes a new personal information detection method based on traffic monitoring. Statistical features of personal information are designed to depict the occurrence patterns of personal information in the traffic, covering both local and global patterns. A detector is then trained with machine learning algorithms to discover potential personal information exhibiting similar patterns. Since the statistical features are independent of the value and type of the personal information, the trained detector is capable of identifying various types of privacy leaks as well as obfuscated privacy leaks. To the best of our knowledge, this is the first work to detect personal information based on statistical features. Finally, experimental results show that the proposed method achieves better performance than the state of the art.
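To make the idea concrete, here is a minimal sketch of how such statistical features and a detector might be implemented. The paper does not publish code, so the feature set below (occurrence count, value stability, length, entropy, cross-app spread) and the random-forest classifier are illustrative assumptions, not the authors' actual design.

```python
# Illustrative sketch, not the paper's implementation. Candidate key-value
# pairs are taken from HTTP requests; per-key statistics approximate
# "local" patterns (within one app's flows) and "global" patterns
# (across all apps).
import math
from collections import defaultdict
from sklearn.ensemble import RandomForestClassifier

def entropy(s: str) -> float:
    """Shannon entropy of the characters in a value string."""
    if not s:
        return 0.0
    counts = defaultdict(int)
    for ch in s:
        counts[ch] += 1
    n = len(s)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def features(key, records):
    """records: list of (app, value) occurrences of `key` in the traffic."""
    values = [v for _, v in records]
    apps = {a for a, _ in records}
    return [
        len(values),                               # how often the key occurs
        len(set(values)) / len(values),            # value stability (PII is often constant per user)
        sum(map(len, values)) / len(values),       # mean value length
        sum(map(entropy, values)) / len(values),   # mean value entropy
        len(apps),                                 # global spread: number of apps sending this key
    ]

# Hypothetical labeled training data: occurrences of each key plus a
# 1/0 label for "carries personal information".
train = {
    "imei": ([("app_a", "356938035643809"), ("app_b", "356938035643809")], 1),
    "ts":   ([("app_a", "1622505600"), ("app_a", "1622505661")], 0),
}
X = [features(k, recs) for k, (recs, _) in train.items()]
y = [label for _, (_, label) in train.items()]

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
# Score an unseen key observed in new traffic:
print(clf.predict([features("adid", [("app_c", "38400000-8cf0-11bd-b23e-10b96e40000d")])]))
```

Because the features describe how a value behaves across flows rather than what the value is, the same detector can flag hashed or encoded identifiers that value-matching approaches would miss.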
Related papers
- Masked Differential Privacy [64.32494202656801]
We propose an effective approach called masked differential privacy (DP), which allows for controlling sensitive regions where differential privacy is applied.
Our method operates selectively on the data: non-sensitive temporal regions can be defined where no DP is applied, or differential privacy can be combined with other privacy techniques within data samples.
arXiv Detail & Related papers (2024-10-22T15:22:53Z)
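As a rough illustration of the selective-application idea in the Masked Differential Privacy entry above, the sketch below adds Laplace noise only inside a sensitivity mask. The mechanism, array shapes, and parameter names are assumptions for illustration; the paper's actual method may differ.

```python
# Illustrative sketch of selective ("masked") noise addition: Laplace noise
# is applied only where a sensitivity mask is set, leaving non-sensitive
# regions of the sample untouched.
import numpy as np

def masked_laplace(x: np.ndarray, mask: np.ndarray,
                   sensitivity: float, epsilon: float) -> np.ndarray:
    """Add Laplace(0, sensitivity/epsilon) noise only where mask is True."""
    noise = np.random.laplace(0.0, sensitivity / epsilon, size=x.shape)
    return np.where(mask, x + noise, x)

frame = np.random.rand(4, 4)      # e.g. an image or feature patch
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True             # region flagged as sensitive
print(masked_laplace(frame, mask, sensitivity=1.0, epsilon=0.5))
```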
- Collection, usage and privacy of mobility data in the enterprise and public administrations [55.2480439325792]
Security measures such as anonymization are needed to protect individuals' privacy.
Within our study, we conducted expert interviews to gain insights into practices in the field.
We survey privacy-enhancing methods in use, which generally do not comply with state-of-the-art standards of differential privacy.
arXiv Detail & Related papers (2024-07-04T08:29:27Z)
- Can Language Models be Instructed to Protect Personal Information? [30.187731765653428]
We introduce PrivQA -- a benchmark to assess the privacy/utility trade-off when a model is instructed to protect specific categories of personal information in a simulated scenario.
We find that adversaries can easily circumvent these protections with simple jailbreaking methods through textual and/or image inputs.
We believe PrivQA has the potential to support the development of new models with improved privacy protections, as well as the adversarial robustness of these protections.
arXiv Detail & Related papers (2023-10-03T17:30:33Z)
- TeD-SPAD: Temporal Distinctiveness for Self-supervised Privacy-preservation for video Anomaly Detection [59.04634695294402]
Video anomaly detection (VAD) without human monitoring is a complex computer vision task.
Privacy leakage in VAD allows models to pick up and amplify unnecessary biases related to people's personal information.
We propose TeD-SPAD, a privacy-aware video anomaly detection framework that destroys visual private information in a self-supervised manner.
arXiv Detail & Related papers (2023-08-21T22:42:55Z)
- A General Framework for Auditing Differentially Private Machine Learning [27.99806936918949]
We present a framework to statistically audit the privacy guarantee conferred by a differentially private machine learner in practice.
Our work develops a general methodology to empirically evaluate the privacy of differentially private machine learning implementations.
arXiv Detail & Related papers (2022-10-16T21:34:18Z)
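One common recipe for such empirical audits (not necessarily the one used in the framework above) is to bound a membership-inference attack's true/false positive rates with confidence intervals and convert them into a lower bound on epsilon. A minimal sketch, with made-up trial counts:

```python
# Sketch of a membership-inference-based DP audit: run a distinguisher
# on many models trained with and without a canary point, bound its
# TPR/FPR with Clopper-Pearson intervals, and derive an empirical
# lower bound on epsilon. The counts below are placeholders.
import math
from scipy.stats import beta

def clopper_pearson(successes: int, trials: int, alpha: float = 0.05):
    """Two-sided exact (Clopper-Pearson) confidence interval."""
    lo = beta.ppf(alpha / 2, successes, trials - successes + 1) if successes > 0 else 0.0
    hi = beta.ppf(1 - alpha / 2, successes + 1, trials - successes) if successes < trials else 1.0
    return lo, hi

# Hypothetical audit outcomes: attack fires on 870/1000 "member" runs
# and on 120/1000 "non-member" runs.
tpr_lo, _ = clopper_pearson(870, 1000)
_, fpr_hi = clopper_pearson(120, 1000)

# For pure epsilon-DP, TPR <= e^epsilon * FPR, hence:
eps_lower_bound = math.log(tpr_lo / fpr_hi)
print(f"empirical epsilon lower bound: {eps_lower_bound:.3f}")
```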
- The Privacy Onion Effect: Memorization is Relative [76.46529413546725]
We show an Onion Effect of memorization: removing the "layer" of outlier points that are most vulnerable exposes a new layer of previously-safe points to the same attack.
It suggests that privacy-enhancing technologies such as machine unlearning could actually harm the privacy of other users.
arXiv Detail & Related papers (2022-06-21T15:25:56Z)
- SPAct: Self-supervised Privacy Preservation for Action Recognition [73.79886509500409]
Existing approaches for mitigating privacy leakage in action recognition require privacy labels along with the action labels from the video dataset.
Recent developments of self-supervised learning (SSL) have unleashed the untapped potential of the unlabeled data.
We present a novel training framework which removes privacy information from input video in a self-supervised manner without requiring privacy labels.
arXiv Detail & Related papers (2022-03-29T02:56:40Z)
- Active Privacy-Utility Trade-off Against Inference in Time-Series Data Sharing [29.738666406095074]
We consider a user releasing her data containing personal information in return for a service from an honest-but-curious service provider (SP).
We formulate both problems as partially observable Markov decision processes (POMDPs) and solve them numerically with advantage actor-critic (A2C) deep reinforcement learning (DRL).
We evaluate the privacy-utility trade-off (PUT) of the proposed policies on both the synthetic data and smoking activity dataset, and show their validity by testing the activity detection accuracy of the SP modeled by a long short-term memory (LSTM) neural network.
arXiv Detail & Related papers (2022-02-11T18:57:31Z)
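As a hedged illustration of the trade-off optimized in the POMDP formulation above, the sketch below shows one plausible per-step reward balancing utility (low distortion of the released data) against privacy leakage. All terms, names, and the weight `lam` are illustrative assumptions, not the paper's definitions.

```python
# Minimal sketch of a privacy-utility trade-off (PUT) reward: the user
# releases a distorted version of her measurement and is rewarded for
# utility (low distortion) while penalized in proportion to what the
# release reveals about a sensitive attribute.
import numpy as np

def put_reward(x_true: np.ndarray, x_released: np.ndarray,
               adversary_confidence: float, lam: float = 1.0) -> float:
    utility = -float(np.linalg.norm(x_true - x_released))  # distortion cost
    privacy_leak = adversary_confidence  # SP's belief in the sensitive hypothesis
    return utility - lam * privacy_leak

# A policy trained with A2C would pick distortions that maximize the
# discounted sum of such rewards; here we just score one candidate release.
x = np.array([0.8, 0.1])
print(put_reward(x, x + np.array([0.05, -0.02]), adversary_confidence=0.6))
```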
- Towards a Data Privacy-Predictive Performance Trade-off [2.580765958706854]
We evaluate the existence of a trade-off between data privacy and predictive performance in classification tasks.
Unlike previous literature, we confirm that the higher the level of privacy, the higher the impact on predictive performance.
arXiv Detail & Related papers (2022-01-13T21:48:51Z)
- Learning Language and Multimodal Privacy-Preserving Markers of Mood from Mobile Data [74.60507696087966]
Mental health conditions remain underdiagnosed even in countries with common access to advanced medical care.
One promising data source to help monitor human behavior is daily smartphone usage.
We study behavioral markers of daily mood using a recent dataset of mobile behaviors from adolescent populations at high risk of suicidal behaviors.
arXiv Detail & Related papers (2021-06-24T17:46:03Z)
- Learning With Differential Privacy [3.618133010429131]
Differential privacy comes to the rescue with a proper promise of protection against leakage.
It uses a randomized response technique at data-collection time, which promises strong privacy with better utility.
arXiv Detail & Related papers (2020-06-10T02:04:13Z)
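Randomized response, the collection-time technique the last summary alludes to, is simple enough to sketch directly. Below is Warner's classic binary mechanism; the truthful-answer probability p = 0.75 is an arbitrary choice for illustration.

```python
# Classic randomized response: each respondent answers truthfully with
# probability p and reports a fair coin flip otherwise. For a binary
# attribute this satisfies epsilon-DP with epsilon = ln((1+p)/(1-p)).
import math
import random

def randomized_response(truth: bool, p: float = 0.75) -> bool:
    """Report the true bit with probability p, else a uniform coin."""
    if random.random() < p:
        return truth
    return random.random() < 0.5

def estimate_frequency(reports, p: float = 0.75) -> float:
    """Unbiased estimate of the true 'yes' rate from noisy reports."""
    observed = sum(reports) / len(reports)
    # E[observed] = p * true_rate + (1 - p) * 0.5  =>  invert:
    return (observed - (1 - p) * 0.5) / p

truths = [random.random() < 0.3 for _ in range(100_000)]
reports = [randomized_response(t) for t in truths]
print(f"true rate = 0.3, estimated: {estimate_frequency(reports):.3f}")
print(f"epsilon = {math.log((1 + 0.75) / (1 - 0.75)):.2f}")  # ln(7)
```

The aggregate frequency stays recoverable even though no individual report can be trusted, which is the "strong privacy with better utility" the summary describes.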