Where you go is who you are -- A study on machine learning based
semantic privacy attacks
- URL: http://arxiv.org/abs/2310.17643v1
- Date: Thu, 26 Oct 2023 17:56:50 GMT
- Title: Where you go is who you are -- A study on machine learning based
semantic privacy attacks
- Authors: Nina Wiedemann, Ourania Kounadi, Martin Raubal, Krzysztof Janowicz
- Abstract summary: We present a systematic analysis of two attack scenarios, namely location categorization and user profiling.
Experiments on the Foursquare dataset and tracking data demonstrate the potential for abuse of high-quality spatial information.
Our findings point out the risks of ever-growing databases of tracking data and spatial context data.
- Score: 3.259843027596329
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Concerns about data privacy are omnipresent, given the increasing usage of
digital applications and their underlying business model that includes selling
user data. Location data are particularly sensitive since they allow us to infer
activity patterns and interests of users, e.g., by categorizing visited
locations based on nearby points of interest (POI). On top of that, machine
learning methods provide new powerful tools to interpret big data. In light of
these considerations, we raise the following question: What is the actual risk
that realistic, machine learning based privacy attacks can obtain meaningful
semantic information from raw location data, subject to inaccuracies in the
data? In response, we present a systematic analysis of two attack scenarios,
namely location categorization and user profiling. Experiments on the
Foursquare dataset and tracking data demonstrate the potential for abuse of
high-quality spatial information, leading to a significant privacy loss even
with location inaccuracy of up to 200m. With location obfuscation of more than
1 km, spatial information hardly adds any value, but a high privacy risk solely
from temporal information remains. The availability of public context data such
as POIs plays a key role in inference based on spatial information. Our
findings point out the risks of ever-growing databases of tracking data and
spatial context data, which policymakers should consider for privacy
regulations, and which could guide individuals in their personal location
protection measures.
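The abstract's obfuscation thresholds (200 m, 1 km) refer to displacing reported coordinates by random noise. As a minimal illustration of the idea, the sketch below draws a uniformly random offset within a given radius; the paper's exact noise model is not specified here, and the Zurich coordinates are an arbitrary example.

```python
import math
import random

def obfuscate(lat: float, lon: float, radius_m: float) -> tuple[float, float]:
    """Displace a coordinate by a uniformly random offset within radius_m metres.

    Illustrative sketch of location obfuscation; the paper's actual noise
    mechanism may differ.
    """
    # Uniform point in a disc: sqrt makes the radius area-uniform.
    r = radius_m * math.sqrt(random.random())
    theta = random.uniform(0.0, 2.0 * math.pi)
    # Convert metre offsets to degrees (small-offset approximation;
    # one degree of latitude is roughly 111,320 m).
    dlat = (r * math.cos(theta)) / 111_320.0
    dlon = (r * math.sin(theta)) / (111_320.0 * math.cos(math.radians(lat)))
    return lat + dlat, lon + dlon

# Example: obfuscate a point in Zurich within 200 m.
noisy = obfuscate(47.3769, 8.5417, 200.0)
```

At the 200 m setting the paper still finds significant privacy loss; only past roughly 1 km does the spatial signal stop adding value for the attacker.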
Related papers
- A Survey on Differential Privacy for SpatioTemporal Data in Transportation Research [0.9790236766474202]
In transportation research, we are seeing a surge in spatio-temporal data collection.
Recent developments in differential privacy in the context of such data have led to research in applied privacy.
To enable the use of such data in research and inference without exposing private information, a significant body of work has been proposed.
arXiv Detail & Related papers (2024-07-18T03:19:29Z)
- Collection, usage and privacy of mobility data in the enterprise and public administrations [55.2480439325792]
Security measures such as anonymization are needed to protect individuals' privacy.
Within our study, we conducted expert interviews to gain insights into practices in the field.
We survey privacy-enhancing methods in use, which generally do not comply with state-of-the-art standards of differential privacy.
arXiv Detail & Related papers (2024-07-04T08:29:27Z)
- Measuring Privacy Loss in Distributed Spatio-Temporal Data [26.891854386652266]
We propose an alternative measure of privacy loss against location reconstruction attacks by an informed adversary.
Our experiments on real and synthetic data demonstrate that our privacy loss better reflects our intuitions on individual privacy violation in the distributed setting.
arXiv Detail & Related papers (2024-02-18T09:53:14Z)
- Privacy risk in GeoData: A survey [3.7228963206288967]
We analyse different geomasking techniques proposed to protect individuals' privacy in geodata.
We propose a taxonomy to characterise these techniques across various dimensions.
Our proposed taxonomy serves as a practical resource for data custodians, offering them a means to navigate the extensive array of existing privacy mechanisms.
arXiv Detail & Related papers (2024-02-06T00:55:06Z)
- Where have you been? A Study of Privacy Risk for Point-of-Interest Recommendation [20.526071564917274]
Mobility data can be used to build machine learning (ML) models for location-based services (LBS).
However, the convenience comes with the risk of privacy leakage since this type of data might contain sensitive information related to user identities, such as home/work locations.
We design a privacy attack suite containing data extraction and membership inference attacks tailored for point-of-interest (POI) recommendation models.
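A common building block of membership inference attacks is a confidence-threshold test: models tend to be more confident on examples they were trained on. The sketch below is a generic illustration of that idea, not the attack suite from the cited paper; the confidence values are hypothetical.

```python
def membership_inference(confidence: float, threshold: float = 0.9) -> bool:
    """Guess that a record was in the training set if the target model's
    confidence on it exceeds a threshold.

    A generic threshold attack sketch; real attacks calibrate the threshold
    (e.g., via shadow models) rather than fixing it.
    """
    return confidence >= threshold

# Toy run over hypothetical model confidences for four records.
confidences = [0.97, 0.55, 0.91, 0.40]
guesses = [membership_inference(c) for c in confidences]
```

In practice the threshold is tuned on a shadow model trained to mimic the target, which is where most of the attack's power comes from.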
arXiv Detail & Related papers (2023-10-28T06:17:52Z)
- PrivacyMind: Large Language Models Can Be Contextual Privacy Protection Learners [81.571305826793]
We introduce Contextual Privacy Protection Language Models (PrivacyMind)
Our work offers a theoretical analysis for model design and benchmarks various techniques.
In particular, instruction tuning with both positive and negative examples stands out as a promising method.
arXiv Detail & Related papers (2023-10-03T22:37:01Z)
- A Unified View of Differentially Private Deep Generative Modeling [60.72161965018005]
Data with privacy concerns comes with stringent regulations that frequently prohibit data access and data sharing.
Overcoming these obstacles is key for technological progress in many real-world application scenarios that involve privacy sensitive data.
Differentially private (DP) data publishing provides a compelling solution, where only a sanitized form of the data is publicly released.
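The canonical sanitization primitive behind DP data publishing is the Laplace mechanism: add noise scaled to the query's sensitivity divided by the privacy budget epsilon. A minimal sketch for a counting query (sensitivity 1) follows; the numbers are illustrative, not from the cited paper.

```python
import random

def laplace_noise(scale: float) -> float:
    # The difference of two i.i.d. exponentials is Laplace(0, scale).
    return random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)

def dp_count(true_count: int, epsilon: float) -> float:
    """Release a count under epsilon-differential privacy.

    Counting queries have sensitivity 1, so Laplace noise with
    scale = 1/epsilon suffices.
    """
    return true_count + laplace_noise(1.0 / epsilon)

# Example: release a count of 1000 visitors with epsilon = 0.5.
noisy_count = dp_count(1000, epsilon=0.5)
```

Smaller epsilon means larger noise and stronger privacy; generative DP approaches like those surveyed here push the same budget accounting into model training instead of per-query release.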
arXiv Detail & Related papers (2023-09-27T14:38:16Z)
- Position: Considerations for Differentially Private Learning with Large-Scale Public Pretraining [75.25943383604266]
We question whether the use of large Web-scraped datasets should be viewed as differential-privacy-preserving.
We caution that publicizing these models pretrained on Web data as "private" could lead to harm and erode the public's trust in differential privacy as a meaningful definition of privacy.
We conclude by discussing potential paths forward for the field of private learning, as public pretraining becomes more popular and powerful.
arXiv Detail & Related papers (2022-12-13T10:41:12Z)
- PRIVEE: A Visual Analytic Workflow for Proactive Privacy Risk Inspection of Open Data [3.2136309934080867]
Open data sets that contain personal information are susceptible to adversarial attacks even when anonymized.
We develop a visual analytic solution that enables data defenders to gain awareness about the disclosure risks in local, joinable data neighborhoods.
We use this problem and domain characterization to develop a set of visual analytic interventions as a defense mechanism.
arXiv Detail & Related papers (2022-08-12T19:57:09Z)
- Releasing survey microdata with exact cluster locations and additional privacy safeguards [77.34726150561087]
We propose an alternative microdata dissemination strategy that leverages the utility of the original microdata with additional privacy safeguards.
Our strategy reduces the respondents' re-identification risk for any number of disclosed attributes by 60-80% even under re-identification attempts.
arXiv Detail & Related papers (2022-05-24T19:37:11Z)
- PGLP: Customizable and Rigorous Location Privacy through Policy Graph [68.3736286350014]
We propose a new location privacy notion called PGLP, which provides a rich interface to release private locations with customizable and rigorous privacy guarantee.
Specifically, we formalize a user's location privacy requirements using a location policy graph, which is expressive and customizable.
We design a private location trace release framework that pipelines the detection of location exposure, policy graph repair, and private trajectory release with customizable and rigorous location privacy.
arXiv Detail & Related papers (2020-05-04T04:25:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.