GeoShield: Safeguarding Geolocation Privacy from Vision-Language Models via Adversarial Perturbations
- URL: http://arxiv.org/abs/2508.03209v1
- Date: Tue, 05 Aug 2025 08:37:06 GMT
- Title: GeoShield: Safeguarding Geolocation Privacy from Vision-Language Models via Adversarial Perturbations
- Authors: Xinwei Liu, Xiaojun Jia, Yuan Xun, Simeng Qin, Xiaochun Cao
- Abstract summary: Vision-Language Models (VLMs) can infer users' locations from publicly shared images, posing a substantial risk to geoprivacy. We propose GeoShield, a novel adversarial framework designed for robust geoprivacy protection in real-world scenarios.
- Score: 48.78781663571235
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Vision-Language Models (VLMs) such as GPT-4o now demonstrate a remarkable ability to infer users' locations from publicly shared images, posing a substantial risk to geoprivacy. Although adversarial perturbations offer a potential defense, current methods are ill-suited to this scenario: they often perform poorly on high-resolution images and under low perturbation budgets, and they may introduce irrelevant semantic content. To address these limitations, we propose GeoShield, a novel adversarial framework designed for robust geoprivacy protection in real-world scenarios. GeoShield comprises three key modules: a feature disentanglement module that separates geographical from non-geographical information, an exposure element identification module that pinpoints geo-revealing regions within an image, and a scale-adaptive enhancement module that jointly optimizes perturbations at both global and local levels to ensure effectiveness across resolutions. Extensive experiments on challenging benchmarks show that GeoShield consistently surpasses prior methods in black-box settings, achieving strong privacy protection with minimal impact on visual or semantic quality. To our knowledge, this work is the first to explore adversarial perturbations for defending against geolocation inference by advanced VLMs, providing a practical and effective solution to escalating privacy concerns.
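To make the optimization concrete, here is a minimal sketch of the joint global/local perturbation scheme the abstract describes, in the spirit of PGD. The encoder, the use of cosine similarity on its features as a stand-in for the disentangled geographic features, the crop handling, and every hyperparameter are illustrative assumptions rather than the authors' implementation.

```python
# Hypothetical sketch of GeoShield-style joint global/local optimization.
import torch
import torch.nn.functional as F

def protect(image, encoder, geo_regions, eps=4/255, alpha=1/255, steps=50):
    """image: (1,3,H,W) in [0,1]; geo_regions: (y0,y1,x0,x1) boxes, assumed
    to come from an exposure-element identification step."""
    delta = torch.zeros_like(image, requires_grad=True)
    with torch.no_grad():
        clean = encoder(image).flatten(1)  # stand-in for disentangled geo features
    for _ in range(steps):
        adv = (image + delta).clamp(0, 1)
        # Global term: push the whole image's geo features away from the clean ones.
        loss = F.cosine_similarity(encoder(adv).flatten(1), clean).mean()
        # Local terms: re-apply the objective on geo-revealing crops so the
        # perturbation stays effective when high-resolution images are downscaled.
        for (y0, y1, x0, x1) in geo_regions:
            crop = F.interpolate(adv[:, :, y0:y1, x0:x1], size=image.shape[-2:],
                                 mode="bilinear", align_corners=False)
            with torch.no_grad():
                ref = encoder(F.interpolate(image[:, :, y0:y1, x0:x1],
                                            size=image.shape[-2:],
                                            mode="bilinear",
                                            align_corners=False)).flatten(1)
            loss = loss + F.cosine_similarity(encoder(crop).flatten(1), ref).mean()
        loss.backward()
        with torch.no_grad():
            delta -= alpha * delta.grad.sign()  # descend: drive similarity down
            delta.clamp_(-eps, eps)             # keep the perturbation budget small
        delta.grad.zero_()
    return (image + delta).clamp(0, 1).detach()
```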
Related papers
- Evaluation of Geolocation Capabilities of Multimodal Large Language Models and Analysis of Associated Privacy Risks [9.003350058345442]
MLLMs are capable of inferring the geographic location of images based solely on visual content. This poses serious risks of privacy invasion, including doxxing, surveillance, and other security threats. The most advanced visual models can successfully localize the origin of street-level imagery with up to 49% accuracy within a 1-kilometer radius.
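The 1-kilometer figure is an accuracy-within-radius metric: a prediction counts as correct if its great-circle distance to the ground truth is at most 1 km. A small sketch of how such a metric is typically computed (the function names are mine, not the paper's):

```python
# Sketch of an accuracy-within-radius metric like the paper's 1 km criterion.
# The haversine formula gives great-circle distance between two lat/lon points.
import math

def haversine_km(lat1, lon1, lat2, lon2):
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = math.radians(lat2 - lat1), math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def accuracy_within(preds, truths, radius_km=1.0):
    """preds, truths: lists of (lat, lon) pairs."""
    hits = sum(haversine_km(*p, *t) <= radius_km for p, t in zip(preds, truths))
    return hits / len(preds)
```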
arXiv Detail & Related papers (2025-06-30T03:05:30Z)
- Network Hexagons Under Attack: Secure Crowdsourcing of Geo-Referenced Data [0.0]
We propose an enhanced security architecture that combines public key infrastructure (PKI) with ephemeral certificates. Our solution guarantees user and device anonymity through randomized key rotation and adaptive geospatial resolution. Our results show that the required level of security can be achieved without increasing latency by more than 25% or reducing throughput by more than 7%.
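As a rough illustration of what ephemeral certificates with randomized rotation can look like, here is a sketch using Python's `cryptography` package; the certificate lifetime, the jittered rotation schedule, and the randomized common name are illustrative assumptions, not the paper's actual PKI design.

```python
# Sketch of ephemeral, anonymized credentials with randomized key rotation.
# Lifetime and jitter are illustrative; the paper's PKI details will differ.
import secrets
from datetime import datetime, timedelta, timezone
from cryptography import x509
from cryptography.x509.oid import NameOID
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec

def ephemeral_cert(lifetime_min=10):
    key = ec.generate_private_key(ec.SECP256R1())
    # Random common name so certificates cannot be linked across rotations.
    name = x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, secrets.token_hex(8))])
    now = datetime.now(timezone.utc)
    cert = (x509.CertificateBuilder()
            .subject_name(name).issuer_name(name)  # self-signed for brevity
            .public_key(key.public_key())
            .serial_number(x509.random_serial_number())
            .not_valid_before(now)
            .not_valid_after(now + timedelta(minutes=lifetime_min))
            .sign(key, hashes.SHA256()))
    # Randomized rotation: renew at a jittered fraction of the lifetime.
    next_rotation = now + timedelta(
        minutes=lifetime_min * (0.5 + secrets.randbelow(40) / 100))
    return key, cert, next_rotation
```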
arXiv Detail & Related papers (2025-06-05T21:27:10Z)
- Doxing via the Lens: Revealing Location-related Privacy Leakage on Multi-modal Large Reasoning Models [37.18986847375693]
Adversaries can infer sensitive geolocation information from user-generated images. DoxBench is a curated dataset of 500 real-world images reflecting diverse privacy scenarios. Our findings highlight the urgent need to reassess inference-time privacy risks in MLRMs.
arXiv Detail & Related papers (2025-04-27T22:26:45Z)
- Benchmarking the Spatial Robustness of DNNs via Natural and Adversarial Localized Corruptions [49.546479320670464]
This paper introduces specialized metrics for benchmarking the spatial robustness of segmentation models. We propose region-aware multi-attack adversarial analysis, a method that enables a deeper understanding of model robustness. The results reveal that models respond differently to the two types of threats, natural and adversarial localized corruptions.
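As one minimal rendering of what a localized-corruption probe can look like (not the paper's actual metrics), the sketch below corrupts a single region and measures the resulting per-class IoU drop:

```python
# Minimal sketch of probing spatial robustness with a localized corruption:
# corrupt one region, then compare segmentation IoU before and after.
import numpy as np

def corrupt_patch(img, box, sigma=0.2, rng=None):
    """img: (H,W,3) floats in [0,1]; box: (y0,y1,x0,x1) region to corrupt."""
    rng = rng or np.random.default_rng(0)
    y0, y1, x0, x1 = box
    out = img.copy()
    out[y0:y1, x0:x1] += rng.normal(0, sigma, out[y0:y1, x0:x1].shape)
    return out.clip(0, 1)

def iou(pred, gt, cls):
    p, g = pred == cls, gt == cls
    inter, union = (p & g).sum(), (p | g).sum()
    return inter / union if union else 1.0

# usage: drop = iou(model(img), gt, c) - iou(model(corrupt_patch(img, box)), gt, c)
```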
arXiv Detail & Related papers (2025-04-02T11:37:39Z)
- Swarm Intelligence in Geo-Localization: A Multi-Agent Large Vision-Language Model Collaborative Framework [51.26566634946208]
We introduce smileGeo, a novel visual geo-localization framework.
Through inter-agent communication, smileGeo integrates the inherent knowledge of its agents with additional retrieved information.
Results show that our approach significantly outperforms current state-of-the-art methods.
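The abstract gives no implementation detail, but inter-agent communication with a final vote might look roughly like the toy sketch below; `ask_agent` and `retrieve_evidence` are hypothetical stubs, not smileGeo's API.

```python
# Toy sketch of inter-agent geo-localization by proposal exchange and voting.
from collections import Counter

def ask_agent(agent, image, context):           # stub: one VLM agent's guess
    return agent(image, context)                # e.g. returns "Lisbon, Portugal"

def swarm_localize(agents, image, retrieve_evidence, rounds=2):
    context = ""
    for _ in range(rounds):
        guesses = [ask_agent(a, image, context) for a in agents]
        # Agents see each other's guesses plus retrieved evidence next round.
        context = "; ".join(guesses) + " | " + retrieve_evidence(guesses)
    return Counter(guesses).most_common(1)[0][0]  # simple majority vote
```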
arXiv Detail & Related papers (2024-08-21T03:31:30Z)
- Image-Based Geolocation Using Large Vision-Language Models [19.071551941682063]
We introduce tool, an innovative framework that significantly enhances image-based geolocation accuracy.
tool employs a systematic chain-of-thought (CoT) approach, mimicking human geoguessing strategies.
It achieves an impressive average score of 4550.5 in the GeoGuessr game, with an 85.37% win rate, and delivers highly precise geolocation predictions.
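For context on the 4550.5 figure: GeoGuessr awards up to 5000 points per round, decaying with distance from the true location. A commonly cited community approximation of the world-map scoring curve (not an official formula) is sketched below; under it, an average error of roughly 140 km would yield a score near the reported 4550.

```python
# GeoGuessr-style scoring sketch. The decay constant is a commonly cited
# community approximation for the world map, not an official formula.
import math

def geoguessr_score(distance_km, map_scale_km=1492.7):
    return round(5000 * math.exp(-distance_km / map_scale_km))

# e.g. geoguessr_score(140) ~= 4552, close to the paper's reported average
```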
arXiv Detail & Related papers (2024-08-18T13:39:43Z)
- Mind the Privacy Unit! User-Level Differential Privacy for Language Model Fine-Tuning [62.224804688233]
Differential privacy (DP) offers a promising solution by ensuring models are 'almost indistinguishable' with or without any particular privacy unit.
We study user-level DP, motivated by applications where it is necessary to ensure uniform privacy protection across users.
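User-level DP typically bounds each user's total contribution rather than each example's; a minimal sketch of per-user clipping plus Gaussian noise, with illustrative shapes and noise scale (not the paper's algorithm):

```python
# Sketch of user-level DP aggregation: clip each user's summed update, then
# add Gaussian noise calibrated to the per-user clip norm C.
import numpy as np

def private_user_mean(user_grads, clip_norm=1.0, noise_mult=1.1, rng=None):
    """user_grads: one aggregated gradient vector per user (the privacy unit)."""
    rng = rng or np.random.default_rng(0)
    clipped = []
    for g in user_grads:
        scale = min(1.0, clip_norm / (np.linalg.norm(g) + 1e-12))
        clipped.append(g * scale)          # each user contributes norm <= C
    total = np.sum(clipped, axis=0)
    noise = rng.normal(0, noise_mult * clip_norm, total.shape)
    return (total + noise) / len(user_grads)
```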
arXiv Detail & Related papers (2024-06-20T13:54:32Z)
- Selective, Interpretable, and Motion Consistent Privacy Attribute Obfuscation for Action Recognition [18.93148005536135]
Existing approaches often suffer from obfuscation being applied globally and from a lack of interpretability.
We propose a solution: human-selected privacy templates that yield interpretability by design, and an obfuscation scheme that selectively hides attributes while also inducing temporal consistency.
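A minimal rendering of selective, mask-based obfuscation with a temporally tracked mask might look like the sketch below; the Gaussian blur and the mask source are assumptions, not the paper's scheme.

```python
# Sketch of selective obfuscation: blur only templated privacy regions and
# reuse a tracked mask across frames for temporal consistency.
import numpy as np
from scipy.ndimage import gaussian_filter

def obfuscate_clip(frames, masks, sigma=8):
    """frames: (T,H,W,3) floats; masks: (T,H,W) booleans from a privacy
    template (e.g. faces), assumed temporally tracked rather than
    re-estimated independently per frame."""
    out = frames.copy()
    for t in range(len(frames)):
        blurred = gaussian_filter(frames[t], sigma=(sigma, sigma, 0))
        m = masks[t][..., None]            # hide attributes only inside the mask
        out[t] = np.where(m, blurred, frames[t])
    return out
```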
arXiv Detail & Related papers (2024-03-19T13:17:26Z)
- Progressive Feature Self-reinforcement for Weakly Supervised Semantic Segmentation [55.69128107473125]
We propose a single-stage approach for Weakly Supervised Semantic Segmentation (WSSS) with image-level labels.
We adaptively partition the image content into deterministic regions (e.g., confident foreground and background) and uncertain regions (e.g., object boundaries and misclassified categories) for separate processing.
Building upon this, we introduce a complementary self-enhancement method that constrains the semantic consistency between these confident regions and an augmented image with the same class labels.
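The partitioning step can be pictured as thresholding a class activation map with two confidence cutoffs; the sketch below is illustrative (the thresholds and the CAM source are assumptions):

```python
# Sketch of partitioning an image into confident foreground/background and an
# uncertain band from a normalized CAM; threshold values are illustrative.
import numpy as np

def partition_regions(cam, lo=0.25, hi=0.6):
    """cam: (H,W) activation in [0,1] for one image-level class label."""
    fg = cam >= hi                  # deterministic foreground
    bg = cam <= lo                  # deterministic background
    uncertain = ~(fg | bg)          # object boundaries / ambiguous pixels
    return fg, bg, uncertain
```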
arXiv Detail & Related papers (2023-12-14T13:21:52Z)
- Shielding the Unseen: Privacy Protection through Poisoning NeRF with Spatial Deformation [59.302770084115814]
We introduce an innovative method of safeguarding user privacy against the generative capabilities of Neural Radiance Fields (NeRF) models.
Our novel poisoning attack method induces changes to observed views that are imperceptible to the human eye, yet potent enough to disrupt NeRF's ability to accurately reconstruct a 3D scene.
We extensively test our approach on two common NeRF benchmark datasets consisting of 29 real-world scenes with high-quality images.
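The deformation is only described at a high level; a small, smooth warp of training views, as sketched below with illustrative magnitudes, conveys the idea of changes that are hard to see yet break the multi-view consistency NeRF relies on.

```python
# Sketch of a small, smooth spatial deformation applied to NeRF training
# views; warp magnitude and smoothness are illustrative, not the paper's.
import torch
import torch.nn.functional as F

def deform_view(img, max_shift=0.004, cells=4, seed=0):
    """img: (1,3,H,W) in [0,1]. A low-resolution random flow, upsampled so it
    is smooth, shifts sampling coordinates by at most max_shift (in [-1,1]
    grid units): near-imperceptible per view, inconsistent across views."""
    torch.manual_seed(seed)
    _, _, h, w = img.shape
    flow = (torch.rand(1, 2, cells, cells) - 0.5) * 2 * max_shift
    flow = F.interpolate(flow, size=(h, w), mode="bilinear", align_corners=True)
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h),
                            torch.linspace(-1, 1, w), indexing="ij")
    grid = torch.stack((xs, ys), dim=-1).unsqueeze(0)   # base sampling grid
    grid = grid + flow.permute(0, 2, 3, 1)              # add the smooth warp
    return F.grid_sample(img, grid, mode="bilinear", align_corners=True,
                         padding_mode="border")
```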
arXiv Detail & Related papers (2023-10-04T19:35:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.