GUIGuard: Toward a General Framework for Privacy-Preserving GUI Agents
- URL: http://arxiv.org/abs/2601.18842v2
- Date: Thu, 29 Jan 2026 01:37:29 GMT
- Title: GUIGuard: Toward a General Framework for Privacy-Preserving GUI Agents
- Authors: Yanxi Wang, Zhiling Zhang, Wenbo Zhou, Weiming Zhang, Jie Zhang, Qiannan Zhu, Yu Shi, Shuxin Zheng, Jiyan He
- Abstract summary: GUIs expose richer, more accessible private information, and privacy risks depend on interaction trajectories across sequential scenes. We propose a three-stage framework for privacy-preserving GUI agents: privacy recognition, privacy protection, and task execution under protection. Our results highlight privacy recognition as a critical bottleneck for practical GUI agents.
- Score: 38.42792282309646
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: GUI agents enable end-to-end automation through direct perception of and interaction with on-screen interfaces. However, these agents frequently access interfaces containing sensitive personal information, and screenshots are often transmitted to remote models, creating substantial privacy risks. These risks are particularly severe in GUI workflows: GUIs expose richer, more accessible private information, and privacy risks depend on interaction trajectories across sequential scenes. We propose GUIGuard, a three-stage framework for privacy-preserving GUI agents: (1) privacy recognition, (2) privacy protection, and (3) task execution under protection. We further construct GUIGuard-Bench, a cross-platform benchmark with 630 trajectories and 13,830 screenshots, annotated with region-level privacy grounding and fine-grained labels of risk level, privacy category, and task necessity. Evaluations reveal that existing agents exhibit limited privacy recognition, with state-of-the-art models achieving only 13.3% accuracy on Android and 1.4% on PC. Under privacy protection, task-planning semantics can still be maintained, with closed-source models showing stronger semantic consistency than open-source ones. Case studies on MobileWorld show that carefully designed protection strategies achieve higher task accuracy while preserving privacy. Our results highlight privacy recognition as a critical bottleneck for practical GUI agents. Project: https://futuresis.github.io/GUIGuard-page/
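The three-stage framework from the abstract (privacy recognition, privacy protection, task execution under protection) can be sketched as a minimal pipeline. This is an illustrative toy, not the paper's implementation: the regex detector stands in for the model-based recognizer the paper evaluates, and all function and variable names (`recognize`, `protect`, `task_necessary_cats`) are hypothetical. The masking step reflects the benchmark's "task necessity" label, masking only fields the task does not require.

```python
import re

# Stage 1: privacy recognition -- a toy regex detector standing in for
# the model-based recognizer evaluated in the paper (assumption).
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "phone": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}

def recognize(texts):
    """Return (category, value) pairs found in OCR'd screen text."""
    hits = []
    for text in texts:
        for cat, pat in PATTERNS.items():
            hits += [(cat, m) for m in pat.findall(text)]
    return hits

# Stage 2: privacy protection -- mask values whose category is not
# needed for the task; keep task-necessary values visible.
def protect(hits, task_necessary_cats):
    protected = []
    for cat, value in hits:
        if cat in task_necessary_cats:
            protected.append((cat, value))            # needed by the agent
        else:
            protected.append((cat, f"<{cat.upper()}>"))  # placeholder
    return protected

# Stage 3: task execution under protection -- only the protected view
# would be sent to the (remote) agent model.
screen_texts = ["Contact: alice@example.com", "Support: +1 415-555-0100"]
protected_view = protect(recognize(screen_texts), task_necessary_cats={"email"})
print(protected_view)
# → [('email', 'alice@example.com'), ('phone', '<PHONE>')]
```

The key design point the abstract emphasizes is stage 1: if recognition misses a sensitive region, the later stages never see it, which is why recognition accuracy (13.3% on Android, 1.4% on PC for state-of-the-art models) is the bottleneck.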
Related papers
- Anonymization-Enhanced Privacy Protection for Mobile GUI Agents: Available but Invisible [12.742325129012576]
Mobile Graphical User Interface (GUI) agents have demonstrated strong capabilities in automating complex smartphone tasks. We propose an anonymization-based privacy protection framework that enforces the principle of available-but-invisible access to sensitive data. Our system detects sensitive UI content using a PII-aware recognition model and replaces it with deterministic, type-preserving placeholders.
arXiv Detail & Related papers (2026-02-08T15:50:04Z)
- Beyond Blanket Masking: Examining Granularity for Privacy Protection in Images Captured by Blind and Low Vision Users [23.61740342584077]
We propose FiG-Priv, a fine-grained privacy protection framework that selectively masks only high-risk private information. Our approach integrates fine-grained segmentation with a data-driven risk scoring mechanism. We evaluate our framework using the BIV-Priv-Seg dataset and show that FiG-Priv preserves 26% more of the image content.
arXiv Detail & Related papers (2025-08-12T17:56:36Z)
- VisualTrap: A Stealthy Backdoor Attack on GUI Agents via Visual Grounding Manipulation [73.92237451442752]
This work reveals that the visual grounding of GUI agents, which maps textual plans to GUI elements, can introduce vulnerabilities. With a backdoor attack targeting visual grounding, the agent's behavior can be compromised even when it is given correct task-solving plans. We propose VisualTrap, a method that hijacks grounding by misleading the agent into locating textual plans at trigger locations instead of the intended targets.
arXiv Detail & Related papers (2025-07-09T14:36:00Z)
- The Obvious Invisible Threat: LLM-Powered GUI Agents' Vulnerability to Fine-Print Injections [21.322212760700957]
A Large Language Model (LLM)-powered GUI agent is a specialized autonomous system that performs tasks on the user's behalf according to high-level instructions. To complete real-world tasks, such as filling forms or booking services, GUI agents often need to process and act on sensitive user data. Fine-print injection attacks often exploit the discrepancy between what is visually salient to agents and what is salient to human users.
arXiv Detail & Related papers (2025-04-15T15:21:09Z)
- PersGuard: Preventing Malicious Personalization via Backdoor Attacks on Pre-trained Text-to-Image Diffusion Models [51.458089902581456]
We introduce PersGuard, a novel backdoor-based approach that prevents malicious personalization of specific images. Our method significantly outperforms existing techniques, offering a more robust solution for privacy and copyright protection.
arXiv Detail & Related papers (2025-02-22T09:47:55Z)
- PrivacyLens: Evaluating Privacy Norm Awareness of Language Models in Action [54.11479432110771]
PrivacyLens is a novel framework designed to extend privacy-sensitive seeds into expressive vignettes and further into agent trajectories. We instantiate PrivacyLens with a collection of privacy norms grounded in privacy literature and crowdsourced seeds. State-of-the-art LMs, like GPT-4 and Llama-3-70B, leak sensitive information in 25.68% and 38.69% of cases, even when prompted with privacy-enhancing instructions.
arXiv Detail & Related papers (2024-08-29T17:58:38Z)
- Privacy-preserving Optics for Enhancing Protection in Face De-identification [60.110274007388135]
We propose a hardware-level face de-identification method to solve this vulnerability.
We also propose an anonymization framework that generates a new face using the privacy-preserving image, face heatmap, and a reference face image from a public dataset as input.
arXiv Detail & Related papers (2024-03-31T19:28:04Z)
- A New Hope: Contextual Privacy Policies for Mobile Applications and An Approach Toward Automated Generation [19.578130824867596]
The aim of contextual privacy policies (CPPs) is to fragment privacy policies into concise snippets, displaying them only within the corresponding contexts in the application's graphical user interfaces (GUIs).
In this paper, we first formulate CPPs in the mobile application scenario, and then present a novel multimodal framework, named SeePrivacy, designed to automatically generate CPPs for mobile applications.
A human evaluation shows that 77% of the extracted privacy policy segments were perceived as well-aligned with the detected contexts.
arXiv Detail & Related papers (2024-02-22T13:32:33Z)
- Can Language Models be Instructed to Protect Personal Information? [30.187731765653428]
We introduce PrivQA -- a benchmark to assess the privacy/utility trade-off when a model is instructed to protect specific categories of personal information in a simulated scenario.
We find that adversaries can easily circumvent these protections with simple jailbreaking methods through textual and/or image inputs.
We believe PrivQA has the potential to support the development of new models with improved privacy protections, as well as the adversarial robustness of these protections.
arXiv Detail & Related papers (2023-10-03T17:30:33Z)
- SPAct: Self-supervised Privacy Preservation for Action Recognition [73.79886509500409]
Existing approaches for mitigating privacy leakage in action recognition require privacy labels along with the action labels from the video dataset.
Recent developments of self-supervised learning (SSL) have unleashed the untapped potential of the unlabeled data.
We present a novel training framework which removes privacy information from input video in a self-supervised manner without requiring privacy labels.
arXiv Detail & Related papers (2022-03-29T02:56:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.