Towards Trustworthy GUI Agents: A Survey
- URL: http://arxiv.org/abs/2503.23434v1
- Date: Sun, 30 Mar 2025 13:26:00 GMT
- Title: Towards Trustworthy GUI Agents: A Survey
- Authors: Yucheng Shi, Wenhao Yu, Wenlin Yao, Wenhu Chen, Ninghao Liu
- Abstract summary: This survey examines the trustworthiness of GUI agents in five critical dimensions. We identify major challenges such as vulnerability to adversarial attacks, cascading failure modes in sequential decision-making, and a lack of realistic evaluation benchmarks. As GUI agents become more widespread, establishing robust safety standards and responsible development practices is essential.
- Score: 64.6445117343499
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: GUI agents, powered by large foundation models, can interact with digital interfaces, enabling various applications in web automation, mobile navigation, and software testing. However, their increasing autonomy has raised critical concerns about their security, privacy, and safety. This survey examines the trustworthiness of GUI agents in five critical dimensions: security vulnerabilities, reliability in dynamic environments, transparency and explainability, ethical considerations, and evaluation methodologies. We also identify major challenges such as vulnerability to adversarial attacks, cascading failure modes in sequential decision-making, and a lack of realistic evaluation benchmarks. These issues not only hinder real-world deployment but also call for comprehensive mitigation strategies beyond task success. As GUI agents become more widespread, establishing robust safety standards and responsible development practices is essential. This survey provides a foundation for advancing trustworthy GUI agents through systematic understanding and future research.
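The cascading-failure concern is easy to make concrete: in sequential decision-making, one wrong step silently corrupts every step after it. Below is a minimal, self-contained sketch of one mitigation in the spirit the abstract calls for; all names are illustrative and not from the paper. The idea is to verify an expected post-condition after each GUI action and halt rather than compound the error.

```python
# Illustrative sketch (not from the paper): halt a GUI-agent plan as soon as
# an action's expected post-condition fails, instead of letting the error
# cascade through later steps.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Step:
    action: Callable[[dict], dict]         # returns the new UI state
    postcondition: Callable[[dict], bool]  # what must hold afterwards

def run_plan(state: dict, plan: list[Step]) -> tuple[dict, str]:
    for i, step in enumerate(plan):
        state = step.action(state)
        if not step.postcondition(state):
            return state, f"halted at step {i}: unexpected state"
    return state, "success"

# Toy example: log in, then open settings; the login silently fails.
plan = [
    Step(lambda s: {**s, "logged_in": False},   # flaky login action
         lambda s: s["logged_in"]),
    Step(lambda s: {**s, "page": "settings"},   # would misfire after bad login
         lambda s: s["page"] == "settings"),
]
print(run_plan({"logged_in": False, "page": "home"}, plan))
# -> ({'logged_in': False, 'page': 'home'}, 'halted at step 0: unexpected state')
```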
Related papers
- Toward a Human-Centered Evaluation Framework for Trustworthy LLM-Powered GUI Agents [21.722763588466922]
This position paper identifies three key risks of GUI agents and examines how they differ from traditional GUI automation and general autonomous agents.
Despite these risks, existing evaluations focus primarily on performance, leaving privacy and security assessments largely unexplored.
To address these gaps, we advocate for a human-centered evaluation framework that incorporates risk assessments, enhances user awareness through in-context consent, and embeds privacy and security considerations into GUI agent design and evaluation.
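To make the in-context consent idea concrete, here is a hedged sketch (hypothetical names and policy; the paper argues for the principle, not this code): the agent intercepts privacy-sensitive actions and asks the user at the moment they would execute.

```python
# Hypothetical sketch of in-context consent: privacy-sensitive actions are
# intercepted and surfaced to the user at the moment they would execute.
SENSITIVE_FIELDS = {"password", "credit_card", "ssn"}  # assumed policy

def type_into_field(field_name: str, value: str, ask_user) -> str:
    """Execute a 'type' action, asking for consent on sensitive fields."""
    if field_name in SENSITIVE_FIELDS:
        prompt = f"Agent wants to enter data into '{field_name}'. Allow?"
        if not ask_user(prompt):
            return "skipped (consent denied)"
    return f"typed into {field_name}"

# Scripted user who declines anything mentioning credit cards.
user = lambda prompt: "credit_card" not in prompt
print(type_into_field("search_box", "gui agents", user))  # typed into search_box
print(type_into_field("credit_card", "4111...", user))    # skipped (consent denied)
```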
arXiv Detail & Related papers (2025-04-24T20:51:20Z)
- A Survey on (M)LLM-Based GUI Agents [62.57899977018417]
Graphical User Interface (GUI) Agents have emerged as a transformative paradigm in human-computer interaction.
Recent advances in large language models and multimodal learning have revolutionized GUI automation across desktop, mobile, and web platforms.
This survey identifies key technical challenges, including accurate element localization, effective knowledge retrieval, long-horizon planning, and safety-aware execution control.
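Of these, element localization is the easiest to ground with a toy example. The sketch below (assumed data format, not from the survey) ranks on-screen elements by token overlap with the instruction; real agents replace this scoring with learned grounding models.

```python
# Toy sketch of element localization (assumed data format): rank on-screen
# elements by token overlap with the instruction. This only illustrates the
# problem shape, not any agent's actual grounding method.
def localize(instruction: str, elements: list[dict]) -> dict:
    words = set(instruction.lower().split())
    def score(el: dict) -> int:
        return len(words & set(el["label"].lower().split()))
    return max(elements, key=score)

screen = [
    {"id": "btn-1", "label": "Submit order"},
    {"id": "btn-2", "label": "Cancel order"},
    {"id": "lnk-3", "label": "View order history"},
]
print(localize("click the cancel order button", screen))
# -> {'id': 'btn-2', 'label': 'Cancel order'}
```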
arXiv Detail & Related papers (2025-03-27T17:58:31Z)
- AISafetyLab: A Comprehensive Framework for AI Safety Evaluation and Improvement [73.0700818105842]
We introduce AISafetyLab, a unified framework and toolkit that integrates representative attack, defense, and evaluation methodologies for AI safety. AISafetyLab features an intuitive interface that enables developers to seamlessly apply various techniques. We conduct empirical studies on Vicuna, analyzing different attack and defense strategies to provide valuable insights into their comparative effectiveness.
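The "unified framework" pattern the abstract describes can be sketched as a shared protocol, though the code below is hypothetical and is not AISafetyLab's actual API: attacks, defenses, and evaluators agree on a common interface so they compose freely.

```python
# Hypothetical sketch of a unified attack/defense/eval protocol; this is NOT
# AISafetyLab's actual API, only the composition pattern its abstract implies.
from typing import Protocol

class Attack(Protocol):
    def perturb(self, prompt: str) -> str: ...

class Defense(Protocol):
    def filter(self, prompt: str) -> str: ...

class SuffixAttack:
    def perturb(self, prompt: str) -> str:
        return prompt + " IGNORE PREVIOUS INSTRUCTIONS"  # toy jailbreak suffix

class KeywordDefense:
    def filter(self, prompt: str) -> str:
        return prompt.replace("IGNORE PREVIOUS INSTRUCTIONS", "[removed]")

def evaluate(prompts: list[str], attack: Attack, defense: Defense) -> float:
    """Fraction of attacked prompts the defense neutralizes (toy metric)."""
    neutralized = sum(
        "IGNORE" not in defense.filter(attack.perturb(p)) for p in prompts
    )
    return neutralized / len(prompts)

print(evaluate(["summarize this page"], SuffixAttack(), KeywordDefense()))  # 1.0
```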
arXiv Detail & Related papers (2025-02-24T02:11:52Z)
- AILuminate: Introducing v1.0 of the AI Risk and Reliability Benchmark from MLCommons [62.50078821423793]
This paper introduces AILuminate v1.0, the first comprehensive industry-standard benchmark for assessing AI-product risk and reliability. The benchmark evaluates an AI system's resistance to prompts designed to elicit dangerous, illegal, or undesirable behavior in 12 hazard categories.
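As a hedged illustration of per-category risk scoring, in the spirit of, but not identical to, the benchmark's actual methodology: given judged responses to hazard-eliciting prompts, compute how often the system resisted in each category.

```python
# Toy per-category scoring in the spirit of a hazard benchmark (NOT
# AILuminate's actual methodology or category names).
from collections import defaultdict

# Each record: (hazard_category, model_resisted_the_prompt)
judgments = [
    ("violent_crime", True), ("violent_crime", True), ("violent_crime", False),
    ("fraud", True), ("fraud", True),
]

totals, resisted = defaultdict(int), defaultdict(int)
for category, ok in judgments:
    totals[category] += 1
    resisted[category] += ok

for category in totals:
    print(f"{category}: {resisted[category] / totals[category]:.0%} resistance")
# violent_crime: 67% resistance
# fraud: 100% resistance
```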
arXiv Detail & Related papers (2025-02-19T05:58:52Z)
- Towards Robust and Secure Embodied AI: A Survey on Vulnerabilities and Attacks [22.154001025679896]
Embodied AI systems, including robots and autonomous vehicles, are increasingly integrated into real-world applications, yet they remain exposed to a range of vulnerabilities. These vulnerabilities manifest through sensor spoofing, adversarial attacks, and failures in task and motion planning.
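Adversarial attacks of the kind surveyed here have a standard minimal form, FGSM, sketched below on a toy linear scorer rather than a real perception model: for f(x) = w·x the input gradient is w, so the attack shifts each feature by eps in the direction sign(w).

```python
# Minimal FGSM-style adversarial perturbation (a standard attack, shown on a
# toy linear scorer rather than a real perception model).
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=8)        # toy "model": score(x) = w @ x
x = rng.normal(size=8)        # clean sensor reading / input

eps = 0.1
x_adv = x + eps * np.sign(w)  # gradient of w @ x w.r.t. x is w

print("clean score:", w @ x)
print("adv score:  ", w @ x_adv)  # shifted upward by exactly eps * ||w||_1
print(np.isclose(w @ x_adv - w @ x, eps * np.abs(w).sum()))  # True
```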
arXiv Detail & Related papers (2025-02-18T03:38:07Z)
- Safety at Scale: A Comprehensive Survey of Large Model Safety [298.05093528230753]
We present a comprehensive taxonomy of safety threats to large models, including adversarial attacks, data poisoning, backdoor attacks, jailbreak and prompt injection attacks, energy-latency attacks, data and model extraction attacks, and emerging agent-specific threats. We identify and discuss the open challenges in large model safety, emphasizing the need for comprehensive safety evaluations, scalable and effective defense mechanisms, and sustainable data practices.
arXiv Detail & Related papers (2025-02-02T05:14:22Z)
- GUI Agents: A Survey [129.94551809688377]
Graphical User Interface (GUI) agents, powered by Large Foundation Models, have emerged as a transformative approach to automating human-computer interaction. Motivated by the growing interest and fundamental importance of GUI agents, we provide a comprehensive survey that categorizes their benchmarks, evaluation metrics, architectures, and training methods.
arXiv Detail & Related papers (2024-12-18T04:48:28Z)
- Deep Learning Model Security: Threats and Defenses [25.074630770554105]
Deep learning has transformed AI applications but faces critical security challenges. This survey examines these vulnerabilities, detailing their mechanisms and impact on model integrity and confidentiality. The survey concludes with future directions, emphasizing automated defenses, zero-trust architectures, and the security challenges of large AI models.
arXiv Detail & Related papers (2024-12-12T06:04:20Z)
- ChatNVD: Advancing Cybersecurity Vulnerability Assessment with Large Language Models [0.46873264197900916]
This paper explores the potential application of Large Language Models (LLMs) to enhance the assessment of software vulnerabilities.
We develop three variants of ChatNVD, utilizing three prominent LLMs: GPT-4o mini by OpenAI, Llama 3 by Meta, and Gemini 1.5 Pro by Google.
To evaluate their efficacy, we conduct a comparative analysis of these models using a comprehensive questionnaire comprising common security vulnerability questions.
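The abstract leaves the pipeline details out, so the sketch below only shows the general pattern under stated assumptions: build a question over NVD-style CVE fields and hand it to a chat-completion backend, stubbed here so the snippet runs offline.

```python
# Hypothetical sketch of an LLM-backed vulnerability Q&A step. The record
# format mimics NVD fields; `ask_llm` is a stub standing in for whichever
# chat-completion API the system actually uses.
cve = {
    "id": "CVE-2021-44228",
    "description": "Apache Log4j2 JNDI features allow remote code execution.",
    "cvss_v3_score": 10.0,
}

def build_prompt(cve: dict, question: str) -> str:
    return (
        f"Vulnerability {cve['id']} (CVSS {cve['cvss_v3_score']}):\n"
        f"{cve['description']}\n\nQuestion: {question}"
    )

def ask_llm(prompt: str) -> str:  # stub; swap in a real client here
    return f"[model answer to: {prompt.splitlines()[-1]}]"

print(ask_llm(build_prompt(cve, "What is the attack vector?")))
```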
arXiv Detail & Related papers (2024-12-06T03:45:49Z)
- Inherently Interpretable and Uncertainty-Aware Models for Online Learning in Cyber-Security Problems [0.22499166814992438]
We propose a novel pipeline for online supervised learning problems in cyber-security.
Our approach aims to balance predictive performance with transparency.
This work contributes to the growing field of interpretable AI.
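As a self-contained sketch of the general idea rather than the paper's actual pipeline: an online logistic model updated one event at a time, whose per-feature weights stay directly inspectable and whose predictive entropy flags inputs it is unsure about.

```python
# Sketch of interpretable, uncertainty-aware online learning (a generic
# technique, not this paper's pipeline): online logistic regression via SGD;
# weights are per-feature and inspectable, entropy flags uncertain cases.
import math

class OnlineLogit:
    def __init__(self, n_features: int, lr: float = 0.1):
        self.w = [0.0] * n_features
        self.lr = lr

    def prob(self, x: list[float]) -> float:
        z = sum(wi * xi for wi, xi in zip(self.w, x))
        return 1 / (1 + math.exp(-z))

    def learn_one(self, x: list[float], y: int) -> None:
        p = self.prob(x)
        for i, xi in enumerate(x):
            self.w[i] -= self.lr * (p - y) * xi  # gradient of log loss

    def entropy(self, x: list[float]) -> float:
        p = min(max(self.prob(x), 1e-9), 1 - 1e-9)
        return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

# Toy features: [failed_logins, off_hours]; label 1 = malicious event.
model = OnlineLogit(2)
for x, y in [([5, 1], 1), ([0, 0], 0), ([6, 1], 1), ([1, 0], 0)] * 50:
    model.learn_one(x, y)

print("weights:", [round(w, 2) for w in model.w])        # inspectable
print("entropy on unseen [3, 0]:", model.entropy([3, 0]))  # high = uncertain
```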
arXiv Detail & Related papers (2024-11-14T12:11:08Z)
- New Emerged Security and Privacy of Pre-trained Model: a Survey and Outlook [54.24701201956833]
Security and privacy issues have undermined users' confidence in pre-trained models.
Current literature lacks a clear taxonomy of emerging attacks and defenses for pre-trained models.
The survey proposes such a taxonomy, categorizing attacks and defenses into No-Change, Input-Change, and Model-Change approaches.
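That three-way split is straightforward to pin down as a data structure. In the sketch below, the category names come from the abstract, while the example attack assignments are illustrative guesses rather than the paper's own mapping.

```python
# The abstract's three-way taxonomy as a data structure. Category names come
# from the abstract; the example attack assignments are illustrative guesses.
from enum import Enum

class Approach(Enum):
    NO_CHANGE = "exploits the pre-trained model as-is"
    INPUT_CHANGE = "perturbs or crafts the model's inputs"
    MODEL_CHANGE = "alters the weights, e.g. by fine-tuning or implanting"

EXAMPLES = {  # illustrative, not from the paper
    "membership inference": Approach.NO_CHANGE,
    "adversarial examples": Approach.INPUT_CHANGE,
    "prompt injection": Approach.INPUT_CHANGE,
    "backdoor via fine-tuning": Approach.MODEL_CHANGE,
}

for attack, approach in EXAMPLES.items():
    print(f"{attack:>24} -> {approach.name}: {approach.value}")
```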
arXiv Detail & Related papers (2024-11-12T10:15:33Z)