Measuring the Security of Mobile LLM Agents under Adversarial Prompts from Untrusted Third-Party Channels
- URL: http://arxiv.org/abs/2510.27140v2
- Date: Thu, 06 Nov 2025 03:52:08 GMT
- Title: Measuring the Security of Mobile LLM Agents under Adversarial Prompts from Untrusted Third-Party Channels
- Authors: Chenghao Du, Quanfeng Huang, Tingxuan Tang, Zihao Wang, Adwait Nadkarni, Yue Xiao,
- Abstract summary: Large Language Models (LLMs) have transformed software development, enabling AI-powered applications that promise to automate tasks across diverse apps and vectors.<n>Yet, the security implications of deploying such agents in adversarial mobile environments remain poorly understood.<n>We present the first systematic study of security risks in mobile LLM agents.
- Score: 20.06531064708478
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Large Language Models (LLMs) have transformed software development, enabling AI-powered applications known as LLM-based agents that promise to automate tasks across diverse apps and workflows. Yet, the security implications of deploying such agents in adversarial mobile environments remain poorly understood. In this paper, we present the first systematic study of security risks in mobile LLM agents. We design and evaluate a suite of adversarial case studies, ranging from opportunistic manipulations such as pop-up advertisements to advanced, end-to-end workflows involving malware installation and cross-app data exfiltration. Our evaluation covers eight state-of-the-art mobile agents across three architectures, with over 2,000 adversarial and paired benign trials. The results reveal systemic vulnerabilities: low-barrier vectors such as fraudulent ads succeed with over 80% reliability, while even workflows requiring the circumvention of operating-system warnings, such as malware installation, are consistently completed by advanced multi-app agents. By mapping these attacks to the MITRE ATT&CK Mobile framework, we uncover novel privilege-escalation and persistence pathways unique to LLM-driven automation. Collectively, our findings provide the first end-to-end evidence that mobile LLM agents are exploitable in realistic adversarial settings, where untrusted third-party channels (e.g., ads, embedded webviews, cross-app notifications) are an inherent part of the mobile ecosystem.
Related papers
- Okara: Detection and Attribution of TLS Man-in-the-Middle Vulnerabilities in Android Apps with Foundation Models [3.9807330903947378]
Transport Layer Security (TLS) is fundamental to secure online communication.<n>Man-in-the-Middle (MitM) attacks remain a pervasive threat in Android apps.<n>We present Okara, a framework that automates the detection and attribution of MitM Vulnerabilities.
arXiv Detail & Related papers (2026-01-30T09:49:09Z) - Odysseus: Jailbreaking Commercial Multimodal LLM-integrated Systems via Dual Steganography [77.44136793431893]
We propose a novel jailbreak paradigm that introduces dual steganography to covertly embed malicious queries into benign-looking images.<n>Our Odysseus successfully jailbreaks several pioneering and realistic MLLM-integrated systems, achieving up to 99% attack success rate.
arXiv Detail & Related papers (2025-12-23T08:53:36Z) - OS-Sentinel: Towards Safety-Enhanced Mobile GUI Agents via Hybrid Validation in Realistic Workflows [77.95511352806261]
Computer-using agents powered by Vision-Language Models (VLMs) have demonstrated human-like capabilities in operating digital environments like mobile platforms.<n>We propose OS-Sentinel, a novel hybrid safety detection framework that combines a Formal Verifier for detecting explicit system-level violations with a Contextual Judge for assessing contextual risks and agent actions.
arXiv Detail & Related papers (2025-10-28T13:22:39Z) - AI Kill Switch for malicious web-based LLM agent [4.144114850905779]
We propose an AI Kill Switch technique that can halt the operation of malicious web-based LLM agents.<n>Key idea is generating defensive prompts that trigger the safety mechanisms of malicious LLM agents.<n>AutoGuard achieves over 80% Defense Success Rate (DSR) across diverse malicious agents.
arXiv Detail & Related papers (2025-09-26T02:20:46Z) - BlockA2A: Towards Secure and Verifiable Agent-to-Agent Interoperability [8.539128225018489]
BlockA2A is a unified multi-agent trust framework for agent-to-agent interoperability.<n>It eliminates centralized trust bottlenecks, ensures message authenticity and execution integrity, and guarantees accountability across agent interactions.<n>It neutralizes attacks through real-time mechanisms, including Byzantine agent flagging, reactive execution halting, and instant permission revocation.
arXiv Detail & Related papers (2025-08-02T11:59:21Z) - The Dark Side of LLMs: Agent-based Attacks for Complete Computer Takeover [0.18472148461613155]
Large Language Model (LLM) agents and multi-agent systems introduce unprecedented security vulnerabilities.<n>This paper presents a comprehensive evaluation of the security of LLMs used as reasoning engines within autonomous agents.<n>We focus on how different attack surfaces and trust boundaries can be leveraged to orchestrate such takeovers.
arXiv Detail & Related papers (2025-07-09T13:54:58Z) - OpenAgentSafety: A Comprehensive Framework for Evaluating Real-World AI Agent Safety [58.201189860217724]
We introduce OpenAgentSafety, a comprehensive framework for evaluating agent behavior across eight critical risk categories.<n>Unlike prior work, our framework evaluates agents that interact with real tools, including web browsers, code execution environments, file systems, bash shells, and messaging platforms.<n>It combines rule-based analysis with LLM-as-judge assessments to detect both overt and subtle unsafe behaviors.
arXiv Detail & Related papers (2025-07-08T16:18:54Z) - LLM Agents Should Employ Security Principles [60.03651084139836]
This paper argues that the well-established design principles in information security should be employed when deploying Large Language Model (LLM) agents at scale.<n>We introduce AgentSandbox, a conceptual framework embedding these security principles to provide safeguards throughout an agent's life-cycle.
arXiv Detail & Related papers (2025-05-29T21:39:08Z) - SafeAgent: Safeguarding LLM Agents via an Automated Risk Simulator [77.86600052899156]
Large Language Model (LLM)-based agents are increasingly deployed in real-world applications.<n>We propose AutoSafe, the first framework that systematically enhances agent safety through fully automated synthetic data generation.<n>We show that AutoSafe boosts safety scores by 45% on average and achieves a 28.91% improvement on real-world tasks.
arXiv Detail & Related papers (2025-05-23T10:56:06Z) - From Assistants to Adversaries: Exploring the Security Risks of Mobile LLM Agents [17.62574693254363]
We present the first comprehensive security analysis of mobile large language models (LLMs)<n>We identify security threats across three core capability dimensions: language-based reasoning, GUI-based interaction, and system-level execution.<n>Our analysis reveals 11 distinct attack surfaces, all rooted in the unique capabilities and interaction patterns of mobile LLM agents.
arXiv Detail & Related papers (2025-05-19T11:17:46Z) - AgentVigil: Generic Black-Box Red-teaming for Indirect Prompt Injection against LLM Agents [54.29555239363013]
We propose a generic black-box fuzzing framework, AgentVigil, to automatically discover and exploit indirect prompt injection vulnerabilities.<n>We evaluate AgentVigil on two public benchmarks, AgentDojo and VWA-adv, where it achieves 71% and 70% success rates against agents based on o3-mini and GPT-4o.<n>We apply our attacks in real-world environments, successfully misleading agents to navigate to arbitrary URLs, including malicious sites.
arXiv Detail & Related papers (2025-05-09T07:40:17Z) - Systematic Categorization, Construction and Evaluation of New Attacks against Multi-modal Mobile GUI Agents [16.559272781032632]
We present a systematic security investigation of multi-modal mobile GUI agents, addressing this critical gap in the existing literature.<n>Our contributions are twofold: (1) we propose a novel threat modeling methodology, leading to the discovery and feasibility analysis of 34 previously unreported attacks, and (2) we design an attack framework to systematically construct and evaluate these threats.
arXiv Detail & Related papers (2024-07-12T14:30:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.