AEIA-MN: Evaluating the Robustness of Multimodal LLM-Powered Mobile Agents Against Active Environmental Injection Attacks
- URL: http://arxiv.org/abs/2502.13053v1
- Date: Tue, 18 Feb 2025 17:01:28 GMT
- Title: AEIA-MN: Evaluating the Robustness of Multimodal LLM-Powered Mobile Agents Against Active Environmental Injection Attacks
- Authors: Yurun Chen, Xueyu Hu, Keting Yin, Juncheng Li, Shengyu Zhang,
- Abstract summary: AEIA-MN exploits interaction vulnerabilities in the mobile operating system to evaluate the robustness of MLLM-based agents against such threats.
Even advanced MLLMs are highly vulnerable to this attack, achieving a maximum attack success rate of 93% in the AndroidWorld benchmark.
- Score: 7.956861233179047
- License:
- Abstract: As researchers continuously optimize AI agents to perform tasks more effectively within operating systems, they often neglect to address the critical need for enabling these agents to identify "impostors" within the system. Through an analysis of the agents' operating environment, we identified a potential threat: attackers can disguise their attack methods as environmental elements, injecting active disturbances into the agents' execution process, thereby disrupting their decision-making. We define this type of attack as Active Environment Injection Attack (AEIA). Based on this, we propose AEIA-MN, an active environment injection attack scheme that exploits interaction vulnerabilities in the mobile operating system to evaluate the robustness of MLLM-based agents against such threats. Experimental results show that even advanced MLLMs are highly vulnerable to this attack, achieving a maximum attack success rate of 93% in the AndroidWorld benchmark.
Related papers
- MELON: Indirect Prompt Injection Defense via Masked Re-execution and Tool Comparison [60.30753230776882]
LLM agents are vulnerable to indirect prompt injection (IPI) attacks.
We present MELON, a novel IPI defense.
We show that MELON outperforms SOTA defenses in both attack prevention and utility preservation.
arXiv Detail & Related papers (2025-02-07T18:57:49Z) - Black-Box Adversarial Attack on Vision Language Models for Autonomous Driving [65.61999354218628]
We take the first step toward designing black-box adversarial attacks specifically targeting vision-language models (VLMs) in autonomous driving systems.
We propose Cascading Adversarial Disruption (CAD), which targets low-level reasoning breakdown by generating and injecting semantics.
We present Risky Scene Induction, which addresses dynamic adaptation by leveraging a surrogate VLM to understand and construct high-level risky scenarios.
arXiv Detail & Related papers (2025-01-23T11:10:02Z) - Hijacking Vision-and-Language Navigation Agents with Adversarial Environmental Attacks [12.96291706848273]
Vision-and-Language Navigation (VLN) task as a representative setting.
Whitebox adversarial attack developed to induce desired behaviors in pretrained VLN agents.
We find our attacks can induce early-termination behaviors or divert an agent along an attacker-defined multi-step trajectory.
arXiv Detail & Related papers (2024-12-03T19:54:32Z) - Exploring the Adversarial Vulnerabilities of Vision-Language-Action Models in Robotics [70.93622520400385]
This paper systematically quantifies the robustness of VLA-based robotic systems.
We introduce an untargeted position-aware attack objective that leverages spatial foundations to destabilize robotic actions.
We also design an adversarial patch generation approach that places a small, colorful patch within the camera's view, effectively executing the attack in both digital and physical environments.
arXiv Detail & Related papers (2024-11-18T01:52:20Z) - Breaking ReAct Agents: Foot-in-the-Door Attack Will Get You In [5.65782619470663]
We examine how ReAct agents can be exploited using a straightforward yet effective method we refer to as the foot-in-the-door attack.
Our experiments show that indirect prompt injection attacks can significantly increase the likelihood of the agent performing subsequent malicious actions.
To mitigate this vulnerability, we propose implementing a simple reflection mechanism that prompts the agent to reassess the safety of its actions during execution.
arXiv Detail & Related papers (2024-10-22T12:24:41Z) - Compromising Embodied Agents with Contextual Backdoor Attacks [69.71630408822767]
Large language models (LLMs) have transformed the development of embodied intelligence.
This paper uncovers a significant backdoor security threat within this process.
By poisoning just a few contextual demonstrations, attackers can covertly compromise the contextual environment of a black-box LLM.
arXiv Detail & Related papers (2024-08-06T01:20:12Z) - Breaking Agents: Compromising Autonomous LLM Agents Through Malfunction Amplification [35.16099878559559]
Large language models (LLMs) have experienced significant development and are being deployed in real-world applications.
We introduce a new type of attack that causes malfunctions by misleading the agent into executing repetitive or irrelevant actions.
Our experiments reveal that these attacks can induce failure rates exceeding 80% in multiple scenarios.
arXiv Detail & Related papers (2024-07-30T14:35:31Z) - Can We Trust Embodied Agents? Exploring Backdoor Attacks against Embodied LLM-based Decision-Making Systems [27.316115171846953]
Large Language Models (LLMs) have shown significant promise in real-world decision-making tasks for embodied AI.
LLMs are fine-tuned to leverage their inherent common sense and reasoning abilities while being tailored to specific applications.
This fine-tuning process introduces considerable safety and security vulnerabilities, especially in safety-critical cyber-physical systems.
arXiv Detail & Related papers (2024-05-27T17:59:43Z) - Highlighting the Safety Concerns of Deploying LLMs/VLMs in Robotics [54.57914943017522]
We highlight the critical issues of robustness and safety associated with integrating large language models (LLMs) and vision-language models (VLMs) into robotics applications.
arXiv Detail & Related papers (2024-02-15T22:01:45Z) - Benchmarking and Defending Against Indirect Prompt Injection Attacks on Large Language Models [79.0183835295533]
We introduce the first benchmark for indirect prompt injection attacks, named BIPIA, to assess the risk of such vulnerabilities.
Our analysis identifies two key factors contributing to their success: LLMs' inability to distinguish between informational context and actionable instructions, and their lack of awareness in avoiding the execution of instructions within external content.
We propose two novel defense mechanisms-boundary awareness and explicit reminder-to address these vulnerabilities in both black-box and white-box settings.
arXiv Detail & Related papers (2023-12-21T01:08:39Z) - Raij\=u: Reinforcement Learning-Guided Post-Exploitation for Automating
Security Assessment of Network Systems [0.0]
Raij=u framework is a Reinforcement Learning-driven automation approach.
We implement two RL algorithms to train specialized agents capable of making intelligent actions.
Agents achieve over 84% of successful attacks with under 55 attack steps given.
arXiv Detail & Related papers (2023-09-27T09:36:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.