Why Are Web AI Agents More Vulnerable Than Standalone LLMs? A Security Analysis
- URL: http://arxiv.org/abs/2502.20383v2
- Date: Wed, 11 Jun 2025 03:01:49 GMT
- Title: Why Are Web AI Agents More Vulnerable Than Standalone LLMs? A Security Analysis
- Authors: Jeffrey Yang Fan Chiang, Seungjae Lee, Jia-Bin Huang, Furong Huang, Yizheng Chen,
- Abstract summary: This study investigates the underlying factors that contribute to the increased vulnerability of Web AI agents.<n>We identify three critical factors that amplify the vulnerability of Web AI agents; (1) embedding user goals into the system prompt, (2) multi-step action generation, and (3) observational capabilities.
- Score: 35.57217841344101
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advancements in Web AI agents have demonstrated remarkable capabilities in addressing complex web navigation tasks. However, emerging research shows that these agents exhibit greater vulnerability compared to standalone Large Language Models (LLMs), despite both being built upon the same safety-aligned models. This discrepancy is particularly concerning given the greater flexibility of Web AI Agent compared to standalone LLMs, which may expose them to a wider range of adversarial user inputs. To build a scaffold that addresses these concerns, this study investigates the underlying factors that contribute to the increased vulnerability of Web AI agents. Notably, this disparity stems from the multifaceted differences between Web AI agents and standalone LLMs, as well as the complex signals - nuances that simple evaluation metrics, such as success rate, often fail to capture. To tackle these challenges, we propose a component-level analysis and a more granular, systematic evaluation framework. Through this fine-grained investigation, we identify three critical factors that amplify the vulnerability of Web AI agents; (1) embedding user goals into the system prompt, (2) multi-step action generation, and (3) observational capabilities. Our findings highlights the pressing need to enhance security and robustness in AI agent design and provide actionable insights for targeted defense strategies.
Related papers
- Bridging AI and Software Security: A Comparative Vulnerability Assessment of LLM Agent Deployment Paradigms [1.03121181235382]
Large Language Model (LLM) agents face security vulnerabilities spanning AI-specific and traditional software domains.<n>This study bridges this gap through comparative evaluation of Function Calling architecture and Model Context Protocol (MCP) deployment paradigms.<n>We tested 3,250 attack scenarios across seven language models, evaluating simple, composed, and chained attacks targeting both AI-specific threats and software vulnerabilities.
arXiv Detail & Related papers (2025-07-08T18:24:28Z) - Expert-in-the-Loop Systems with Cross-Domain and In-Domain Few-Shot Learning for Software Vulnerability Detection [38.083049237330826]
This study explores the use of Large Language Models (LLMs) in software vulnerability assessment by simulating the identification of Python code with known Common Weaknessions (CWEs)<n>Our results indicate that while zero-shot prompting performs poorly, few-shot prompting significantly enhances classification performance.<n> challenges such as model reliability, interpretability, and adversarial robustness remain critical areas for future research.
arXiv Detail & Related papers (2025-06-11T18:43:51Z) - ATAG: AI-Agent Application Threat Assessment with Attack Graphs [23.757154032523093]
This paper introduces AI-agent application Threat assessment with Attack Graphs (ATAG)<n>ATAG is a novel framework designed to systematically analyze the security risks associated with AI-agent applications.<n>It facilitates proactive identification and mitigation of AI-agent threats in multi-agent applications.
arXiv Detail & Related papers (2025-06-03T13:25:40Z) - WebCoT: Enhancing Web Agent Reasoning by Reconstructing Chain-of-Thought in Reflection, Branching, and Rollback [74.82886755416949]
We identify key reasoning skills essential for effective web agents.<n>We reconstruct the agent's reasoning algorithms into chain-of-thought rationales.<n>Our approach yields significant improvements across multiple benchmarks.
arXiv Detail & Related papers (2025-05-26T14:03:37Z) - Safety Devolution in AI Agents [56.482973617087254]
This study investigates how expanding retrieval access affects model reliability, bias propagation, and harmful content generation.<n>Retrieval-augmented agents built on aligned LLMs often behave more unsafely than uncensored models without retrieval.<n>These findings underscore the need for robust mitigation strategies to ensure fairness and reliability in retrieval-augmented and increasingly autonomous AI systems.
arXiv Detail & Related papers (2025-05-20T11:21:40Z) - AgentVigil: Generic Black-Box Red-teaming for Indirect Prompt Injection against LLM Agents [54.29555239363013]
We propose a generic black-box fuzzing framework, AgentVigil, to automatically discover and exploit indirect prompt injection vulnerabilities.<n>We evaluate AgentVigil on two public benchmarks, AgentDojo and VWA-adv, where it achieves 71% and 70% success rates against agents based on o3-mini and GPT-4o.<n>We apply our attacks in real-world environments, successfully misleading agents to navigate to arbitrary URLs, including malicious sites.
arXiv Detail & Related papers (2025-05-09T07:40:17Z) - Threat Modeling for AI: The Case for an Asset-Centric Approach [0.23408308015481666]
AI systems now able to autonomously execute code, interact with external systems, and operate without human oversight.<n>With AI systems now able to autonomously execute code, interact with external systems, and operate without human oversight, traditional security approaches fall short.<n>This paper introduces an asset-centric methodology for threat modeling AI systems.
arXiv Detail & Related papers (2025-05-08T18:57:08Z) - Securing Agentic AI: A Comprehensive Threat Model and Mitigation Framework for Generative AI Agents [0.0]
This paper introduces a comprehensive threat model tailored specifically for GenAI agents.
Research work identifies 9 primary threats and organizes them across five key domains.
arXiv Detail & Related papers (2025-04-28T16:29:24Z) - LLMpatronous: Harnessing the Power of LLMs For Vulnerability Detection [0.0]
Large Language Models (LLMs) for vulnerability detection presents unique challenges.
Previous attempts employing machine learning models for vulnerability detection have proven ineffective.
We propose a robust AI-driven approach focused on mitigating these limitations.
arXiv Detail & Related papers (2025-04-25T15:30:40Z) - Towards Trustworthy GUI Agents: A Survey [64.6445117343499]
This survey examines the trustworthiness of GUI agents in five critical dimensions.
We identify major challenges such as vulnerability to adversarial attacks, cascading failure modes in sequential decision-making.
As GUI agents become more widespread, establishing robust safety standards and responsible development practices is essential.
arXiv Detail & Related papers (2025-03-30T13:26:00Z) - Multi-Agent Risks from Advanced AI [90.74347101431474]
Multi-agent systems of advanced AI pose novel and under-explored risks.<n>We identify three key failure modes based on agents' incentives, as well as seven key risk factors.<n>We highlight several important instances of each risk, as well as promising directions to help mitigate them.
arXiv Detail & Related papers (2025-02-19T23:03:21Z) - AgentOccam: A Simple Yet Strong Baseline for LLM-Based Web Agents [52.13695464678006]
This study enhances an LLM-based web agent by simply refining its observation and action space.
AgentOccam surpasses the previous state-of-the-art and concurrent work by 9.8 (+29.4%) and 5.9 (+15.8%) absolute points respectively.
arXiv Detail & Related papers (2024-10-17T17:50:38Z) - Breaking Agents: Compromising Autonomous LLM Agents Through Malfunction Amplification [35.16099878559559]
Large language models (LLMs) have experienced significant development and are being deployed in real-world applications.
We introduce a new type of attack that causes malfunctions by misleading the agent into executing repetitive or irrelevant actions.
Our experiments reveal that these attacks can induce failure rates exceeding 80% in multiple scenarios.
arXiv Detail & Related papers (2024-07-30T14:35:31Z) - Dissecting Adversarial Robustness of Multimodal LM Agents [70.2077308846307]
We manually create 200 targeted adversarial tasks and evaluation scripts in a realistic threat model on top of VisualWebArena.<n>We find that we can successfully break latest agents that use black-box frontier LMs, including those that perform reflection and tree search.<n>We also use ARE to rigorously evaluate how the robustness changes as new components are added.
arXiv Detail & Related papers (2024-06-18T17:32:48Z) - Security of AI Agents [5.468745160706382]
We identify and describe potential vulnerabilities in AI agents in detail from a system security perspective.<n>We introduce defense mechanisms corresponding to each vulnerability with design and experiments to evaluate their viability.<n>This paper contextualizes the security issues in the current development of AI agents and delineates methods to make AI agents safer and more reliable.
arXiv Detail & Related papers (2024-06-12T23:16:45Z) - The Rise and Potential of Large Language Model Based Agents: A Survey [91.71061158000953]
Large language models (LLMs) are regarded as potential sparks for Artificial General Intelligence (AGI)
We start by tracing the concept of agents from its philosophical origins to its development in AI, and explain why LLMs are suitable foundations for agents.
We explore the extensive applications of LLM-based agents in three aspects: single-agent scenarios, multi-agent scenarios, and human-agent cooperation.
arXiv Detail & Related papers (2023-09-14T17:12:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.