WIPI: A New Web Threat for LLM-Driven Web Agents
- URL: http://arxiv.org/abs/2402.16965v1
- Date: Mon, 26 Feb 2024 19:01:54 GMT
- Title: WIPI: A New Web Threat for LLM-Driven Web Agents
- Authors: Fangzhou Wu, Shutong Wu, Yulong Cao, Chaowei Xiao
- Abstract summary: We introduce a novel threat, WIPI, that indirectly controls Web Agent to execute malicious instructions embedded in publicly accessible webpages.
To launch a successful WIPI works in a black-box environment.
Our methodology achieves an average attack success rate (ASR) exceeding 90% even in pure black-box scenarios.
- Score: 28.651763099760664
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the fast development of large language models (LLMs), LLM-driven Web
Agents (Web Agents for short) have obtained tons of attention due to their
superior capability where LLMs serve as the core part of making decisions like
the human brain equipped with multiple web tools to actively interact with
external deployed websites. As uncountable Web Agents have been released and
such LLM systems are experiencing rapid development and drawing closer to
widespread deployment in our daily lives, an essential and pressing question
arises: "Are these Web Agents secure?". In this paper, we introduce a novel
threat, WIPI, that indirectly controls Web Agent to execute malicious
instructions embedded in publicly accessible webpages. To launch a successful
WIPI works in a black-box environment. This methodology focuses on the form and
content of indirect instructions within external webpages, enhancing the
efficiency and stealthiness of the attack. To evaluate the effectiveness of the
proposed methodology, we conducted extensive experiments using 7 plugin-based
ChatGPT Web Agents, 8 Web GPTs, and 3 different open-source Web Agents. The
results reveal that our methodology achieves an average attack success rate
(ASR) exceeding 90% even in pure black-box scenarios. Moreover, through an
ablation study examining various user prefix instructions, we demonstrated that
the WIPI exhibits strong robustness, maintaining high performance across
diverse prefix instructions.
Related papers
- Security Matrix for Multimodal Agents on Mobile Devices: A Systematic and Proof of Concept Study [16.559272781032632]
The rapid progress in the reasoning capability of the Multi-modal Large Language Models has triggered the development of autonomous agent systems on mobile devices.
Despite the increased human-machine interaction efficiency, the security risks of MLLM-based mobile agent systems have not been systematically studied.
This paper highlights the need for security awareness in the design of MLLM-based systems and paves the way for future research on attacks and defense methods.
arXiv Detail & Related papers (2024-07-12T14:30:05Z) - Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence [79.5316642687565]
Existing multi-agent frameworks often struggle with integrating diverse capable third-party agents.
We propose the Internet of Agents (IoA), a novel framework that addresses these limitations.
IoA introduces an agent integration protocol, an instant-messaging-like architecture design, and dynamic mechanisms for agent teaming and conversation flow control.
arXiv Detail & Related papers (2024-07-09T17:33:24Z) - CAAP: Context-Aware Action Planning Prompting to Solve Computer Tasks with Front-End UI Only [21.054681757006385]
Large Language Models (LLMs) with advanced reasoning capabilities have set the stage for agents to undertake more complex and previously unseen tasks.
We propose an agent that functions solely on the basis of screenshots for recognizing environments.
We achieve a success rate of 94.4% on 67types of MiniWoB++ problems, utilizing only 1.48demonstrations per problem type.
arXiv Detail & Related papers (2024-06-11T05:21:20Z) - WorkArena: How Capable Are Web Agents at Solving Common Knowledge Work Tasks? [83.19032025950986]
We study the use of large language model-based agents for interacting with software via web browsers.
WorkArena is a benchmark of 33 tasks based on the widely-used ServiceNow platform.
BrowserGym is an environment for the design and evaluation of such agents.
arXiv Detail & Related papers (2024-03-12T14:58:45Z) - InjecAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated Large Language Model Agents [3.5248694676821484]
We introduce InjecAgent, a benchmark designed to assess the vulnerability of tool-integrated LLM agents to IPI attacks.
InjecAgent comprises 1,054 test cases covering 17 different user tools and 62 attacker tools.
We show that agents are vulnerable to IPI attacks, with ReAct-prompted GPT-4 vulnerable to attacks 24% of the time.
arXiv Detail & Related papers (2024-03-05T06:21:45Z) - Watch Out for Your Agents! Investigating Backdoor Threats to LLM-Based
Agents [50.034049716274005]
We take the first step to investigate one of the typical safety threats, backdoor attack, to LLM-based agents.
We first formulate a general framework of agent backdoor attacks, then we present a thorough analysis on the different forms of agent backdoor attacks.
We propose the corresponding data poisoning mechanisms to implement the above variations of agent backdoor attacks on two typical agent tasks.
arXiv Detail & Related papers (2024-02-17T06:48:45Z) - WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models [65.18602126334716]
Existing web agents typically only handle one input modality and are evaluated only in simplified web simulators or static web snapshots.
We introduce WebVoyager, an innovative Large Multimodal Model (LMM) powered web agent that can complete user instructions end-to-end by interacting with real-world websites.
We show that WebVoyager achieves a 59.1% task success rate on our benchmark, significantly surpassing the performance of both GPT-4 (All Tools) and the WebVoyager (text-only) setups.
arXiv Detail & Related papers (2024-01-25T03:33:18Z) - VisualWebArena: Evaluating Multimodal Agents on Realistic Visual Web Tasks [93.85005277463802]
VisualWebArena is a benchmark designed to assess the performance of multimodal web agents on realistic tasks.
To perform on this benchmark, agents need to accurately process image-text inputs, interpret natural language instructions, and execute actions on websites to accomplish user-defined objectives.
arXiv Detail & Related papers (2024-01-24T18:35:21Z) - Not what you've signed up for: Compromising Real-World LLM-Integrated
Applications with Indirect Prompt Injection [64.67495502772866]
Large Language Models (LLMs) are increasingly being integrated into various applications.
We show how attackers can override original instructions and employed controls using Prompt Injection attacks.
We derive a comprehensive taxonomy from a computer security perspective to systematically investigate impacts and vulnerabilities.
arXiv Detail & Related papers (2023-02-23T17:14:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.