Related papers: In-Browser LLM-Guided Fuzzing for Real-Time Prompt Injection Testing in Agentic AI Browsers

In-Browser LLM-Guided Fuzzing for Real-Time Prompt Injection Testing in Agentic AI Browsers

URL: http://arxiv.org/abs/2510.13543v1
Date: Wed, 15 Oct 2025 13:39:13 GMT
Title: In-Browser LLM-Guided Fuzzing for Real-Time Prompt Injection Testing in Agentic AI Browsers
Authors: Avihay Cohen,
Abstract summary: Large Language Model (LLM) based agents integrated into web browsers offer powerful automation of web tasks.<n>They are vulnerable to indirect prompt injection attacks, where malicious instructions hidden in a webpage deceive the agent into unwanted actions.<n>We present a novel fuzzing framework that runs entirely in the browser and is guided by an LLM to automatically discover such prompt injection vulnerabilities in real time.
Score: 0.0
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Large Language Model (LLM) based agents integrated into web browsers (often called agentic AI browsers) offer powerful automation of web tasks. However, they are vulnerable to indirect prompt injection attacks, where malicious instructions hidden in a webpage deceive the agent into unwanted actions. These attacks can bypass traditional web security boundaries, as the AI agent operates with the user privileges across sites. In this paper, we present a novel fuzzing framework that runs entirely in the browser and is guided by an LLM to automatically discover such prompt injection vulnerabilities in real time.

Related papers

MUZZLE: Adaptive Agentic Red-Teaming of Web Agents Against Indirect Prompt Injection Attacks [10.431616150153992]
MUZZLE is an automated framework for evaluating the security of web agents against indirect prompt injection attacks.<n>It adapts its attack strategy based on the agent's observed execution trajectory and iteratively refines attacks using feedback from failed executions.<n>MUZZLE effectively discovers 37 new attacks on 4 web applications with 10 adversarial objectives that violate confidentiality, availability, or privacy properties.
arXiv Detail & Related papers (2026-02-09T21:46:18Z)
It's a TRAP! Task-Redirecting Agent Persuasion Benchmark for Web Agents [52.81924177620322]
Web-based agents powered by large language models are increasingly used for tasks such as email management or professional networking.<n>Their reliance on dynamic web content makes them vulnerable to prompt injection attacks: adversarial instructions hidden in interface elements that persuade the agent to divert from its original task.<n>We introduce the Task-Redirecting Agent Persuasion Benchmark (TRAP), an evaluation for studying how persuasion techniques misguide autonomous web agents on realistic tasks.
arXiv Detail & Related papers (2025-12-29T01:09:10Z)
ceLLMate: Sandboxing Browser AI Agents [16.060034673487287]
We propose ceLLMate, a browser-level sandboxing framework that restricts the agent's ambient authority and reduces the blast radius of prompt injections.<n> ceLLMate pairs website-authored mandatory policies with an automated policy-prediction layer that adapts and instantiates these policies from the user's natural-language task.<n>We implement ceLLMate as an agent-agnostic browser extension and demonstrate how it enables sandboxing policies that effectively block various types of prompt injection attacks with negligible overhead.
arXiv Detail & Related papers (2025-12-14T08:25:31Z)
Backdoor-Powered Prompt Injection Attacks Nullify Defense Methods [95.54363609024847]
Large language models (LLMs) are vulnerable to prompt injection attacks.<n>In this paper, we explore more vicious attacks that nullify the prompt injection defense methods.<n> backdoor-powered prompt injection attacks are more harmful than previous prompt injection attacks.
arXiv Detail & Related papers (2025-10-04T07:11:11Z)
Cuckoo Attack: Stealthy and Persistent Attacks Against AI-IDE [64.47951172662745]
Cuckoo Attack is a novel attack that achieves stealthy and persistent command execution by embedding malicious payloads into configuration files.<n>We formalize our attack paradigm into two stages, including initial infection and persistence.<n>We contribute seven actionable checkpoints for vendors to evaluate their product security.
arXiv Detail & Related papers (2025-09-19T04:10:52Z)
Cybersecurity AI: Hacking the AI Hackers via Prompt Injection [0.5043403308699188]
We demonstrate how AI-powered cybersecurity tools can be turned against themselves through prompt injection attacks.<n>Prompt injection is reminiscent of cross-site scripting (XSS)<n>We present proof-of-concept exploits against the Cybersecurity AI framework and its CLI tool.
arXiv Detail & Related papers (2025-08-29T14:32:48Z)
A Whole New World: Creating a Parallel-Poisoned Web Only AI-Agents Can See [0.0]
A malicious website can identify an incoming request as originating from an AI agent and dynamically serve a different, "cloaked" version of its content.<n>While human users see a benign webpage, the agent is presented with a visually identical page embedded with hidden, malicious instructions.<n>This work formalizes the threat model, details the mechanics of agent fingerprinting and cloaking, and discusses the profound security implications for the future of agentic AI.
arXiv Detail & Related papers (2025-08-29T08:14:52Z)
Manipulating LLM Web Agents with Indirect Prompt Injection Attack via HTML Accessibility Tree [8.511846002129522]
We show that adversaries can embed universal adversarial triggers in webpage HTML to hijack agent behavior.<n>Our system demonstrates high success rates across real websites in both targeted and general attacks.
arXiv Detail & Related papers (2025-07-20T03:10:13Z)
AgentVigil: Generic Black-Box Red-teaming for Indirect Prompt Injection against LLM Agents [54.29555239363013]
We propose a generic black-box fuzzing framework, AgentVigil, to automatically discover and exploit indirect prompt injection vulnerabilities.<n>We evaluate AgentVigil on two public benchmarks, AgentDojo and VWA-adv, where it achieves 71% and 70% success rates against agents based on o3-mini and GPT-4o.<n>We apply our attacks in real-world environments, successfully misleading agents to navigate to arbitrary URLs, including malicious sites.
arXiv Detail & Related papers (2025-05-09T07:40:17Z)
WASP: Benchmarking Web Agent Security Against Prompt Injection Attacks [36.97842000562324]
We introduce WASP -- a new benchmark for end-to-end evaluation of Web Agent Security against Prompt injection attacks.<n>We show that even top-tier AI models, including those with advanced reasoning capabilities, can be deceived by simple, low-effort human-written injections.<n>Our end-to-end evaluation reveals a previously unobserved insight: while attacks partially succeed in up to 86% of the case, even state-of-the-art agents often struggle to fully complete the attacker goals.
arXiv Detail & Related papers (2025-04-22T17:51:03Z)
Attacking Vision-Language Computer Agents via Pop-ups [61.744008541021124]
We show that VLM agents can be easily attacked by a set of carefully designed adversarial pop-ups.<n>This distraction leads agents to click these pop-ups instead of performing their tasks as usual.<n>Basic defense techniques, such as asking the agent to ignore pop-ups or including an advertisement notice, are ineffective against the attack.
arXiv Detail & Related papers (2024-11-04T18:56:42Z)
Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection [64.67495502772866]
Large Language Models (LLMs) are increasingly being integrated into various applications. We show how attackers can override original instructions and employed controls using Prompt Injection attacks. We derive a comprehensive taxonomy from a computer security perspective to systematically investigate impacts and vulnerabilities.
arXiv Detail & Related papers (2023-02-23T17:14:38Z)

This list is automatically generated from the titles and abstracts of the papers in this site.