A Systematization of Security Vulnerabilities in Computer Use Agents
- URL: http://arxiv.org/abs/2507.05445v1
- Date: Mon, 07 Jul 2025 19:50:21 GMT
- Title: A Systematization of Security Vulnerabilities in Computer Use Agents
- Authors: Daniel Jones, Giorgio Severi, Martin Pouliot, Gary Lopez, Joris de Gruyter, Santiago Zanella-Beguelin, Justin Song, Blake Bullwinkel, Pamela Cortez, Amanda Minnich,
- Abstract summary: We conduct a systematic threat analysis and testing of real-world CUAs under adversarial conditions.<n>We identify seven classes of risks unique to the CUA paradigm, and analyze three concrete exploit scenarios in depth.<n>These case studies reveal deeper architectural flaws across current CUA implementations.
- Score: 1.3560089220432787
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Computer Use Agents (CUAs), autonomous systems that interact with software interfaces via browsers or virtual machines, are rapidly being deployed in consumer and enterprise environments. These agents introduce novel attack surfaces and trust boundaries that are not captured by traditional threat models. Despite their growing capabilities, the security boundaries of CUAs remain poorly understood. In this paper, we conduct a systematic threat analysis and testing of real-world CUAs under adversarial conditions. We identify seven classes of risks unique to the CUA paradigm, and analyze three concrete exploit scenarios in depth: (1) clickjacking via visual overlays that mislead interface-level reasoning, (2) indirect prompt injection that enables Remote Code Execution (RCE) through chained tool use, and (3) CoT exposure attacks that manipulate implicit interface framing to hijack multi-step reasoning. These case studies reveal deeper architectural flaws across current CUA implementations. Namely, a lack of input provenance tracking, weak interface-action binding, and insufficient control over agent memory and delegation. We conclude by proposing a CUA-specific security evaluation framework and design principles for safe deployment in adversarial and high-stakes settings.
Related papers
- OMNI-LEAK: Orchestrator Multi-Agent Network Induced Data Leakage [59.3826294523924]
We investigate the security vulnerabilities of a popular multi-agent pattern known as the orchestrator setup.<n>We report the susceptibility of frontier models to different categories of attacks, finding that both reasoning and non-reasoning models are vulnerable.
arXiv Detail & Related papers (2026-02-13T21:32:32Z) - Just Ask: Curious Code Agents Reveal System Prompts in Frontier LLMs [65.6660735371212]
We present textbftextscJustAsk, a framework that autonomously discovers effective extraction strategies through interaction alone.<n>It formulates extraction as an online exploration problem, using Upper Confidence Bound--based strategy selection and a hierarchical skill space spanning atomic probes and high-level orchestration.<n>Our results expose system prompts as a critical yet largely unprotected attack surface in modern agent systems.
arXiv Detail & Related papers (2026-01-29T03:53:25Z) - ORCA -- An Automated Threat Analysis Pipeline for O-RAN Continuous Development [57.61878484176942]
Open-Radio Access Network (O-RAN) integrates numerous software components in a cloud-like deployment, opening the radio access network to previously unconsidered security threats.<n>Current vulnerability assessment practices often rely on manual, labor-intensive, and subjective investigations, leading to inconsistencies in the threat analysis.<n>We propose an automated pipeline that leverages Natural Language Processing (NLP) to minimize human intervention and associated biases.
arXiv Detail & Related papers (2026-01-20T07:31:59Z) - CaMeLs Can Use Computers Too: System-level Security for Computer Use Agents [60.98294016925157]
AI agents are vulnerable to prompt injection attacks, where malicious content hijacks agent behavior to steal credentials or cause financial loss.<n>We introduce Single-Shot Planning for CUAs, where a trusted planner generates a complete execution graph with conditional branches before any observation of potentially malicious content.<n>Although this architectural isolation successfully prevents instruction injections, we show that additional measures are needed to prevent Branch Steering attacks.
arXiv Detail & Related papers (2026-01-14T23:06:35Z) - MCPGuard : Automatically Detecting Vulnerabilities in MCP Servers [16.620755774987774]
The Model Context Protocol (MCP) has emerged as a standardized interface enabling seamless integration between Large Language Models (LLMs) and external data sources and tools.<n>This paper systematically analyzes the security landscape of MCP-based systems, identifying three principal threat categories.
arXiv Detail & Related papers (2025-10-27T05:12:51Z) - HackWorld: Evaluating Computer-Use Agents on Exploiting Web Application Vulnerabilities [20.201614123811872]
HackWorld is the first framework for evaluating computer-use agents' capabilities to exploit web application vulnerabilities via visual interaction.<n>It includes 36 real-world applications across 11 frameworks and 7 languages, featuring realistic flaws such as injection vulnerabilities, authentication bypasses, and unsafe input handling.<n>It tests CUAs' capacity to identify and exploit these weaknesses while navigating complex web interfaces.
arXiv Detail & Related papers (2025-10-14T06:52:15Z) - Towards Unifying Quantitative Security Benchmarking for Multi Agent Systems [0.0]
Evolving AI systems increasingly deploy multi-agent architectures where autonomous agents collaborate, share information, and delegate tasks through developing protocols.<n>One such risk is a cascading risk: a breach in one agent can cascade through the system, compromising others by exploiting inter-agent trust.<n>In an ACI attack, a malicious input or tool exploit injected at one agent leads to cascading compromises and amplified downstream effects across agents that trust its outputs.
arXiv Detail & Related papers (2025-07-23T13:51:28Z) - OpenAgentSafety: A Comprehensive Framework for Evaluating Real-World AI Agent Safety [58.201189860217724]
We introduce OpenAgentSafety, a comprehensive framework for evaluating agent behavior across eight critical risk categories.<n>Unlike prior work, our framework evaluates agents that interact with real tools, including web browsers, code execution environments, file systems, bash shells, and messaging platforms.<n>It combines rule-based analysis with LLM-as-judge assessments to detect both overt and subtle unsafe behaviors.
arXiv Detail & Related papers (2025-07-08T16:18:54Z) - DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agents [52.92354372596197]
Large Language Models (LLMs) are increasingly central to agentic systems due to their strong reasoning and planning capabilities.<n>This interaction also introduces the risk of prompt injection attacks, where malicious inputs from external sources can mislead the agent's behavior.<n>We propose a Dynamic Rule-based Isolation Framework for Trustworthy agentic systems, which enforces both control and data-level constraints.
arXiv Detail & Related papers (2025-06-13T05:01:09Z) - VPI-Bench: Visual Prompt Injection Attacks for Computer-Use Agents [74.6761188527948]
Computer-Use Agents (CUAs) with full system access pose significant security and privacy risks.<n>We investigate Visual Prompt Injection (VPI) attacks, where malicious instructions are visually embedded within rendered user interfaces.<n>Our empirical study shows that current CUAs and BUAs can be deceived at rates of up to 51% and 100%, respectively, on certain platforms.
arXiv Detail & Related papers (2025-06-03T05:21:50Z) - The Hidden Dangers of Browsing AI Agents [0.0]
This paper presents a comprehensive security evaluation of such agents, focusing on systemic vulnerabilities across multiple architectural layers.<n>Our work outlines the first end-to-end threat model for browsing agents and provides actionable guidance for securing their deployment in real-world environments.
arXiv Detail & Related papers (2025-05-19T13:10:29Z) - A Survey on the Safety and Security Threats of Computer-Using Agents: JARVIS or Ultron? [30.063392019347887]
We present a systematization of knowledge on the safety and security threats of emphComputer-Using Agents.<n> CUAs are capable of autonomously performing tasks such as navigating desktop applications, web pages, and mobile apps.
arXiv Detail & Related papers (2025-05-16T06:56:42Z) - CANTXSec: A Deterministic Intrusion Detection and Prevention System for CAN Bus Monitoring ECU Activations [53.036288487863786]
We propose CANTXSec, the first deterministic Intrusion Detection and Prevention system based on physical ECU activations.<n>It detects and prevents classical attacks in the CAN bus, while detecting advanced attacks that have been less investigated in the literature.<n>We prove the effectiveness of our solution on a physical testbed, where we achieve 100% detection accuracy in both classes of attacks while preventing 100% of FIAs.
arXiv Detail & Related papers (2025-05-14T13:37:07Z) - RedTeamLLM: an Agentic AI framework for offensive security [0.0]
We propose and evaluate RedTeamLLM, an integrated architecture with a comprehensive security model for automatization of pentest tasks.<n>RedTeamLLM follows three key steps: summarizing, reasoning and act, which embed its operational capacity.<n> Evaluation is performed through the automated resolution of a range of entry-level, but not trivial, CTF challenges.
arXiv Detail & Related papers (2025-05-11T09:19:10Z) - Autonomous Identity-Based Threat Segmentation in Zero Trust Architectures [4.169915659794567]
Zero Trust Architectures (ZTA) fundamentally redefine network security by adopting a "trust nothing, verify everything" approach.<n>This research applies the proposed AI-driven, autonomous, identity-based threat segmentation in ZTA.
arXiv Detail & Related papers (2025-01-10T15:35:02Z) - A System for Automated Open-Source Threat Intelligence Gathering and
Management [53.65687495231605]
SecurityKG is a system for automated OSCTI gathering and management.
It uses a combination of AI and NLP techniques to extract high-fidelity knowledge about threat behaviors.
arXiv Detail & Related papers (2021-01-19T18:31:35Z) - A System for Efficiently Hunting for Cyber Threats in Computer Systems
Using Threat Intelligence [78.23170229258162]
We build ThreatRaptor, a system that facilitates cyber threat hunting in computer systems using OSCTI.
ThreatRaptor provides (1) an unsupervised, light-weight, and accurate NLP pipeline that extracts structured threat behaviors from unstructured OSCTI text, (2) a concise and expressive domain-specific query language, TBQL, to hunt for malicious system activities, and (3) a query synthesis mechanism that automatically synthesizes a TBQL query from the extracted threat behaviors.
arXiv Detail & Related papers (2021-01-17T19:44:09Z) - Enabling Efficient Cyber Threat Hunting With Cyber Threat Intelligence [94.94833077653998]
ThreatRaptor is a system that facilitates threat hunting in computer systems using open-source Cyber Threat Intelligence (OSCTI)
It extracts structured threat behaviors from unstructured OSCTI text and uses a concise and expressive domain-specific query language, TBQL, to hunt for malicious system activities.
Evaluations on a broad set of attack cases demonstrate the accuracy and efficiency of ThreatRaptor in practical threat hunting.
arXiv Detail & Related papers (2020-10-26T14:54:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.