Agentic AI as a Cybersecurity Attack Surface: Threats, Exploits, and Defenses in Runtime Supply Chains
- URL: http://arxiv.org/abs/2602.19555v1
- Date: Mon, 23 Feb 2026 06:57:57 GMT
- Title: Agentic AI as a Cybersecurity Attack Surface: Threats, Exploits, and Defenses in Runtime Supply Chains
- Authors: Xiaochong Jiang, Shiqi Yang, Wenting Yang, Yichen Liu, Cheng Ji,
- Abstract summary: Agentic systems built on large language models (LLMs) extend beyond text generation to autonomously retrieve information and invoke tools.<n>This runtime execution model shifts the attack surface from build-time artifacts to inference-time dependencies, exposing agents to manipulation through untrusted data and probabilistic capability resolution.<n>We systematize these risks within a unified runtime framework, categorizing threats into data supply chain attacks (transient context injection and persistent memory poisoning)<n>We also identify the Viral Agent Loop, in which agents act as vectors for self-propagating generative worms without exploiting code-level flaws.
- Score: 7.8562769948743965
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Agentic systems built on large language models (LLMs) extend beyond text generation to autonomously retrieve information and invoke tools. This runtime execution model shifts the attack surface from build-time artifacts to inference-time dependencies, exposing agents to manipulation through untrusted data and probabilistic capability resolution. While prior work has focused on model-level vulnerabilities, security risks emerging from cyclic and interdependent runtime behavior remain fragmented. We systematize these risks within a unified runtime framework, categorizing threats into data supply chain attacks (transient context injection and persistent memory poisoning) and tool supply chain attacks (discovery, implementation, and invocation). We further identify the Viral Agent Loop, in which agents act as vectors for self-propagating generative worms without exploiting code-level flaws. Finally, we advocate a Zero-Trust Runtime Architecture that treats context as untrusted control flow and constrains tool execution through cryptographic provenance rather than semantic inference.
Related papers
- Zombie Agents: Persistent Control of Self-Evolving LLM Agents via Self-Reinforcing Injections [57.64370755825839]
Self-evolving agents update their internal state across sessions, often by writing and reusing long-term memory.<n>We study this risk and formalize a persistent attack we call a Zombie Agent.<n>We present a black-box attack framework that uses only indirect exposure through attacker-controlled web content.
arXiv Detail & Related papers (2026-02-17T15:28:24Z) - Agent2Agent Threats in Safety-Critical LLM Assistants: A Human-Centric Taxonomy [4.058281338403478]
We propose a threat modeling framework called AgentHeLLM that separates asset identification from attack path analysis.<n>We introduce a human-centric asset taxonomy derived from harm-oriented "victim modeling" and inspired by the Universal Declaration of Human Rights.<n>We demonstrate the framework's practical applicability through an open-source attack path suggestion tool AgentHeLLM Attack Path Generator.
arXiv Detail & Related papers (2026-02-05T16:53:41Z) - Just Ask: Curious Code Agents Reveal System Prompts in Frontier LLMs [65.6660735371212]
We present textbftextscJustAsk, a framework that autonomously discovers effective extraction strategies through interaction alone.<n>It formulates extraction as an online exploration problem, using Upper Confidence Bound--based strategy selection and a hierarchical skill space spanning atomic probes and high-level orchestration.<n>Our results expose system prompts as a critical yet largely unprotected attack surface in modern agent systems.
arXiv Detail & Related papers (2026-01-29T03:53:25Z) - CaMeLs Can Use Computers Too: System-level Security for Computer Use Agents [60.98294016925157]
AI agents are vulnerable to prompt injection attacks, where malicious content hijacks agent behavior to steal credentials or cause financial loss.<n>We introduce Single-Shot Planning for CUAs, where a trusted planner generates a complete execution graph with conditional branches before any observation of potentially malicious content.<n>Although this architectural isolation successfully prevents instruction injections, we show that additional measures are needed to prevent Branch Steering attacks.
arXiv Detail & Related papers (2026-01-14T23:06:35Z) - BackdoorAgent: A Unified Framework for Backdoor Attacks on LLM-based Agents [58.83028403414688]
Large language model (LLM) agents execute tasks through multi-step workflow that combine planning, memory, and tool use.<n>Backdoor triggers injected into specific stages of an agent workflow can persist through multiple intermediate states and adversely influence downstream outputs.<n>We propose textbfBackdoorAgent, a modular and stage-aware framework that provides a unified agent-centric view of backdoor threats in LLM agents.
arXiv Detail & Related papers (2026-01-08T03:49:39Z) - Agentic AI for Autonomous Defense in Software Supply Chain Security: Beyond Provenance to Vulnerability Mitigation [0.0]
The current paper includes an example of agentic artificial intelligence (AI) based on autonomous software supply chain security.<n>It combines large language model (LLM)-based reasoning, reinforcement learning (RL), and multi-agent coordination.<n>Results show that agentic AI can facilitate the transition to self defending, proactive software supply chains.
arXiv Detail & Related papers (2025-12-29T14:06:09Z) - Malice in Agentland: Down the Rabbit Hole of Backdoors in the AI Supply Chain [82.98626829232899]
Fine-tuning AI agents on data from their own interactions introduces a critical security vulnerability within the AI supply chain.<n>We show that adversaries can easily poison the data collection pipeline to embed hard-to-detect backdoors.
arXiv Detail & Related papers (2025-10-03T12:47:21Z) - Cuckoo Attack: Stealthy and Persistent Attacks Against AI-IDE [64.47951172662745]
Cuckoo Attack is a novel attack that achieves stealthy and persistent command execution by embedding malicious payloads into configuration files.<n>We formalize our attack paradigm into two stages, including initial infection and persistence.<n>We contribute seven actionable checkpoints for vendors to evaluate their product security.
arXiv Detail & Related papers (2025-09-19T04:10:52Z) - IPIGuard: A Novel Tool Dependency Graph-Based Defense Against Indirect Prompt Injection in LLM Agents [33.775221377823925]
Large language model (LLM) agents are widely deployed in real-world applications, where they leverage tools to retrieve and manipulate external data for complex tasks.<n>When interacting with untrusted data sources, tool responses may contain injected instructions that covertly influence agent behaviors and lead to malicious outcomes.<n>We propose a novel defensive task execution paradigm, called IPIGuard, to prevent malicious tool invocations at the source.
arXiv Detail & Related papers (2025-08-21T07:08:16Z) - Towards Unifying Quantitative Security Benchmarking for Multi Agent Systems [0.0]
Evolving AI systems increasingly deploy multi-agent architectures where autonomous agents collaborate, share information, and delegate tasks through developing protocols.<n>One such risk is a cascading risk: a breach in one agent can cascade through the system, compromising others by exploiting inter-agent trust.<n>In an ACI attack, a malicious input or tool exploit injected at one agent leads to cascading compromises and amplified downstream effects across agents that trust its outputs.
arXiv Detail & Related papers (2025-07-23T13:51:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.