Related papers: Autonomous Action Runtime Management(AARM):A System Specification for Securing AI-Driven Actions at Runtime

Autonomous Action Runtime Management(AARM):A System Specification for Securing AI-Driven Actions at Runtime

URL: http://arxiv.org/abs/2602.09433v1
Date: Tue, 10 Feb 2026 05:57:30 GMT
Title: Autonomous Action Runtime Management(AARM):A System Specification for Securing AI-Driven Actions at Runtime
Authors: Herman Errico,
Abstract summary: This paper introduces Autonomous Action Management (AARM), an open specification for securing AI-driven actions at runtime.<n>AARM intercepts actions before execution, accumulates session context, evaluates against policy and intent alignment, enforces authorization decisions, and records tamper-evident receipts for forensic reconstruction.<n>AARM is model-agnostic, framework-agnostic, and vendor-neutral, treating action execution as the stable security boundary.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: As artificial intelligence systems evolve from passive assistants into autonomous agents capable of executing consequential actions, the security boundary shifts from model outputs to tool execution. Traditional security paradigms - log aggregation, perimeter defense, and post-hoc forensics - cannot protect systems where AI-driven actions are irreversible, execute at machine speed, and originate from potentially compromised orchestration layers. This paper introduces Autonomous Action Runtime Management (AARM), an open specification for securing AI-driven actions at runtime. AARM defines a runtime security system that intercepts actions before execution, accumulates session context, evaluates against policy and intent alignment, enforces authorization decisions, and records tamper-evident receipts for forensic reconstruction. We formalize a threat model addressing prompt injection, confused deputy attacks, data exfiltration, and intent drift. We introduce an action classification framework distinguishing forbidden, context-dependent deny, and context-dependent allow actions. We propose four implementation architectures - protocol gateway, SDK instrumentation, kernel eBPF, and vendor integration - with distinct trust properties, and specify minimum conformance requirements for AARM-compliant systems. AARM is model-agnostic, framework-agnostic, and vendor-neutral, treating action execution as the stable security boundary. This specification aims to establish industry-wide requirements before proprietary fragmentation forecloses interoperability.

Related papers

Just Ask: Curious Code Agents Reveal System Prompts in Frontier LLMs [65.6660735371212]
We present textbftextscJustAsk, a framework that autonomously discovers effective extraction strategies through interaction alone.<n>It formulates extraction as an online exploration problem, using Upper Confidence Bound--based strategy selection and a hierarchical skill space spanning atomic probes and high-level orchestration.<n>Our results expose system prompts as a critical yet largely unprotected attack surface in modern agent systems.
arXiv Detail & Related papers (2026-01-29T03:53:25Z)
Faramesh: A Protocol-Agnostic Execution Control Plane for Autonomous Agent Systems [0.0]
Faramesh is a protocol-agnostic execution control plane that enforces execution-time authorization for agent-driven actions.<n>We show how these primitives yield enforceable, predictable governance for autonomous execution.
arXiv Detail & Related papers (2026-01-25T08:27:27Z)
CaMeLs Can Use Computers Too: System-level Security for Computer Use Agents [60.98294016925157]
AI agents are vulnerable to prompt injection attacks, where malicious content hijacks agent behavior to steal credentials or cause financial loss.<n>We introduce Single-Shot Planning for CUAs, where a trusted planner generates a complete execution graph with conditional branches before any observation of potentially malicious content.<n>Although this architectural isolation successfully prevents instruction injections, we show that additional measures are needed to prevent Branch Steering attacks.
arXiv Detail & Related papers (2026-01-14T23:06:35Z)
Towards Verifiably Safe Tool Use for LLM Agents [53.55621104327779]
Large language model (LLM)-based AI agents extend capabilities by enabling access to tools such as data sources, APIs, search engines, code sandboxes, and even other agents.<n>LLMs may invoke unintended tool interactions and introduce risks, such as leaking sensitive data or overwriting critical records.<n>Current approaches to mitigate these risks, such as model-based safeguards, enhance agents' reliability but cannot guarantee system safety.
arXiv Detail & Related papers (2026-01-12T21:31:38Z)
Securing the Model Context Protocol (MCP): Risks, Controls, and Governance [1.4072883206858737]
We focus on three types of adversaries that take advantage of MCP s flexibility.<n>Based on early incidents and proof-of-concept attacks, we describe how MCP can increase the attack surface.<n>We propose a set of practical controls, including per-user authentication with scoped authorization.
arXiv Detail & Related papers (2025-11-25T23:24:26Z)
IRSDA: An Agent-Orchestrated Framework for Enterprise Intrusion Response [7.470506991479105]
Intrusion Response System Digital Assistant (IRSDA) is an agent-based framework designed to deliver autonomous and policy-compliant cyber defense.<n>IRSDA incorporates a knowledge-driven architecture that integrates contextual information with AI-based reasoning to support system-guided intrusion response.<n>This work outlines a modular agent-driven approach to cyber defense that emphasizes explainability, system-state awareness, and operational control in intrusion response.
arXiv Detail & Related papers (2025-11-24T19:21:09Z)
AI Bill of Materials and Beyond: Systematizing Security Assurance through the AI Risk Scanning (AIRS) Framework [31.261980405052938]
Assurance for artificial intelligence (AI) systems remains fragmented across software supply-chain security, adversarial machine learning, and governance documentation.<n>This paper introduces the AI Risk Scanning (AIRS) Framework, a threat-model-based, evidence-generating framework designed to operationalize AI assurance.
arXiv Detail & Related papers (2025-11-16T16:10:38Z)
Adaptive Attacks on Trusted Monitors Subvert AI Control Protocols [80.68060125494645]
We study adaptive attacks by an untrusted model that knows the protocol and the monitor model.<n>We instantiate a simple adaptive attack vector by which the attacker embeds publicly known or zero-shot prompt injections in the model outputs.
arXiv Detail & Related papers (2025-10-10T15:12:44Z)
Cuckoo Attack: Stealthy and Persistent Attacks Against AI-IDE [64.47951172662745]
Cuckoo Attack is a novel attack that achieves stealthy and persistent command execution by embedding malicious payloads into configuration files.<n>We formalize our attack paradigm into two stages, including initial infection and persistence.<n>We contribute seven actionable checkpoints for vendors to evaluate their product security.
arXiv Detail & Related papers (2025-09-19T04:10:52Z)
DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agents [52.92354372596197]
Large Language Models (LLMs) are increasingly central to agentic systems due to their strong reasoning and planning capabilities.<n>This interaction also introduces the risk of prompt injection attacks, where malicious inputs from external sources can mislead the agent's behavior.<n>We propose a Dynamic Rule-based Isolation Framework for Trustworthy agentic systems, which enforces both control and data-level constraints.
arXiv Detail & Related papers (2025-06-13T05:01:09Z)

This list is automatically generated from the titles and abstracts of the papers in this site.