Related papers: AgentTrace: A Structured Logging Framework for Agent System Observability

AgentTrace: A Structured Logging Framework for Agent System Observability

URL: http://arxiv.org/abs/2602.10133v1
Date: Sat, 07 Feb 2026 04:04:59 GMT
Title: AgentTrace: A Structured Logging Framework for Agent System Observability
Authors: Adam AlSayyad, Kelvin Yuxiang Huang, Richik Pal,
Abstract summary: AgentTrace is a dynamic observability and telemetry framework designed to fill this gap.<n>Unlike traditional logging systems, AgentTrace emphasizes continuous, introspectable trace capture.<n>Our research highlights how AgentTrace can enable more reliable agent deployment, fine-grained risk analysis, and informed trust calibration.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Despite the growing capabilities of autonomous agents powered by large language models (LLMs), their adoption in high-stakes domains remains limited. A key barrier is security: the inherently nondeterministic behavior of LLM agents defies static auditing approaches that have historically underpinned software assurance. Existing security methods, such as proxy-level input filtering and model glassboxing, fail to provide sufficient transparency or traceability into agent reasoning, state changes, or environmental interactions. In this work, we introduce AgentTrace, a dynamic observability and telemetry framework designed to fill this gap. AgentTrace instruments agents at runtime with minimal overhead, capturing a rich stream of structured logs across three surfaces: operational, cognitive, and contextual. Unlike traditional logging systems, AgentTrace emphasizes continuous, introspectable trace capture, designed not just for debugging or benchmarking, but as a foundational layer for agent security, accountability, and real-time monitoring. Our research highlights how AgentTrace can enable more reliable agent deployment, fine-grained risk analysis, and informed trust calibration, thereby addressing critical concerns that have so far limited the use of LLM agents in sensitive environments.

Related papers

TrajAD: Trajectory Anomaly Detection for Trustworthy LLM Agents [47.147717604167376]
Trajectory Anomaly Detection essential for enabling efficient rollback-and-retry.<n>General-purpose LLMs struggle to identify and localize these anomalies.<n>We propose TrajAD, a specialized verifier trained with fine-grained process supervision.
arXiv Detail & Related papers (2026-02-06T07:13:49Z)
The Why Behind the Action: Unveiling Internal Drivers via Agentic Attribution [63.61358761489141]
Large Language Model (LLM)-based agents are widely used in real-world applications such as customer service, web navigation, and software engineering.<n>We propose a novel framework for textbfgeneral agentic attribution, designed to identify the internal factors driving agent actions regardless of the task outcome.<n>We validate our framework across a diverse suite of agentic scenarios, including standard tool use and subtle reliability risks like memory-induced bias.
arXiv Detail & Related papers (2026-01-21T15:22:21Z)
Towards Verifiably Safe Tool Use for LLM Agents [53.55621104327779]
Large language model (LLM)-based AI agents extend capabilities by enabling access to tools such as data sources, APIs, search engines, code sandboxes, and even other agents.<n>LLMs may invoke unintended tool interactions and introduce risks, such as leaking sensitive data or overwriting critical records.<n>Current approaches to mitigate these risks, such as model-based safeguards, enhance agents' reliability but cannot guarantee system safety.
arXiv Detail & Related papers (2026-01-12T21:31:38Z)
Agentic AI for Autonomous Defense in Software Supply Chain Security: Beyond Provenance to Vulnerability Mitigation [0.0]
The current paper includes an example of agentic artificial intelligence (AI) based on autonomous software supply chain security.<n>It combines large language model (LLM)-based reasoning, reinforcement learning (RL), and multi-agent coordination.<n>Results show that agentic AI can facilitate the transition to self defending, proactive software supply chains.
arXiv Detail & Related papers (2025-12-29T14:06:09Z)
AgentGuard: Runtime Verification of AI Agents [1.14219428942199]
AgentGuard is a framework for runtime verification of Agentic AI systems.<n>It provides continuous, quantitative assurance through a new paradigm called Dynamic Probabilistic Assurance.
arXiv Detail & Related papers (2025-09-28T13:08:50Z)
AgentSight: System-Level Observability for AI Agents Using eBPF [10.37440633887049]
Existing tools observe either an agent's high-level intent (via LLM prompts) or its low-level actions (e.g., system calls) but cannot correlate these two views.<n>We introduce AgentSight, an AgentOps observability framework that bridges this semantic gap using a hybrid approach.<n>AgentSight intercepts TLS-encrypted LLM traffic to extract semantic intent, monitors kernel events to observe system-wide effects, and causally correlates these two streams across process boundaries.
arXiv Detail & Related papers (2025-08-02T01:43:39Z)
OpenAgentSafety: A Comprehensive Framework for Evaluating Real-World AI Agent Safety [58.201189860217724]
We introduce OpenAgentSafety, a comprehensive framework for evaluating agent behavior across eight critical risk categories.<n>Unlike prior work, our framework evaluates agents that interact with real tools, including web browsers, code execution environments, file systems, bash shells, and messaging platforms.<n>It combines rule-based analysis with LLM-as-judge assessments to detect both overt and subtle unsafe behaviors.
arXiv Detail & Related papers (2025-07-08T16:18:54Z)
DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agents [52.92354372596197]
Large Language Models (LLMs) are increasingly central to agentic systems due to their strong reasoning and planning capabilities.<n>This interaction also introduces the risk of prompt injection attacks, where malicious inputs from external sources can mislead the agent's behavior.<n>We propose a Dynamic Rule-based Isolation Framework for Trustworthy agentic systems, which enforces both control and data-level constraints.
arXiv Detail & Related papers (2025-06-13T05:01:09Z)
CoTGuard: Using Chain-of-Thought Triggering for Copyright Protection in Multi-Agent LLM Systems [55.57181090183713]
We introduce CoTGuard, a novel framework for copyright protection that leverages trigger-based detection within Chain-of-Thought reasoning.<n>Specifically, we can activate specific CoT segments and monitor intermediate reasoning steps for unauthorized content reproduction by embedding specific trigger queries into agent prompts.<n>This approach enables fine-grained, interpretable detection of copyright violations in collaborative agent scenarios.
arXiv Detail & Related papers (2025-05-26T01:42:37Z)
AgentOps: Enabling Observability of LLM Agents [12.49728300301026]
Large language model (LLM) agents raise significant concerns on AI safety due to their autonomous and non-deterministic behavior.<n>We present a comprehensive taxonomy of AgentOps, identifying the artifacts and associated data that should be traced throughout the entire lifecycle of agents to achieve effective observability.<n>Our taxonomy serves as a reference template for developers to design and implement AgentOps infrastructure that supports monitoring, logging, and analytics.
arXiv Detail & Related papers (2024-11-08T02:31:03Z)

This list is automatically generated from the titles and abstracts of the papers in this site.