A Taxonomy of AgentOps for Enabling Observability of Foundation Model based Agents
- URL: http://arxiv.org/abs/2411.05285v1
- Date: Fri, 08 Nov 2024 02:31:03 GMT
- Title: A Taxonomy of AgentOps for Enabling Observability of Foundation Model based Agents
- Authors: Liming Dong, Qinghua Lu, Liming Zhu
- Abstract summary: LLMs have fueled the growth of a diverse range of downstream tasks, leading to an increased demand for AI automation.
As AI agent systems tackle more complex tasks and evolve, they involve a wider range of stakeholders.
These systems integrate multiple components such as AI agent, RAG pipelines, prompt management, agent capabilities, and observability features.
It is essential to shift towards designing AgentOps platforms that ensure observability and traceability across the entire development-to-production life-cycle.
- Score: 12.49728300301026
- Abstract: The ever-improving quality of LLMs has fueled the growth of a diverse range of downstream tasks, leading to an increased demand for AI automation and a burgeoning interest in developing foundation model (FM)-based autonomous agents. As AI agent systems tackle more complex tasks and evolve, they involve a wider range of stakeholders, including agent users, agentic system developers and deployers, and AI model developers. These systems also integrate multiple components such as AI agent workflows, RAG pipelines, prompt management, agent capabilities, and observability features. In this case, obtaining reliable outputs and answers from these agents remains challenging, necessitating a dependable execution process and end-to-end observability solutions. To build reliable AI agents and LLM applications, it is essential to shift towards designing AgentOps platforms that ensure observability and traceability across the entire development-to-production life-cycle. To this end, we conducted a rapid review and identified relevant AgentOps tools from the agentic ecosystem. Based on this review, we provide an overview of the essential features of AgentOps and propose a comprehensive overview of observability data/traceable artifacts across the agent production life-cycle. Our findings provide a systematic overview of the current AgentOps landscape, emphasizing the critical role of observability/traceability in enhancing the reliability of autonomous agent systems.
Related papers
- AIOpsLab: A Holistic Framework to Evaluate AI Agents for Enabling Autonomous Clouds [12.464941027105306]
AI for IT Operations (AIOps) aims to automate complex operational tasks, such as fault localization and root cause analysis, to reduce human workload and minimize customer impact.
Recent advances in Large Language Models (LLMs) and AI agents are revolutionizing AIOps by enabling end-to-end and multitask automation.
We present AIOPSLAB, a framework that not only deploys microservice cloud environments, injects faults, generates workloads, and exports telemetry data, but also orchestrates these components and provides interfaces for interacting with and evaluating agents.
arXiv Detail & Related papers (2025-01-12T04:17:39Z) - Watson: A Cognitive Observability Framework for the Reasoning of Foundation Model-Powered Agents [7.392058124132526]
Foundation models (FMs) play an increasingly prominent role in complex software systems, such as FM-powered agentic software (i.e., Agentware).
Unlike traditional software, agents operate autonomously, using opaque data and implicit reasoning, making it difficult to observe and understand their behavior during runtime.
We propose cognitive observability as a new type of required observability that has emerged for such innovative systems.
arXiv Detail & Related papers (2024-11-05T19:13:22Z) - Proactive Agent: Shifting LLM Agents from Reactive Responses to Active Assistance [95.03771007780976]
We tackle the challenge of developing proactive agents capable of anticipating and initiating tasks without explicit human instructions.
First, we collect real-world human activities to generate proactive task predictions.
These predictions are labeled by human annotators as either accepted or rejected.
The labeled data is used to train a reward model that simulates human judgment.
arXiv Detail & Related papers (2024-10-16T08:24:09Z) - Agent-as-a-Judge: Evaluate Agents with Agents [61.33974108405561]
We introduce the Agent-as-a-Judge framework, wherein agentic systems are used to evaluate agentic systems.
This is an organic extension of the LLM-as-a-Judge framework, incorporating agentic features that enable intermediate feedback for the entire task-solving process.
We present DevAI, a new benchmark of 55 realistic automated AI development tasks.
arXiv Detail & Related papers (2024-10-14T17:57:02Z) - Gödel Agent: A Self-Referential Agent Framework for Recursive Self-Improvement [117.94654815220404]
Gödel Agent is a self-evolving framework inspired by the Gödel machine.
Gödel Agent can achieve continuous self-improvement, surpassing manually crafted agents in performance, efficiency, and generalizability.
arXiv Detail & Related papers (2024-10-06T10:49:40Z) - Large Model Based Agents: State-of-the-Art, Cooperation Paradigms, Security and Privacy, and Future Trends [64.57762280003618]
It is foreseeable that in the near future, LM-driven general AI agents will serve as essential tools in production tasks.
This paper investigates scenarios involving the autonomous collaboration of future LM agents.
arXiv Detail & Related papers (2024-09-22T14:09:49Z) - Security of AI Agents [5.468745160706382]
We identify and describe potential vulnerabilities in AI agents in detail from a system security perspective.
We introduce defense mechanisms corresponding to each vulnerability with design and experiments to evaluate their viability.
This paper contextualizes the security issues in the current development of AI agents and delineates methods to make AI agents safer and more reliable.
arXiv Detail & Related papers (2024-06-12T23:16:45Z) - AgentGym: Evolving Large Language Model-based Agents across Diverse Environments [116.97648507802926]
Large language models (LLMs) are considered a promising foundation to build such agents.
We take the first step towards building generally-capable LLM-based agents with self-evolution ability.
We propose AgentGym, a new framework featuring a variety of environments and tasks for broad, real-time, uni-format, and concurrent agent exploration.
arXiv Detail & Related papers (2024-06-06T15:15:41Z) - KwaiAgents: Generalized Information-seeking Agent System with Large Language Models [33.59597020276034]
Humans excel in critical thinking, planning, reflection, and harnessing available tools to interact with and interpret the world.
Recent advancements in large language models (LLMs) suggest that machines might also possess the aforementioned human-like capabilities.
We introduce KwaiAgents, a generalized information-seeking agent system based on LLMs.
arXiv Detail & Related papers (2023-12-08T08:11:11Z) - The Rise and Potential of Large Language Model Based Agents: A Survey [91.71061158000953]
Large language models (LLMs) are regarded as potential sparks for Artificial General Intelligence (AGI).
We start by tracing the concept of agents from its philosophical origins to its development in AI, and explain why LLMs are suitable foundations for agents.
We explore the extensive applications of LLM-based agents in three aspects: single-agent scenarios, multi-agent scenarios, and human-agent cooperation.
arXiv Detail & Related papers (2023-09-14T17:12:03Z) - AGI Agent Safety by Iteratively Improving the Utility Function [0.0]
We present an AGI safety layer that creates a special dedicated input terminal to support the iterative improvement of an AGI agent's utility function.
We show ongoing work in mapping it to a Causal Influence Diagram (CID).
We then present the design of a learning agent, a design that wraps the safety layer around either a known machine learning system, or a potential future AGI-level learning system.
arXiv Detail & Related papers (2020-07-10T14:30:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.