AgentArcEval: An Architecture Evaluation Method for Foundation Model based Agents
- URL: http://arxiv.org/abs/2510.21031v1
- Date: Thu, 23 Oct 2025 22:32:03 GMT
- Title: AgentArcEval: An Architecture Evaluation Method for Foundation Model based Agents
- Authors: Qinghua Lu, Dehai Zhao, Yue Liu, Hao Zhang, Liming Zhu, Xiwei Xu, Angela Shi, Tristan Tan, Rick Kazman,
- Abstract summary: We present AgentArcEval, a novel agent architecture evaluation method designed specially to address the complexities of FM-based agent architecture.<n>We also present a catalogue of agent-specific general scenarios, which serves as a guide for generating concrete scenarios to design and evaluate the agent architecture.
- Score: 25.51779417301816
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The emergence of foundation models (FMs) has enabled the development of highly capable and autonomous agents, unlocking new application opportunities across a wide range of domains. Evaluating the architecture of agents is particularly important as the architectural decisions significantly impact the quality attributes of agents given their unique characteristics, including compound architecture, autonomous and non-deterministic behaviour, and continuous evolution. However, these traditional methods fall short in addressing the evaluation needs of agent architecture due to the unique characteristics of these agents. Therefore, in this paper, we present AgentArcEval, a novel agent architecture evaluation method designed specially to address the complexities of FM-based agent architecture and its evaluation. Moreover, we present a catalogue of agent-specific general scenarios, which serves as a guide for generating concrete scenarios to design and evaluate the agent architecture. We demonstrate the usefulness of AgentArcEval and the catalogue through a case study on the architecture evaluation of a real-world tax copilot, named Luna.
Related papers
- Designing Domain-Specific Agents via Hierarchical Task Abstraction Mechanism [61.01709143437043]
We introduce a novel agent design framework centered on a Hierarchical Task Abstraction Mechanism (HTAM)<n>Specifically, HTAM moves beyond emulating social roles, instead structuring multi-agent systems into a logical hierarchy that mirrors the intrinsic task-dependency graph of a given domain.<n>We instantiate this framework as EarthAgent, a multi-agent system tailored for complex geospatial analysis.
arXiv Detail & Related papers (2025-11-21T12:25:47Z) - Reimagining Agent-based Modeling with Large Language Model Agents via Shachi [16.625794969005966]
The study of emergent behaviors in large language model (LLM)-driven multi-agent systems is a critical research challenge.<n>We introduce Shachi, a formal methodology and modular framework that decomposes an agent's policy into core cognitive components.<n>We validate our methodology on a comprehensive 10-task benchmark and demonstrate its power through novel scientific inquiries.
arXiv Detail & Related papers (2025-09-26T04:38:59Z) - AgentArch: A Comprehensive Benchmark to Evaluate Agent Architectures in Enterprise [0.0]
We examine four critical agentic system dimensions: orchestration strategy, agent prompt implementation (ReAct versus function calling), memory architecture, and thinking tool integration.<n>Our benchmark reveals significant model-specific architectural preferences that challenge the prevalent one-size-fits-all paradigm in agentic AI systems.
arXiv Detail & Related papers (2025-09-13T01:18:23Z) - A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems [53.37728204835912]
Most existing AI systems rely on manually crafted configurations that remain static after deployment.<n>Recent research has explored agent evolution techniques that aim to automatically enhance agent systems based on interaction data and environmental feedback.<n>This survey aims to provide researchers and practitioners with a systematic understanding of self-evolving AI agents.
arXiv Detail & Related papers (2025-08-10T16:07:32Z) - Cognitive Kernel-Pro: A Framework for Deep Research Agents and Agent Foundation Models Training [67.895981259683]
General AI Agents are increasingly recognized as foundational frameworks for the next generation of artificial intelligence.<n>Current agent systems are either closed-source or heavily reliant on a variety of paid APIs and proprietary tools.<n>We present Cognitive Kernel-Pro, a fully open-source and (to the maximum extent) free multi-module agent framework.
arXiv Detail & Related papers (2025-08-01T08:11:31Z) - Deep Research Agents: A Systematic Examination And Roadmap [109.53237992384872]
Deep Research (DR) agents are designed to tackle complex, multi-turn informational research tasks.<n>In this paper, we conduct a detailed analysis of the foundational technologies and architectural components that constitute DR agents.
arXiv Detail & Related papers (2025-06-22T16:52:48Z) - The Architecture Tradeoff and Risk Analysis Framework (ATRAF): A Unified Approach for Evaluating Software Architectures, Reference Architectures, and Architectural Frameworks [0.0]
We introduce the Architecture Tradeoff and Risk Analysis Framework (ATRAF)<n>ATRAF is a scenario-driven framework for evaluating tradeoffs and risks across architectural levels.<n>It enables the identification of sensitivities, tradeoffs, and risks while supporting continuous refinement of architectural artifacts.
arXiv Detail & Related papers (2025-05-01T17:48:52Z) - Large Language Model Agent: A Survey on Methodology, Applications and Challenges [88.3032929492409]
Large Language Model (LLM) agents, with goal-driven behaviors and dynamic adaptation capabilities, potentially represent a critical pathway toward artificial general intelligence.<n>This survey systematically deconstructs LLM agent systems through a methodology-centered taxonomy.<n>Our work provides a unified architectural perspective, examining how agents are constructed, how they collaborate, and how they evolve over time.
arXiv Detail & Related papers (2025-03-27T12:50:17Z) - A Survey of Model Architectures in Information Retrieval [59.61734783818073]
The period from 2019 to the present has represented one of the biggest paradigm shifts in information retrieval (IR) and natural language processing (NLP)<n>We trace the development from traditional term-based methods to modern neural approaches, particularly highlighting the impact of transformer-based models and subsequent large language models (LLMs)<n>We conclude with a forward-looking discussion of emerging challenges and future directions.
arXiv Detail & Related papers (2025-02-20T18:42:58Z) - A Taxonomy of Architecture Options for Foundation Model-based Agents: Analysis and Decision Model [25.78239568393706]
This paper introduces a taxonomy focused on the architectures of foundation-model-based agents.
By unifying and detailing these classifications, our taxonomy aims to improve the design of foundation-model-based agents.
arXiv Detail & Related papers (2024-08-06T03:10:52Z) - Towards Responsible Generative AI: A Reference Architecture for Designing Foundation Model based Agents [28.406492378232695]
Foundation model based agents derive their autonomy from the capabilities of foundation models.
This paper presents a pattern-oriented reference architecture that serves as guidance when designing foundation model based agents.
arXiv Detail & Related papers (2023-11-22T04:21:47Z) - Kernel Based Cognitive Architecture for Autonomous Agents [91.3755431537592]
This paper considers an evolutionary approach to creating a cognitive functionality.
We consider a cognitive architecture which ensures the evolution of the agent on the basis of Symbol Emergence Problem solution.
arXiv Detail & Related papers (2022-07-02T12:41:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.