AI Agent Systems: Architectures, Applications, and Evaluation
- URL: http://arxiv.org/abs/2601.01743v1
- Date: Mon, 05 Jan 2026 02:38:40 GMT
- Title: AI Agent Systems: Architectures, Applications, and Evaluation
- Authors: Bin Xu
- Abstract summary: AI agents combine foundation models with reasoning, planning, memory, and tool use. We organize prior work into a unified taxonomy spanning agent components. We discuss key design trade-offs -- latency vs. accuracy, autonomy vs. controllability, and capability vs. reliability.
- Score: 4.967019713320407
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: AI agents -- systems that combine foundation models with reasoning, planning, memory, and tool use -- are rapidly becoming a practical interface between natural-language intent and real-world computation. This survey synthesizes the emerging landscape of AI agent architectures across: (i) deliberation and reasoning (e.g., chain-of-thought-style decomposition, self-reflection and verification, and constraint-aware decision making), (ii) planning and control (from reactive policies to hierarchical and multi-step planners), and (iii) tool calling and environment interaction (retrieval, code execution, APIs, and multimodal perception). We organize prior work into a unified taxonomy spanning agent components (policy/LLM core, memory, world models, planners, tool routers, and critics), orchestration patterns (single-agent vs. multi-agent; centralized vs. decentralized coordination), and deployment settings (offline analysis vs. online interactive assistance; safety-critical vs. open-ended tasks). We discuss key design trade-offs -- latency vs. accuracy, autonomy vs. controllability, and capability vs. reliability -- and highlight how evaluation is complicated by non-determinism, long-horizon credit assignment, tool and environment variability, and hidden costs such as retries and context growth. Finally, we summarize measurement and benchmarking practices (task suites, human preference and utility metrics, success under constraints, robustness and security) and identify open challenges including verification and guardrails for tool actions, scalable memory and context management, interpretability of agent decisions, and reproducible evaluation under realistic workloads.
Related papers
- The Why Behind the Action: Unveiling Internal Drivers via Agentic Attribution [63.61358761489141]
Large Language Model (LLM)-based agents are widely used in real-world applications such as customer service, web navigation, and software engineering. We propose a novel framework for general agentic attribution, designed to identify the internal factors driving agent actions regardless of the task outcome. We validate our framework across a diverse suite of agentic scenarios, including standard tool use and subtle reliability risks like memory-induced bias.
arXiv Detail & Related papers (2026-01-21T15:22:21Z)
- Agentic Reasoning for Large Language Models [122.81018455095999]
Reasoning is a fundamental cognitive process underlying inference, problem-solving, and decision-making. Large language models (LLMs) demonstrate strong reasoning capabilities in closed-world settings, but struggle in open-ended and dynamic environments. Agentic reasoning marks a paradigm shift by reframing LLMs as autonomous agents that plan, act, and learn through continual interaction.
arXiv Detail & Related papers (2026-01-18T18:58:23Z)
- The Path Ahead for Agentic AI: Challenges and Opportunities [4.52683540940001]
This chapter examines the emergence of agentic AI systems that operate autonomously in complex environments. We trace the architectural progression from statistical models to transformer-based systems, identifying capabilities that enable agentic behavior. Unlike existing surveys, we focus on the architectural transition from language understanding to autonomous action, emphasizing the technical gaps that must be resolved before deployment.
arXiv Detail & Related papers (2026-01-06T06:31:42Z)
- Architectures for Building Agentic AI [0.0]
This chapter argues that the reliability of agentic and generative AI is chiefly an architectural property. Building on classical foundations, we propose a practical taxonomy: tool-using agents, memory-augmented agents, planning and self-improvement agents, multi-agent systems, and embodied or web agents.
arXiv Detail & Related papers (2025-12-10T09:28:40Z)
- Towards 6G Native-AI Edge Networks: A Semantic-Aware and Agentic Intelligence Paradigm [85.7583231789615]
6G positions intelligence as a native network capability, transforming the design of radio access networks (RANs). Within this vision, semantic-native communication and agentic intelligence are expected to play central roles. Agentic intelligence endows distributed RAN entities with goal-driven autonomy, reasoning, planning, and multi-agent collaboration.
arXiv Detail & Related papers (2025-12-04T03:09:33Z)
- AI Agentic Programming: A Survey of Techniques, Challenges, and Opportunities [8.086360127362815]
Large language model (LLM)-based coding agents autonomously plan, execute, and interact with tools such as compilers, debuggers, and version control systems. Unlike conventional code generation, these agents decompose goals, coordinate multi-step processes, and adapt based on feedback, reshaping software development practices.
arXiv Detail & Related papers (2025-08-15T00:14:31Z)
- A Survey of Self-Evolving Agents: On Path to Artificial Super Intelligence [87.08051686357206]
Large Language Models (LLMs) have demonstrated strong capabilities but remain fundamentally static. As LLMs are increasingly deployed in open-ended, interactive environments, this static nature has become a critical bottleneck. This survey provides the first systematic and comprehensive review of self-evolving agents.
arXiv Detail & Related papers (2025-07-28T17:59:05Z)
- Deep Research Agents: A Systematic Examination And Roadmap [109.53237992384872]
Deep Research (DR) agents are designed to tackle complex, multi-turn informational research tasks. In this paper, we conduct a detailed analysis of the foundational technologies and architectural components that constitute DR agents.
arXiv Detail & Related papers (2025-06-22T16:52:48Z)
- Towards Pervasive Distributed Agentic Generative AI -- A State of The Art [0.0]
The rapid advancement of intelligent agents and Large Language Models (LLMs) is reshaping the pervasive computing field. This survey outlines the architectural components of LLM agents and examines their deployment and evaluation across various scenarios. It highlights state-of-the-art agent deployment strategies and applications, including local and distributed execution on resource-constrained devices.
arXiv Detail & Related papers (2025-06-16T10:15:06Z)
- HADA: Human-AI Agent Decision Alignment Architecture [0.0]
HADA is a protocol- and framework reference architecture that keeps both large language model (LLM) agents and legacy algorithms aligned with organizational targets and values. Technical and non-technical actors can query, steer, audit, or contest every decision across strategic, tactical, and real-time horizons.
arXiv Detail & Related papers (2025-06-01T14:04:52Z)
- Interactive Agents to Overcome Ambiguity in Software Engineering [61.40183840499932]
AI agents are increasingly being deployed to automate tasks, often based on ambiguous and underspecified user instructions. Making unwarranted assumptions and failing to ask clarifying questions can lead to suboptimal outcomes. We study the ability of LLM agents to handle ambiguous instructions in interactive code generation settings by evaluating proprietary and open-weight models on their performance.
arXiv Detail & Related papers (2025-02-18T17:12:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.