SOP-Agent: Empower General Purpose AI Agent with Domain-Specific SOPs
- URL: http://arxiv.org/abs/2501.09316v1
- Date: Thu, 16 Jan 2025 06:14:58 GMT
- Title: SOP-Agent: Empower General Purpose AI Agent with Domain-Specific SOPs
- Authors: Anbang Ye, Qianran Ma, Jia Chen, Muqi Li, Tong Li, Fujiao Liu, Siqi Mai, Meichen Lu, Haitao Bao, Yang You,
- Abstract summary: General-purpose AI agents struggle to efficiently utilize domain-specific knowledge and human expertise.<n>We introduce the Standard Operational Procedure-guided Agent ( SOP-agent), a novel framework for constructing domain-specific agents.<n> SOP-agent demonstrates excellent versatility, achieving performance superior to general-purpose agent frameworks.
- Score: 9.117180930298813
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite significant advancements in general-purpose AI agents, several challenges still hinder their practical application in real-world scenarios. First, the limited planning capabilities of Large Language Models (LLM) restrict AI agents from effectively solving complex tasks that require long-horizon planning. Second, general-purpose AI agents struggle to efficiently utilize domain-specific knowledge and human expertise. In this paper, we introduce the Standard Operational Procedure-guided Agent (SOP-agent), a novel framework for constructing domain-specific agents through pseudocode-style Standard Operational Procedures (SOPs) written in natural language. Formally, we represent a SOP as a decision graph, which is traversed to guide the agent in completing tasks specified by the SOP. We conduct extensive experiments across tasks in multiple domains, including decision-making, search and reasoning, code generation, data cleaning, and grounded customer service. The SOP-agent demonstrates excellent versatility, achieving performance superior to general-purpose agent frameworks and comparable to domain-specific agent systems. Additionally, we introduce the Grounded Customer Service Benchmark, the first benchmark designed to evaluate the grounded decision-making capabilities of AI agents in customer service scenarios based on SOPs.
Related papers
- Proposer-Agent-Evaluator(PAE): Autonomous Skill Discovery For Foundation Model Internet Agents [64.75036903373712]
Proposer-Agent-Evaluator is a learning system that enables foundation model agents to autonomously discover and practice skills in the wild.<n>At the heart of PAE is a context-aware task proposer that autonomously proposes tasks for the agent to practice with context information.<n>The success evaluation serves as the reward signal for the agent to refine its policies through RL.
arXiv Detail & Related papers (2024-12-17T18:59:50Z) - AgentOccam: A Simple Yet Strong Baseline for LLM-Based Web Agents [52.13695464678006]
This study enhances an LLM-based web agent by simply refining its observation and action space.
AgentOccam surpasses the previous state-of-the-art and concurrent work by 9.8 (+29.4%) and 5.9 (+15.8%) absolute points respectively.
arXiv Detail & Related papers (2024-10-17T17:50:38Z) - Agent S: An Open Agentic Framework that Uses Computers Like a Human [31.16046798529319]
We present Agent S, an open agentic framework that enables autonomous interaction with computers through a Graphical User Interface (GUI)
Agent S aims to address three key challenges in automating computer tasks: acquiring domain-specific knowledge, planning over long task horizons, and handling dynamic, non-uniform interfaces.
arXiv Detail & Related papers (2024-10-10T17:43:51Z) - Agent-Oriented Planning in Multi-Agent Systems [54.429028104022066]
We propose AOP, a novel framework for agent-oriented planning in multi-agent systems.
In this study, we identify three critical design principles of agent-oriented planning, including solvability, completeness, and non-redundancy.
Extensive experiments demonstrate the advancement of AOP in solving real-world problems compared to both single-agent systems and existing planning strategies for multi-agent systems.
arXiv Detail & Related papers (2024-10-03T04:07:51Z) - Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence [79.5316642687565]
Existing multi-agent frameworks often struggle with integrating diverse capable third-party agents.
We propose the Internet of Agents (IoA), a novel framework that addresses these limitations.
IoA introduces an agent integration protocol, an instant-messaging-like architecture design, and dynamic mechanisms for agent teaming and conversation flow control.
arXiv Detail & Related papers (2024-07-09T17:33:24Z) - CAAP: Context-Aware Action Planning Prompting to Solve Computer Tasks with Front-End UI Only [21.054681757006385]
We propose an agent that perceives its environment solely through screenshot images.<n>By leveraging the reasoning capability of the Large Language Models, we eliminate the need for large-scale human demonstration data.<n>Agent achieves an average success rate of 94.5% on MiniWoB++ and an average task score of 62.3 on WebShop.
arXiv Detail & Related papers (2024-06-11T05:21:20Z) - CACA Agent: Capability Collaboration based AI Agent [18.84686313298908]
We propose CACA Agent (Capability Collaboration based AI Agent) using an open architecture inspired by service computing.
CACA Agent integrates a set of collaborative capabilities to implement AI Agents, not only reducing the dependence on a single LLM.
We present a demo to illustrate the operation and the application scenario extension of CACA Agent.
arXiv Detail & Related papers (2024-03-22T11:42:47Z) - The Rise and Potential of Large Language Model Based Agents: A Survey [91.71061158000953]
Large language models (LLMs) are regarded as potential sparks for Artificial General Intelligence (AGI)
We start by tracing the concept of agents from its philosophical origins to its development in AI, and explain why LLMs are suitable foundations for agents.
We explore the extensive applications of LLM-based agents in three aspects: single-agent scenarios, multi-agent scenarios, and human-agent cooperation.
arXiv Detail & Related papers (2023-09-14T17:12:03Z) - Toward Policy Explanations for Multi-Agent Reinforcement Learning [18.33682005623418]
We present novel methods to generate two types of policy explanations for MARL.
Experimental results on three MARL domains demonstrate the scalability of our methods.
A user study shows that the generated explanations significantly improve user performance and increase subjective ratings on metrics such as user satisfaction.
arXiv Detail & Related papers (2022-04-26T20:07:08Z) - Modelling Multi-Agent Epistemic Planning in ASP [66.76082318001976]
This paper presents an implementation of a multi-shot Answer Set Programming-based planner that can reason in multi-agent epistemic settings.
The paper shows how the planner, exploiting an ad-hoc epistemic state representation and the efficiency of ASP solvers, has competitive performance results on benchmarks collected from the literature.
arXiv Detail & Related papers (2020-08-07T06:35:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.