Agent-E: From Autonomous Web Navigation to Foundational Design Principles in Agentic Systems
- URL: http://arxiv.org/abs/2407.13032v1
- Date: Wed, 17 Jul 2024 21:44:28 GMT
- Title: Agent-E: From Autonomous Web Navigation to Foundational Design Principles in Agentic Systems
- Authors: Tamer Abuelsaad, Deepak Akkil, Prasenjit Dey, Ashish Jagmohan, Aditya Vempaty, Ravi Kokku
- Abstract summary: We present our work on building a novel web agent, Agent-E.
Agent-E introduces numerous architectural improvements over prior state-of-the-art web agents.
We show that Agent-E beats other SOTA text and multi-modal web agents on the WebVoyager benchmark in most categories by 10-30%.
- Score: 1.079505444748609
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: AI Agents are changing the way work gets done, both in consumer and enterprise domains. However, the design patterns and architectures to build highly capable agents or multi-agent systems are still developing, and the understanding of the implications of various design choices and algorithms is still evolving. In this paper, we present our work on building a novel web agent, Agent-E (our code is available at https://github.com/EmergenceAI/Agent-E). Agent-E introduces numerous architectural improvements over prior state-of-the-art web agents, such as a hierarchical architecture, a flexible DOM distillation and denoising method, and the concept of change observation to guide the agent towards more accurate performance. We first present the results of an evaluation of Agent-E on the WebVoyager benchmark dataset and show that Agent-E beats other SOTA text and multi-modal web agents on this benchmark in most categories by 10-30%. We then synthesize our learnings from the development of Agent-E into general design principles for developing agentic systems. These include the use of domain-specific primitive skills, the importance of distillation and de-noising of environmental observations, the advantages of a hierarchical architecture, and the role of agentic self-improvement to enhance agent efficiency and efficacy as the agent gathers experience.
Related papers
- Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence [79.5316642687565]
Existing multi-agent frameworks often struggle with integrating diverse capable third-party agents.
We propose the Internet of Agents (IoA), a novel framework that addresses these limitations.
IoA introduces an agent integration protocol, an instant-messaging-like architecture design, and dynamic mechanisms for agent teaming and conversation flow control.
arXiv Detail & Related papers (2024-07-09T17:33:24Z)
- EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms [55.77492625524141]
EvoAgent is a generic method to automatically extend expert agents to multi-agent systems via the evolutionary algorithm.
We show that EvoAgent can automatically generate multiple expert agents and significantly enhance the task-solving capabilities of LLM-based agents.
arXiv Detail & Related papers (2024-06-20T11:49:23Z)
- AgentGym: Evolving Large Language Model-based Agents across Diverse Environments [116.97648507802926]
Large language models (LLMs) are considered a promising foundation to build generally-capable agents.
We take the first step towards building generally-capable LLM-based agents with self-evolution ability.
We propose AgentGym, a new framework featuring a variety of environments and tasks for broad, real-time, uni-format, and concurrent agent exploration.
arXiv Detail & Related papers (2024-06-06T15:15:41Z)
- The Landscape of Emerging AI Agent Architectures for Reasoning, Planning, and Tool Calling: A Survey [0.0]
This paper examines the recent advancements in AI agent implementations.
It focuses on their ability to achieve complex goals that require enhanced reasoning, planning, and tool execution capabilities.
arXiv Detail & Related papers (2024-04-17T17:32:41Z)
- AgentStudio: A Toolkit for Building General Virtual Agents [57.02375267926862]
We introduce AgentStudio, an online, realistic, and multimodal toolkit that covers the entire lifecycle of agent development.
This includes environment setups, data collection, agent evaluation, and visualization.
We have open-sourced the environments, datasets, benchmarks, and interfaces to promote research towards developing general virtual agents.
arXiv Detail & Related papers (2024-03-26T17:54:15Z)
- An Interactive Agent Foundation Model [49.77861810045509]
We propose an Interactive Agent Foundation Model that uses a novel multi-task agent training paradigm for training AI agents.
Our training paradigm unifies diverse pre-training strategies, including visual masked auto-encoders, language modeling, and next-action prediction.
We demonstrate the performance of our framework across three separate domains -- Robotics, Gaming AI, and Healthcare.
arXiv Detail & Related papers (2024-02-08T18:58:02Z)
- An In-depth Survey of Large Language Model-based Artificial Intelligence Agents [11.774961923192478]
We have explored the core differences and characteristics between LLM-based AI agents and traditional AI agents.
We conducted an in-depth analysis of the key components of AI agents, including planning, memory, and tool use.
arXiv Detail & Related papers (2023-09-23T11:25:45Z)
- The Rise and Potential of Large Language Model Based Agents: A Survey [91.71061158000953]
Large language models (LLMs) are regarded as potential sparks for Artificial General Intelligence (AGI).
We start by tracing the concept of agents from its philosophical origins to its development in AI, and explain why LLMs are suitable foundations for agents.
We explore the extensive applications of LLM-based agents in three aspects: single-agent scenarios, multi-agent scenarios, and human-agent cooperation.
arXiv Detail & Related papers (2023-09-14T17:12:03Z)
- Toward a Reasoning and Learning Architecture for Ad Hoc Teamwork [4.454557728745761]
We present an architecture for ad hoc teamwork, which refers to collaboration in a team of agents without prior coordination.
Our architecture combines the principles of knowledge-based and data-driven reasoning and learning.
We use the benchmark simulated multiagent collaboration domain Fort Attack to demonstrate that our architecture supports adaptation to unforeseen changes.
arXiv Detail & Related papers (2022-08-24T13:57:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.