Related papers: Inefficiencies of Meta Agents for Agent Design

URL: http://arxiv.org/abs/2510.06711v1
Date: Wed, 08 Oct 2025 07:06:17 GMT
Title: Inefficiencies of Meta Agents for Agent Design
Authors: Batu El, Mert Yuksekgonul, James Zou
Abstract summary: We examine three key challenges in a common class of meta-agents. First, we investigate how a meta-agent learns across iterations. Second, although the meta-agent designs multiple agents during training, it typically commits to a single agent at test time.
Score: 25.46718879564119
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recent works began to automate the design of agentic systems using meta-agents that propose and iteratively refine new agent architectures. In this paper, we examine three key challenges in a common class of meta-agents. First, we investigate how a meta-agent learns across iterations and find that simply expanding the context with all previous agents, as proposed by previous works, performs worse than ignoring prior designs entirely. We show that the performance improves with an evolutionary approach. Second, although the meta-agent designs multiple agents during training, it typically commits to a single agent at test time. We find that the designed agents have low behavioral diversity, limiting the potential for their complementary use. Third, we assess when automated design is economically viable. We find that only in a few cases--specifically, two datasets--the overall cost of designing and deploying the agents is lower than that of human-designed agents when deployed on over 15,000 examples. In contrast, the performance gains for other datasets do not justify the design cost, regardless of scale.
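The economic-viability claim in the abstract can be read as a break-even calculation. The sketch below is a simplified illustration, not the paper's actual cost accounting: it assumes a one-time design cost plus a constant per-example deployment cost, and the function name, parameters, and all numbers are hypothetical.

```python
import math
from typing import Optional


def break_even_examples(
    design_cost: float,
    designed_cost_per_example: float,
    baseline_cost_per_example: float,
) -> Optional[int]:
    """Smallest number of deployed examples at which the automatically
    designed agent is strictly cheaper in total than the baseline.

    Assumed (hypothetical) cost model:
        designed total = design_cost + n * designed_cost_per_example
        baseline total = n * baseline_cost_per_example
    Returns None when the designed agent never becomes cheaper.
    """
    saving = baseline_cost_per_example - designed_cost_per_example
    if saving <= 0:
        # No per-example saving: the design cost is never recovered,
        # regardless of deployment scale.
        return None
    # Need n > design_cost / saving, so take the next integer above it.
    return math.floor(design_cost / saving) + 1


# Hypothetical numbers, chosen only to illustrate the two regimes.
print(break_even_examples(100.0, 0.5, 1.0))  # saving of 0.5 per example
print(break_even_examples(100.0, 1.0, 1.0))  # no saving per example
```

Under this model, a dataset where the designed agent offers no per-example saving never reaches break-even, which mirrors the abstract's "regardless of scale" case.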

Related papers

AgentArk: Distilling Multi-Agent Intelligence into a Single LLM Agent [57.10083973844841]
AgentArk is a novel framework to distill multi-agent dynamics into the weights of a single model. We investigate three hierarchical distillation strategies across various models, tasks, scaling, and scenarios. By shifting the burden of computation from inference to training, the distilled models preserve the efficiency of one agent while exhibiting strong reasoning and self-correction performance of multiple agents.
arXiv Detail & Related papers (2026-02-03T19:18:28Z)
ReCreate: Reasoning and Creating Domain Agents Driven by Experience [14.353866611611672]
ReCreate is an experience-driven framework for the automatic creation of domain agents. We introduce an agent-as-optimizer paradigm that effectively learns from experience. In experiments across diverse domains, ReCreate consistently outperforms human-designed agents.
arXiv Detail & Related papers (2026-01-16T09:00:03Z)
Alita-G: Self-Evolving Generative Agent for Agent Generation [54.49365835457433]
We present ALITA-G, a framework that transforms a general-purpose agent into a domain expert. In this framework, a generalist agent executes a curated suite of target-domain tasks. It attains strong gains while reducing computation costs.
arXiv Detail & Related papers (2025-10-27T17:59:14Z)
Holistic Agent Leaderboard: The Missing Infrastructure for AI Agent Evaluation [87.47155146067962]
We provide a standardized evaluation harness that orchestrates parallel evaluations across hundreds of tasks. We conduct three-dimensional analysis spanning models, scaffolds, and benchmarks. Our analysis reveals surprising insights, such as higher reasoning effort reducing accuracy in the majority of runs.
arXiv Detail & Related papers (2025-10-13T22:22:28Z)
OAgents: An Empirical Study of Building Effective Agents [46.50371876218872]
We study the impact of popular design choices in key agent components in a fair and rigorous manner. Based on our findings, we build and open-source OAgents, a new foundation agent framework.
arXiv Detail & Related papers (2025-06-17T17:59:02Z)
AgentSwift: Efficient LLM Agent Design via Value-guided Hierarchical Search [58.98450205734779]
Large language model (LLM) agents have demonstrated strong capabilities across diverse domains. Existing agent search methods suffer from three major limitations. We introduce a comprehensive framework to address these challenges.
arXiv Detail & Related papers (2025-06-06T12:07:23Z)
Towards Adaptive Software Agents for Debugging [0.40964539027092917]
We propose an adaptive agentic design, where the number of agents and their roles are determined dynamically. Our initial evaluation shows that, with the adaptive design, the number of agents that are generated depends on the complexity of the buggy code. Regarding the effectiveness of the fix, we noticed an average improvement of 11% compared to one-shot prompting.
arXiv Detail & Related papers (2025-04-25T12:48:08Z)
Agent-E: From Autonomous Web Navigation to Foundational Design Principles in Agentic Systems [1.079505444748609]
We present our work on building a novel web agent, Agent-E. Agent-E introduces numerous architectural improvements over prior state-of-the-art web agents. We show that Agent-E beats other SOTA text and multi-modal web agents on this benchmark in most categories by 10-30%.
arXiv Detail & Related papers (2024-07-17T21:44:28Z)
AgentGym: Evolving Large Language Model-based Agents across Diverse Environments [116.97648507802926]
Large language models (LLMs) are considered a promising foundation to build such agents. We take the first step towards building generally-capable LLM-based agents with self-evolution ability. We propose AgentGym, a new framework featuring a variety of environments and tasks for broad, real-time, uni-format, and concurrent agent exploration.
arXiv Detail & Related papers (2024-06-06T15:15:41Z)
Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models [56.00992369295851]
Open-sourced Large Language Models (LLMs) have achieved great success in various NLP tasks; however, they are still far inferior to API-based models when acting as agents. This paper delivers three key observations: (1) the current agent training corpus is entangled with both formats following and agent reasoning, which significantly shifts from the distribution of its pre-training data; (2) LLMs exhibit different learning speeds on the capabilities required by agent tasks; and (3) current approaches have side-effects when improving agent abilities by introducing hallucinations. We propose Agent-FLAN to effectively Fine-tune LANguage models for Agents.
arXiv Detail & Related papers (2024-03-19T16:26:10Z)
An Extensible Framework for Open Heterogeneous Collaborative Perception [58.70875361688463]
Collaborative perception aims to mitigate the limitations of single-agent perception. In this paper, we introduce a new open heterogeneous problem: how to accommodate continually emerging new heterogeneous agent types into collaborative perception. We propose HEterogeneous ALliance (HEAL), a novel collaborative perception framework.
arXiv Detail & Related papers (2024-01-25T05:55:03Z)

This list is automatically generated from the titles and abstracts of the papers on this site.