Related papers: Do We Always Need Query-Level Workflows? Rethinking Agentic Workflow Generation for Multi-Agent Systems

Do We Always Need Query-Level Workflows? Rethinking Agentic Workflow Generation for Multi-Agent Systems

URL: http://arxiv.org/abs/2601.11147v1
Date: Fri, 16 Jan 2026 10:05:51 GMT
Title: Do We Always Need Query-Level Workflows? Rethinking Agentic Workflow Generation for Multi-Agent Systems
Authors: Zixu Wang, Bingbing Xu, Yige Yuan, Huawei Shen, Xueqi Cheng,
Abstract summary: Multi-Agent Systems (MAS) solve complex tasks by coordinating multiple agents through.<n>Existing approaches generates either at task level or query level, but their relative costs and benefits remain unclear.<n>We show that query-level workflow generation is not always necessary, since a small set of top-K best task-level together already covers equivalent or even more queries.
Score: 72.3575737073235
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Multi-Agent Systems (MAS) built on large language models typically solve complex tasks by coordinating multiple agents through workflows. Existing approaches generates workflows either at task level or query level, but their relative costs and benefits remain unclear. After rethinking and empirical analyses, we show that query-level workflow generation is not always necessary, since a small set of top-K best task-level workflows together already covers equivalent or even more queries. We further find that exhaustive execution-based task-level evaluation is both extremely token-costly and frequently unreliable. Inspired by the idea of self-evolution and generative reward modeling, we propose a low-cost task-level generation framework \textbf{SCALE}, which means \underline{\textbf{S}}elf prediction of the optimizer with few shot \underline{\textbf{CAL}}ibration for \underline{\textbf{E}}valuation instead of full validation execution. Extensive experiments demonstrate that \textbf{SCALE} maintains competitive performance, with an average degradation of just 0.61\% compared to existing approach across multiple datasets, while cutting overall token usage by up to 83\%.

Related papers

Learning to Compose for Cross-domain Agentic Workflow Generation [56.630382886594184]
We create an open-source LLM for cross-domain workflow generation.<n>We learn a compact set of reusable workflow capabilities across diverse domains.<n>Our 1-pass generator surpasses SOTA refinement baselines that consume 20 iterations.
arXiv Detail & Related papers (2026-02-11T18:27:22Z)
Rethinking the Value of Multi-Agent Workflow: A Strong Single Agent Baseline [38.16649115214312]
We show that a single agent can reach the performance of homogeneous with an efficiency advantage from KV cache reuse.<n>We propose an algorithm that automatically tailors for single-agent execution, reducing inference costs.
arXiv Detail & Related papers (2026-01-18T08:16:09Z)
CoDA: Agentic Systems for Collaborative Data Visualization [57.270599188947294]
Deep research has revolutionized data analysis, yet data scientists still devote substantial time to manually crafting visualizations.<n>Existing approaches, including simple single- or multi-agent systems, often oversimplify the task.<n>We introduce CoDA, a multi-agent system that employs specialized LLM agents for metadata analysis, task planning, code generation, and self-reflection.
arXiv Detail & Related papers (2025-10-03T17:30:16Z)
Visual Document Understanding and Question Answering: A Multi-Agent Collaboration Framework with Test-Time Scaling [83.78874399606379]
We propose MACT, a Multi-Agent Collaboration framework with Test-Time scaling.<n>It comprises four distinct small-scale agents, with clearly defined roles and effective collaboration.<n>It shows superior performance with a smaller parameter scale without sacrificing the ability of general and mathematical tasks.
arXiv Detail & Related papers (2025-08-05T12:52:09Z)
Agentic Predictor: Performance Prediction for Agentic Workflows via Multi-View Encoding [56.565200973244146]
Agentic Predictor is a lightweight predictor for efficient agentic workflow evaluation.<n>By learning to approximate task success rates, Agentic Predictor enables fast and accurate selection of optimal agentic workflow configurations.
arXiv Detail & Related papers (2025-05-26T09:46:50Z)
Cognify: Supercharging Gen-AI Workflows With Hierarchical Autotuning [6.328780056857816]
gen-AI that involve multiple ML model calls, tool/API calls, data retrieval, or generic code execution are often tuned manually in an ad-hoc way.<n>AdaSeek organizes workflow tuning methods into different layers based on the user-specified total search budget.<n>Cognify improves these workflow's generation quality by up to 2.8x, reduces execution monetary cost by up to 10x, and reduces end-to-end latency by 2.7x.
arXiv Detail & Related papers (2025-02-12T01:36:27Z)
EvoFlow: Evolving Diverse Agentic Workflows On The Fly [21.82515160298748]
EvoFlow is a niching evolutionary algorithm-based framework to automatically search a population of complexity and heterogeneous agentic.<n>We show that EvoFlow can evolve a population ranging from simple I/O tasks to complex multi-turn interactions.
arXiv Detail & Related papers (2025-02-11T08:48:46Z)
AFlow: Automating Agentic Workflow Generation [36.61172223528231]
Large language models (LLMs) have demonstrated remarkable potential in solving complex tasks across diverse domains.<n>We introduce AFlow, an automated framework that efficiently explores this space using Monte Carlo Tree Search.<n> Empirical evaluations across six benchmark datasets demonstrate AFlow's efficacy, yielding a 5.7% average improvement over state-of-the-art baselines.
arXiv Detail & Related papers (2024-10-14T17:40:40Z)
Benchmarking Agentic Workflow Generation [80.74757493266057]
We introduce WorfBench, a unified workflow generation benchmark with multi-faceted scenarios and intricate graph workflow structures.<n>We also present WorfEval, a systemic evaluation protocol utilizing subsequence and subgraph matching algorithms.<n>We observe that the generated can enhance downstream tasks, enabling them to achieve superior performance with less time during inference.
arXiv Detail & Related papers (2024-10-10T12:41:19Z)
Dynamic Multi-Robot Task Allocation under Uncertainty and Temporal Constraints [52.58352707495122]
We present a multi-robot allocation algorithm that decouples the key computational challenges of sequential decision-making under uncertainty and multi-agent coordination. We validate our results over a wide range of simulations on two distinct domains: multi-arm conveyor belt pick-and-place and multi-drone delivery dispatch in a city.
arXiv Detail & Related papers (2020-05-27T01:10:41Z)

This list is automatically generated from the titles and abstracts of the papers in this site.