Related papers: Learning to Compose for Cross-domain Agentic Workflow Generation

URL: http://arxiv.org/abs/2602.11114v1
Date: Wed, 11 Feb 2026 18:27:22 GMT
Title: Learning to Compose for Cross-domain Agentic Workflow Generation
Authors: Jialiang Wang, Shengxiang Xu, Hanmo Liu, Jiachuan Wang, Yuyu Luo, Shimin Di, Min-Ling Zhang, Lei Chen
Abstract summary: We create an open-source LLM for cross-domain workflow generation. We learn a compact set of reusable workflow capabilities across diverse domains. Our 1-pass generator surpasses SOTA refinement baselines that consume 20 iterations.
Score: 56.630382886594184
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Automatically generating agentic workflows -- executable operator graphs or codes that orchestrate reasoning, verification, and repair -- has become a practical way to solve complex tasks beyond what single-pass LLM generation can reliably handle. Yet what constitutes a good workflow depends heavily on the task distribution and the available operators. Under domain shift, current systems typically rely on iterative workflow refinement to discover a feasible workflow from a large workflow space, incurring high iteration costs and yielding unstable, domain-specific behavior. In response, we internalize a decompose-recompose-decide mechanism into an open-source LLM for cross-domain workflow generation. To decompose, we learn a compact set of reusable workflow capabilities across diverse domains. To recompose, we map each input task to a sparse composition over these bases to generate a task-specific workflow in a single pass. To decide, we attribute the success or failure of workflow generation to counterfactual contributions from learned capabilities, thereby capturing which capabilities actually drive success by their marginal effects. Across stringent multi-domain, cross-domain, and unseen-domain evaluations, our 1-pass generator surpasses SOTA refinement baselines that consume 20 iterations, while substantially reducing generation latency and cost.

Related papers

FlowMind: Execute-Summarize for Structured Workflow Generation from LLM Reasoning [5.153212048436295]
LLMs can solve complex tasks through reasoning and tool use, but accurately translating these solutions into structured workflows remains challenging. We model workflows as sequences of tool use and reformulate the problem as designing a mechanism that can both solve tasks and reliably construct them. We propose an Execute-Summarize (ES) framework that decouples task execution from workflow construction.
arXiv Detail & Related papers (2026-02-12T10:04:42Z)
Rethinking the Value of Multi-Agent Workflow: A Strong Single Agent Baseline [38.16649115214312]
We show that a single agent can reach the performance of homogeneous workflows with an efficiency advantage from KV cache reuse. We propose an algorithm that automatically tailors workflows for single-agent execution, reducing inference costs.
arXiv Detail & Related papers (2026-01-18T08:16:09Z)
Do We Always Need Query-Level Workflows? Rethinking Agentic Workflow Generation for Multi-Agent Systems [72.3575737073235]
Multi-Agent Systems (MAS) solve complex tasks by coordinating multiple agents through workflows. Existing approaches generate workflows at either the task level or the query level, but their relative costs and benefits remain unclear. We show that query-level workflow generation is not always necessary, since a small set of top-K best task-level workflows together already covers equivalent or even more queries.
arXiv Detail & Related papers (2026-01-16T10:05:51Z)
CodeR3: A GenAI-Powered Workflow Repair and Revival Ecosystem [0.5249805590164902]
We present a novel legacy workflow migration system called CodeR3 (Code Repair, Revival and Reuse). We use generative AI to analyze the characteristics of decayed workflows and reproduce them in modern workflow technologies like Snakemake and VisFlow. Our system additionally integrates stepwise workflow analysis, automated service substitution, visualization, and human-in-the-loop validation.
arXiv Detail & Related papers (2025-11-24T01:06:45Z)
DyFlow: Dynamic Workflow Framework for Agentic Reasoning [79.19799197382478]
DyFlow is a dynamic workflow generation framework that adaptively constructs and adjusts reasoning procedures based on task requirements and real-time intermediate feedback. We systematically evaluate DyFlow across diverse domains, including social reasoning, biomedical tasks, mathematical problem solving, and code generation. Results demonstrate that DyFlow significantly outperforms existing baselines, achieving substantial Pass@k improvements and exhibiting robust generalization across diverse domains.
arXiv Detail & Related papers (2025-09-30T10:36:23Z)
Flow: Modularized Agentic Workflow Automation [53.073598156915615]
Multi-agent frameworks powered by large language models (LLMs) have demonstrated great success in automated planning and task execution. However, the effective adjustment of agentic workflows during execution has not been well studied. In this paper, we define an activity-on-vertex (AOV) graph, which allows continuous workflow refinement by agents. Our proposed multi-agent framework achieves efficient concurrent execution of subtasks, effective goal achievement, and enhanced error tolerance.
arXiv Detail & Related papers (2025-01-14T04:35:37Z)
WorkflowLLM: Enhancing Workflow Orchestration Capability of Large Language Models [105.46456444315693]
We present WorkflowLLM, a data-centric framework to enhance the capability of large language models in workflow orchestration. It first constructs a large-scale fine-tuning dataset, WorkflowBench, with 106,763 samples, covering 1,503 APIs from 83 applications across 28 categories. WorkflowLlama demonstrates a strong capacity to orchestrate complex APIs, while also achieving notable generalization performance.
arXiv Detail & Related papers (2024-11-08T09:58:02Z)
Benchmarking Agentic Workflow Generation [80.74757493266057]
We introduce WorfBench, a unified workflow generation benchmark with multi-faceted scenarios and intricate graph workflow structures. We also present WorfEval, a systemic evaluation protocol utilizing subsequence and subgraph matching algorithms. We observe that the generated workflows can enhance downstream tasks, enabling them to achieve superior performance with less time during inference.
arXiv Detail & Related papers (2024-10-10T12:41:19Z)

This list is automatically generated from the titles and abstracts of the papers in this site.