Related papers: EvoFlow: Evolving Diverse Agentic Workflows On The Fly

URL: http://arxiv.org/abs/2502.07373v1
Date: Tue, 11 Feb 2025 08:48:46 GMT
Title: EvoFlow: Evolving Diverse Agentic Workflows On The Fly
Authors: Guibin Zhang, Kaijie Chen, Guancheng Wan, Heng Chang, Hong Cheng, Kun Wang, Shuyue Hu, Lei Bai
Abstract summary: EvoFlow is a niching evolutionary algorithm-based framework that automatically searches a population of heterogeneous and complexity-adaptive agentic workflows. We show that EvoFlow can evolve a population of workflows ranging from simple I/O tasks to complex multi-turn interactions.
Score: 21.82515160298748
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The past two years have witnessed the evolution of large language model (LLM)-based multi-agent systems from labor-intensive manual design to partial automation (e.g., prompt engineering, communication topology) and eventually to fully automated design. However, existing agentic automation pipelines often lack LLM heterogeneity and focus on single-objective performance optimization, limiting their potential to combine weaker models for more customized and cost-effective solutions. To address this challenge, we propose EvoFlow, a niching evolutionary algorithm-based framework to automatically search a population of heterogeneous and complexity-adaptive agentic workflows, rather than a single homogeneous, complex workflow. Technically, EvoFlow performs (1) tag-based retrieval to extract parent workflows from an agentic population, evolves new workflows through (2) crossover and (3) mutation, and employs (4) niching-based selection to maintain population diversity and quality. Extensive evaluations across seven benchmarks demonstrate that EvoFlow is: (I) diverse, evolving a population of workflows ranging from simple I/O tasks to complex multi-turn interactions; (II) high-performing, outperforming previous handcrafted and automated workflows by 1.23% to 29.86%; (III) economical, surpassing powerful o1-preview at 12.4% of its inference cost using weaker open-source models.
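The four steps named in the abstract can be pictured with a toy loop. This is only a minimal sketch under stated assumptions, not the authors' implementation: the `Workflow` record, the `OPERATORS` vocabulary, Jaccard similarity over tags, single-point crossover, and the crowding-style replacement rule are all hypothetical stand-ins chosen for illustration, since the abstract does not specify how EvoFlow realizes each step.

```python
import random
from dataclasses import dataclass

# Hypothetical operator vocabulary; the abstract does not list the real one.
OPERATORS = ["io", "cot", "debate", "self_refine", "ensemble"]


@dataclass(frozen=True)
class Workflow:
    operators: tuple   # ordered operator names (assumed representation)
    tags: frozenset    # descriptive tags used for retrieval and niching


def jaccard(a, b):
    """Tag similarity in [0, 1]; 1.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def retrieve_parents(population, query_tags, k=2):
    """(1) Tag-based retrieval: the k workflows whose tags best match the query."""
    ranked = sorted(population, key=lambda w: jaccard(w.tags, query_tags), reverse=True)
    return ranked[:k]


def crossover(a, b, rng):
    """(2) Crossover: a non-empty prefix of one parent plus a suffix of the other."""
    cut_a = rng.randint(1, len(a.operators))
    cut_b = rng.randint(0, len(b.operators))
    return Workflow(a.operators[:cut_a] + b.operators[cut_b:], a.tags | b.tags)


def mutate(w, rng, rate=0.3):
    """(3) Mutation: with probability `rate`, swap one operator for a random one."""
    ops = list(w.operators)
    if rng.random() < rate:
        ops[rng.randrange(len(ops))] = rng.choice(OPERATORS)
    return Workflow(tuple(ops), w.tags)


def niching_select(population, child, fitness):
    """(4) Niching-style selection (a crowding variant, assumed here): the child
    competes only with its most similar neighbour, so other niches survive."""
    idx = max(range(len(population)), key=lambda i: jaccard(population[i].tags, child.tags))
    if fitness(child) > fitness(population[idx]):
        population = population[:idx] + [child] + population[idx + 1:]
    return population


def evolve(population, query_tags, fitness, steps=20, seed=0):
    """Run the retrieve -> crossover -> mutate -> select loop on a list of workflows."""
    rng = random.Random(seed)
    for _ in range(steps):
        p1, p2 = retrieve_parents(population, query_tags)
        child = mutate(crossover(p1, p2, rng), rng)
        population = niching_select(population, child, fitness)
    return population
```

In this sketch `evolve` expects a list of at least two workflows and a caller-supplied `fitness` function, for instance one that rewards answer quality and penalizes inference cost. Because a child only ever replaces its nearest neighbour, and only when it scores strictly higher, the population size stays fixed and the best score never decreases.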

Related papers

Polymath: A Self-Optimizing Agent with Dynamic Hierarchical Workflow [6.636150750052998]
Large language models (LLMs) excel at solving complex tasks by executing agentic workflows composed of detailed instructions and structured operations. Many researchers have sought to automate the generation and optimization of these workflows through code-based representations. Existing methods often rely on labeled datasets to train and optimize workflows, making them ineffective and inflexible for solving real-world, dynamic problems.
arXiv Detail & Related papers (2025-08-04T23:50:02Z)
SEW: Self-Evolving Agentic Workflows for Automated Code Generation [24.16770109875788]
We propose Self-Evolving Workflow (SEW), a novel framework that automatically generates and optimises multi-agent workflows. Our SEW can automatically design agentic workflows and optimise them through self-evolution, bringing up to 33% improvement on LiveCodeBench.
arXiv Detail & Related papers (2025-05-24T11:12:14Z)
Turning Conversations into Workflows: A Framework to Extract and Evaluate Dialog Workflows for Service AI Agents [65.36060818857109]
We present a novel framework for extracting and evaluating dialog workflows from historical interactions. Our extraction process consists of two key stages: (1) a retrieval step to select relevant conversations based on key procedural elements, and (2) a structured workflow generation process using question-answer-based chain-of-thought (QA-CoT) prompting.
arXiv Detail & Related papers (2025-02-24T16:55:15Z)
Cognify: Supercharging Gen-AI Workflows With Hierarchical Autotuning [6.328780056857816]
Gen-AI workflows that involve multiple ML model calls, tool/API calls, data retrieval, or generic code execution are often tuned manually in an ad-hoc way. AdaSeek organizes workflow tuning methods into different layers based on the user-specified total search budget. Cognify improves these workflows' generation quality by up to 2.8x, reduces execution monetary cost by up to 10x, and reduces end-to-end latency by 2.7x.
arXiv Detail & Related papers (2025-02-12T01:36:27Z)
Flow: A Modular Approach to Automated Agentic Workflow Generation [53.073598156915615]
Multi-agent frameworks powered by large language models (LLMs) have demonstrated great success in automated planning and task execution. However, the effective adjustment of agentic workflows during execution has not been well-studied.
arXiv Detail & Related papers (2025-01-14T04:35:37Z)
WorkflowLLM: Enhancing Workflow Orchestration Capability of Large Language Models [105.46456444315693]
We present WorkflowLLM, a data-centric framework to enhance the capability of large language models in workflow orchestration. It first constructs a large-scale fine-tuning dataset, WorkflowBench, with 106,763 samples, covering 1,503 APIs from 83 applications across 28 categories. WorkflowLlama demonstrates a strong capacity to orchestrate complex APIs, while also achieving notable generalization performance.
arXiv Detail & Related papers (2024-11-08T09:58:02Z)
AFlow: Automating Agentic Workflow Generation [36.61172223528231]
Large language models (LLMs) have demonstrated remarkable potential in solving complex tasks across diverse domains. We introduce AFlow, an automated framework that efficiently explores this space using Monte Carlo Tree Search. Empirical evaluations across six benchmark datasets demonstrate AFlow's efficacy, yielding a 5.7% average improvement over state-of-the-art baselines.
arXiv Detail & Related papers (2024-10-14T17:40:40Z)
Benchmarking Agentic Workflow Generation [80.74757493266057]
We introduce WorFBench, a unified workflow generation benchmark with multi-faceted scenarios and intricate graph workflow structures. We also present WorFEval, a systemic evaluation protocol utilizing subsequence and subgraph matching algorithms. We observe that the generated workflows can enhance downstream tasks, enabling them to achieve superior performance with less time during inference.
arXiv Detail & Related papers (2024-10-10T12:41:19Z)
ComfyGen: Prompt-Adaptive Workflows for Text-to-Image Generation [87.39861573270173]
We introduce the novel task of prompt-adaptive workflow generation, where the goal is to automatically tailor a workflow to each user prompt. We propose two LLM-based approaches to tackle this task: a tuning-based method that learns from user-preference data, and a training-free method that uses the LLM to select existing flows. Our work shows that prompt-dependent flow prediction offers a new pathway to improving text-to-image generation quality, complementing existing research directions in the field.
arXiv Detail & Related papers (2024-10-02T16:43:24Z)
ComfyBench: Benchmarking LLM-based Agents in ComfyUI for Autonomously Designing Collaborative AI Systems [80.69865295743149]
This work attempts to study using LLM-based agents to design collaborative AI systems autonomously. Based on ComfyBench, we develop ComfyAgent, a framework that empowers agents to autonomously design collaborative AI systems by generating workflows. While ComfyAgent achieves a comparable resolve rate to o1-preview and significantly surpasses other agents on ComfyBench, ComfyAgent has resolved only 15% of creative tasks.
arXiv Detail & Related papers (2024-09-02T17:44:10Z)
AutoFlow: Automated Workflow Generation for Large Language Model Agents [39.72700864347576]
Large Language Models (LLMs) have shown significant progress in understanding complex natural language. To make sure LLM Agents follow an effective and reliable procedure to solve the given task, manually designed workflows are usually used. We propose AutoFlow, a framework designed to automatically generate workflows for agents to solve complex tasks.
arXiv Detail & Related papers (2024-07-01T21:05:02Z)
FlowMind: Automatic Workflow Generation with LLMs [12.848562107014093]
This paper introduces a novel approach, FlowMind, leveraging the capabilities of Large Language Models (LLMs). We propose a generic prompt recipe for a lecture that helps ground LLM reasoning with reliable Application Programming Interfaces (APIs). We also introduce NCEN-QA, a new dataset in finance for benchmarking question-answering tasks from N-CEN reports on funds.
arXiv Detail & Related papers (2024-03-17T00:36:37Z)
Grammar-based evolutionary approach for automated workflow composition with domain-specific operators and ensemble diversity [0.36832029288386137]
This paper introduces EvoFlow, a grammar-based evolutionary approach for automatic workflow composition (AWC). EvoFlow enhances the flexibility in designing workflow structures, empowering practitioners to select algorithms that best fit their specific requirements. Our findings show that EvoFlow's specialised genetic operators and updating mechanism substantially outperform current leading methods.
arXiv Detail & Related papers (2024-02-03T11:29:14Z)
AdaMerging: Adaptive Model Merging for Multi-Task Learning [68.75885518081357]
This paper introduces an innovative technique called Adaptive Model Merging (AdaMerging). It aims to autonomously learn the coefficients for model merging, either in a task-wise or layer-wise manner, without relying on the original training data. Compared to the current state-of-the-art task arithmetic merging scheme, AdaMerging showcases a remarkable 11% improvement in performance.
arXiv Detail & Related papers (2023-10-04T04:26:33Z)

This list is automatically generated from the titles and abstracts of the papers on this site.