Related papers: AFlow: Automating Agentic Workflow Generation

URL: http://arxiv.org/abs/2410.10762v1
Date: Mon, 14 Oct 2024 17:40:40 GMT
Title: AFlow: Automating Agentic Workflow Generation
Authors: Jiayi Zhang, Jinyu Xiang, Zhaoyang Yu, Fengwei Teng, Xionghui Chen, Jiaqi Chen, Mingchen Zhuge, Xin Cheng, Sirui Hong, Jinlin Wang, Bingnan Zheng, Bang Liu, Yuyu Luo, Chenglin Wu
Abstract summary: Large language models (LLMs) have demonstrated remarkable potential in solving complex tasks across diverse domains. We introduce AFlow, an automated framework that efficiently explores this space using Monte Carlo Tree Search. Empirical evaluations across six benchmark datasets demonstrate AFlow's efficacy, yielding a 5.7% average improvement over state-of-the-art baselines.
Score: 36.61172223528231
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language models (LLMs) have demonstrated remarkable potential in solving complex tasks across diverse domains, typically by employing agentic workflows that follow detailed instructions and operational sequences. However, constructing these workflows requires significant human effort, limiting scalability and generalizability. Recent research has sought to automate the generation and optimization of these workflows, but existing methods still rely on initial manual setup and fall short of achieving fully automated and effective workflow generation. To address this challenge, we reformulate workflow optimization as a search problem over code-represented workflows, where LLM-invoking nodes are connected by edges. We introduce AFlow, an automated framework that efficiently explores this space using Monte Carlo Tree Search, iteratively refining workflows through code modification, tree-structured experience, and execution feedback. Empirical evaluations across six benchmark datasets demonstrate AFlow's efficacy, yielding a 5.7% average improvement over state-of-the-art baselines. Furthermore, AFlow enables smaller models to outperform GPT-4o on specific tasks at 4.55% of its inference cost in dollars. The code will be available at https://github.com/geekan/MetaGPT.
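The search formulation in the abstract can be illustrated with a toy example. The sketch below is not the paper's algorithm or code: it is a simplified tree search with UCB-style selection, where a workflow is a tuple of hypothetical operator names, `modify` is a random edit standing in for LLM-driven code modification, and `evaluate` is a made-up scoring function standing in for execution feedback on a benchmark. All names and parameters here are assumptions for illustration.

```python
import math
import random
from dataclasses import dataclass, field

# Hypothetical node types; a real system would use LLM-invoking code nodes.
OPERATORS = ["generate", "review", "revise", "ensemble", "test"]


@dataclass
class Node:
    workflow: tuple              # operator sequence standing in for workflow code
    score: float                 # execution feedback for this workflow
    parent: "Node | None" = None
    visits: int = 1
    children: list = field(default_factory=list)


def evaluate(workflow):
    """Toy stand-in for running the workflow on a validation set."""
    target = ("generate", "review", "revise", "test")
    matches = sum(a == b for a, b in zip(workflow, target))
    return matches / max(len(workflow), len(target))


def modify(workflow, rng):
    """Toy stand-in for an LLM edit: insert, delete, or replace one step."""
    steps = list(workflow)
    action = rng.choice(["insert", "delete", "replace"])
    if action == "insert" or not steps:
        steps.insert(rng.randrange(len(steps) + 1), rng.choice(OPERATORS))
    elif action == "delete" and len(steps) > 1:
        steps.pop(rng.randrange(len(steps)))
    else:
        steps[rng.randrange(len(steps))] = rng.choice(OPERATORS)
    return tuple(steps)


def select(nodes, total_visits, c=0.5):
    """UCB1-style choice: prefer high scores, but keep exploring rarely visited nodes."""
    return max(
        nodes,
        key=lambda n: n.score + c * math.sqrt(math.log(total_visits + 1) / n.visits),
    )


def search(rounds=200, seed=0):
    rng = random.Random(seed)
    start = ("generate",)
    root = Node(start, evaluate(start))
    nodes = [root]
    for step in range(rounds):
        parent = select(nodes, step + 1)                 # selection
        child_workflow = modify(parent.workflow, rng)    # expansion
        child = Node(child_workflow, evaluate(child_workflow), parent)  # evaluation
        parent.children.append(child)
        node = parent
        while node is not None:                          # backpropagate visit counts
            node.visits += 1
            node = node.parent
        nodes.append(child)
    return max(nodes, key=lambda n: n.score)


if __name__ == "__main__":
    best = search()
    print(best.workflow, round(best.score, 3))
```

Keeping every evaluated workflow as a node with a parent pointer is what makes the experience tree-structured: a later round can branch from any earlier workflow rather than only from the current best.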

Related papers

ComfyGPT: A Self-Optimizing Multi-Agent System for Comprehensive ComfyUI Workflow Generation [71.31634636156384]
We introduce ComfyGPT, the first self-optimizing multi-agent system designed to automatically generate ComfyUI workflows from task descriptions. ComfyGPT comprises four specialized agents: ReformatAgent, FlowAgent, RefineAgent, and ExecuteAgent. FlowDataset is a large-scale dataset containing 13,571 workflow-description pairs, and FlowBench is a benchmark for evaluating workflow generation systems.
arXiv Detail & Related papers (2025-03-22T06:48:50Z)
GNNs as Predictors of Agentic Workflow Performances [48.34485750450876]
Agentic workflows invoked by Large Language Models (LLMs) have achieved remarkable success in handling complex tasks. This paper formulates agentic workflows as computational graphs and advocates Graph Neural Networks (GNNs) as efficient predictors of agentic workflow performance. We construct FLORA-Bench, a unified platform for benchmarking GNNs for predicting agentic workflow performance.
arXiv Detail & Related papers (2025-03-14T11:11:00Z)
Cognify: Supercharging Gen-AI Workflows With Hierarchical Autotuning [6.328780056857816]
Gen-AI workflows that involve multiple ML model calls, tool/API calls, data retrieval, or generic code execution are often tuned manually in an ad-hoc way. AdaSeek organizes workflow tuning methods into different layers based on the user-specified total search budget. Cognify improves these workflows' generation quality by up to 2.8x, reduces execution monetary cost by up to 10x, and reduces end-to-end latency by 2.7x.
arXiv Detail & Related papers (2025-02-12T01:36:27Z)
ScoreFlow: Mastering LLM Agent Workflows via Score-based Preference Optimization [51.280919773837645]
We develop ScoreFlow, a high-performance framework for agent workflow optimization. ScoreFlow incorporates Score-DPO, a novel variant of the direct preference optimization method that accounts for quantitative feedback. It achieves an 8.2% improvement over existing baselines across question answering, coding, and mathematical reasoning.
arXiv Detail & Related papers (2025-02-06T18:47:49Z)
Flow: Modularized Agentic Workflow Automation [53.073598156915615]
Multi-agent frameworks powered by large language models (LLMs) have demonstrated great success in automated planning and task execution. However, the effective adjustment of agentic workflows during execution has not been well studied. In this paper, we define an activity-on-vertex (AOV) graph, which allows continuous workflow refinement by agents. Our proposed multi-agent framework achieves efficient concurrent execution of subtasks, effective goal achievement, and enhanced error tolerance.
arXiv Detail & Related papers (2025-01-14T04:35:37Z)
Opus: A Large Work Model for Complex Workflow Generation [0.0]
Opus is a framework for generating and optimizing tasks tailored to complex Business Process Outsourcing (BPO) use cases. Our approach generates executable workflows from Intention, defined as the alignment of Client Input, Client Output and Process Directed Context.
arXiv Detail & Related papers (2024-11-30T20:00:41Z)
WorkflowLLM: Enhancing Workflow Orchestration Capability of Large Language Models [105.46456444315693]
We present WorkflowLLM, a data-centric framework to enhance the capability of large language models in workflow orchestration. It first constructs a large-scale fine-tuning dataset, WorkflowBench, with 106,763 samples, covering 1,503 APIs from 83 applications across 28 categories. WorkflowLlama demonstrates a strong capacity to orchestrate complex APIs, while also achieving notable generalization performance.
arXiv Detail & Related papers (2024-11-08T09:58:02Z)
Benchmarking Agentic Workflow Generation [80.74757493266057]
We introduce WorFBench, a unified workflow generation benchmark with multi-faceted scenarios and intricate graph workflow structures. We also present WorFEval, a systemic evaluation protocol utilizing subsequence and subgraph matching algorithms. We observe that the generated workflows can enhance downstream tasks, enabling them to achieve superior performance with less time during inference.
arXiv Detail & Related papers (2024-10-10T12:41:19Z)
ComfyGen: Prompt-Adaptive Workflows for Text-to-Image Generation [87.39861573270173]
We introduce the novel task of prompt-adaptive workflow generation, where the goal is to automatically tailor a workflow to each user prompt. We propose two LLM-based approaches to tackle this task: a tuning-based method that learns from user-preference data, and a training-free method that uses the LLM to select existing flows. Our work shows that prompt-dependent flow prediction offers a new pathway to improving text-to-image generation quality, complementing existing research directions in the field.
arXiv Detail & Related papers (2024-10-02T16:43:24Z)
Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows? [73.81908518992161]
We introduce Spider2-V, the first multimodal agent benchmark focusing on professional data science and engineering workflows. Spider2-V features real-world tasks in authentic computer environments and incorporates 20 enterprise-level professional applications. These tasks evaluate the ability of a multimodal agent to perform data-related tasks by writing code and managing the GUI in enterprise data software systems.
arXiv Detail & Related papers (2024-07-15T17:54:37Z)
AutoFlow: Automated Workflow Generation for Large Language Model Agents [39.72700864347576]
Large Language Models (LLMs) have shown significant progress in understanding complex natural language. To make sure LLM Agents follow an effective and reliable procedure to solve the given task, manually designed workflows are usually used. We propose AutoFlow, a framework designed to automatically generate workflows for agents to solve complex tasks.
arXiv Detail & Related papers (2024-07-01T21:05:02Z)
FlowMind: Automatic Workflow Generation with LLMs [12.848562107014093]
This paper introduces a novel approach, FlowMind, leveraging the capabilities of Large Language Models (LLMs) for automatic workflow generation. We propose a generic prompt recipe for a lecture that helps ground LLM reasoning with reliable Application Programming Interfaces (APIs). We also introduce NCEN-QA, a new dataset in finance for benchmarking question-answering tasks from N-CEN reports on funds.
arXiv Detail & Related papers (2024-03-17T00:36:37Z)
Couler: Unified Machine Learning Workflow Optimization in Cloud [6.769259207650922]
Couler is a system designed for unified ML workflow optimization in the cloud. We integrate Large Language Models (LLMs) into workflow generation, and provide a unified programming interface for various workflow engines. Couler has successfully improved the CPU/Memory utilization by more than 15% and the workflow completion rate by around 17%.
arXiv Detail & Related papers (2024-03-12T12:47:32Z)
AutoFlow: Learning a Better Training Set for Optical Flow [62.40293188964933]
AutoFlow is a method to render training data for optical flow. AutoFlow achieves state-of-the-art accuracy in pre-training both PWC-Net and RAFT.
arXiv Detail & Related papers (2021-04-29T17:55:23Z)

This list is automatically generated from the titles and abstracts of the papers in this site.