Related papers: GNNs as Predictors of Agentic Workflow Performances

URL: http://arxiv.org/abs/2503.11301v1
Date: Fri, 14 Mar 2025 11:11:00 GMT
Title: GNNs as Predictors of Agentic Workflow Performances
Authors: Yuanshuo Zhang, Yuchen Hou, Bohan Tang, Shuo Chen, Muhan Zhang, Xiaowen Dong, Siheng Chen
Abstract summary: Agentic workflows invoked by Large Language Models (LLMs) have achieved remarkable success in handling complex tasks. This paper formulates agentic workflows as computational graphs and advocates Graph Neural Networks (GNNs) as efficient predictors of agentic workflow performances. We construct FLORA-Bench, a unified platform for benchmarking GNNs for predicting agentic workflow performances.
Score: 48.34485750450876
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Agentic workflows invoked by Large Language Models (LLMs) have achieved remarkable success in handling complex tasks. However, optimizing such workflows is costly and inefficient in real-world applications due to extensive invocations of LLMs. To fill this gap, this position paper formulates agentic workflows as computational graphs and advocates Graph Neural Networks (GNNs) as efficient predictors of agentic workflow performances, avoiding repeated LLM invocations for evaluation. To empirically ground this position, we construct FLORA-Bench, a unified platform for benchmarking GNNs for predicting agentic workflow performances. With extensive experiments, we arrive at the following conclusion: GNNs are simple yet effective predictors. This conclusion supports new applications of GNNs and a novel direction towards automating agentic workflow optimization. All codes, models, and data are available at https://github.com/youngsoul0731/Flora-Bench.
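
The idea in the abstract, treating a workflow as a graph and scoring it without invoking an LLM, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper or from FLORA-Bench: the three-node workflow, the two-dimensional node features, the mean-aggregation update, and the hand-picked readout weights are all hypothetical. A real predictor would learn its weights from workflows with known outcomes.

```python
import math

# Hypothetical toy workflow: each node is an agent step with a made-up
# 2-dimensional feature vector; each edge passes output downstream.
nodes = {
    "plan": [1.0, 0.0],
    "code": [0.0, 1.0],
    "review": [0.5, 0.5],
}
edges = [("plan", "code"), ("code", "review")]


def message_pass(features, edges):
    """One round of mean aggregation over a node and its predecessors."""
    updated = {}
    for node, own in features.items():
        incoming = [features[src] for src, dst in edges if dst == node]
        stacked = [own] + incoming
        updated[node] = [sum(col) / len(stacked) for col in zip(*stacked)]
    return updated


def predict_success(features, edges, weights, bias=0.0):
    """Mean-pool the updated node embeddings, then apply a logistic readout."""
    hidden = message_pass(features, edges)
    pooled = [sum(col) / len(hidden) for col in zip(*hidden.values())]
    logit = sum(w * x for w, x in zip(weights, pooled)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


# Untrained, made-up weights used only to show the data flow.
score = predict_success(nodes, edges, weights=[0.8, -0.3])
print(f"predicted success probability: {score:.3f}")
```

The point of the sketch is the cost profile: scoring a candidate workflow takes a few arithmetic operations on its graph rather than a full run of LLM calls, which is what makes a graph-based predictor attractive inside an optimization loop.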

Related papers

ComfyGPT: A Self-Optimizing Multi-Agent System for Comprehensive ComfyUI Workflow Generation [71.31634636156384]
We introduce ComfyGPT, the first self-optimizing multi-agent system designed to automatically generate ComfyUI workflows from task descriptions. ComfyGPT comprises four specialized agents: ReformatAgent, FlowAgent, RefineAgent, and ExecuteAgent. FlowDataset is a large-scale dataset containing 13,571 workflow-description pairs, and FlowBench is a benchmark for evaluating workflow generation systems.
arXiv Detail & Related papers (2025-03-22T06:48:50Z)
Flow: Modularized Agentic Workflow Automation [53.073598156915615]
Multi-agent frameworks powered by large language models (LLMs) have demonstrated great success in automated planning and task execution. However, the effective adjustment of agentic workflows during execution has not been well studied. In this paper, we define an activity-on-vertex (AOV) graph, which allows continuous workflow refinement by agents. Our proposed multi-agent framework achieves efficient concurrent execution of subtasks, effective goal achievement, and enhanced error tolerance.
arXiv Detail & Related papers (2025-01-14T04:35:37Z)
WorkflowLLM: Enhancing Workflow Orchestration Capability of Large Language Models [105.46456444315693]
We present WorkflowLLM, a data-centric framework to enhance the capability of large language models in workflow orchestration. It first constructs a large-scale fine-tuning dataset, WorkflowBench, with 106,763 samples, covering 1,503 APIs from 83 applications across 28 categories. WorkflowLlama demonstrates a strong capacity to orchestrate complex APIs, while also achieving notable generalization performance.
arXiv Detail & Related papers (2024-11-08T09:58:02Z)
AFlow: Automating Agentic Workflow Generation [36.61172223528231]
Large language models (LLMs) have demonstrated remarkable potential in solving complex tasks across diverse domains. We introduce AFlow, an automated framework that efficiently explores this space using Monte Carlo Tree Search. Empirical evaluations across six benchmark datasets demonstrate AFlow's efficacy, yielding a 5.7% average improvement over state-of-the-art baselines.
arXiv Detail & Related papers (2024-10-14T17:40:40Z)
Benchmarking Agentic Workflow Generation [80.74757493266057]
We introduce WorfBench, a unified workflow generation benchmark with multi-faceted scenarios and intricate graph workflow structures. We also present WorfEval, a systemic evaluation protocol utilizing subsequence and subgraph matching algorithms. We observe that the generated workflows can enhance downstream tasks, enabling them to achieve superior performance with less time during inference.
arXiv Detail & Related papers (2024-10-10T12:41:19Z)
All Against Some: Efficient Integration of Large Language Models for Message Passing in Graph Neural Networks [51.19110891434727]
Large Language Models (LLMs) with pretrained knowledge and powerful semantic comprehension abilities have recently shown a remarkable ability to benefit applications using vision and text data. E-LLaGNN is a framework with an on-demand LLM service that enriches the message passing procedure of graph learning by enhancing a limited fraction of nodes from the graph.
arXiv Detail & Related papers (2024-07-20T22:09:42Z)
LOGIN: A Large Language Model Consulted Graph Neural Network Training Framework [30.54068909225463]
We aim to streamline the GNN design process and leverage the advantages of Large Language Models (LLMs) to improve the performance of GNNs on downstream tasks. We formulate a new paradigm, coined "LLMs-as-Consultants," which integrates LLMs with GNNs in an interactive manner. We empirically evaluate the effectiveness of LOGIN on node classification tasks across both homophilic and heterophilic graphs.
arXiv Detail & Related papers (2024-05-22T18:17:20Z)
FlowMind: Automatic Workflow Generation with LLMs [12.848562107014093]
This paper introduces a novel approach, FlowMind, leveraging the capabilities of Large Language Models (LLMs). We propose a generic prompt recipe for a lecture that helps ground LLM reasoning with reliable Application Programming Interfaces (APIs). We also introduce NCEN-QA, a new dataset in finance for benchmarking question-answering tasks from N-CEN reports on funds.
arXiv Detail & Related papers (2024-03-17T00:36:37Z)
TEP-GNN: Accurate Execution Time Prediction of Functional Tests using Graph Neural Networks [5.899031548148629]
We propose a predictive model, dubbed TEP-GNN, which demonstrates that high-accuracy performance prediction is possible. TEP-GNN uses FA-ASTs, or flow-augmented ASTs, as a graph-based code representation approach. We evaluate TEP-GNN using four real-life Java open source programs, based on 922 test files mined from the projects' public repositories.
arXiv Detail & Related papers (2022-08-25T09:08:32Z)

This list is automatically generated from the titles and abstracts of the papers on this site.