StateFlow: Enhancing LLM Task-Solving through State-Driven Workflows
- URL: http://arxiv.org/abs/2403.11322v3
- Date: Wed, 10 Apr 2024 23:04:48 GMT
- Title: StateFlow: Enhancing LLM Task-Solving through State-Driven Workflows
- Authors: Yiran Wu, Tianwei Yue, Shaokun Zhang, Chi Wang, Qingyun Wu,
- Abstract summary: StateFlow is a novel task-solving paradigm that conceptualizes complex task-solving processes as state machines.
In StateFlow, we distinguish between "process grounding" (via state and state transitions) and "sub-task solving" (through actions within a state)
Our results show that StateFlow significantly enhances LLMs' efficiency.
- Score: 23.241520404152904
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: It is a notable trend to use Large Language Models (LLMs) to tackle complex tasks, e.g., tasks that require a sequence of actions and dynamic interaction with tools and external environments. In this paper, we propose StateFlow, a novel LLM-based task-solving paradigm that conceptualizes complex task-solving processes as state machines. In StateFlow, we distinguish between "process grounding" (via state and state transitions) and "sub-task solving" (through actions within a state), enhancing control and interpretability of the task-solving procedure. A state represents the status of a running process. The transitions between states are controlled by heuristic rules or decisions made by the LLM, allowing for a dynamic and adaptive progression. Upon entering a state, a series of actions is executed, involving not only calling LLMs guided by different prompts, but also the utilization of external tools as needed. Our results show that StateFlow significantly enhances LLMs' efficiency. For instance, StateFlow achieves 13% and 28% higher success rates compared to ReAct in InterCode SQL and ALFWorld benchmark, with 5x and 3x less cost respectively. We also show that StateFlow can be combined with iterative refining methods like Reflexion to further improve performance.
Related papers
- FlowAgent: Achieving Compliance and Flexibility for Workflow Agents [31.088578094151178]
FlowAgent is a novel agent framework designed to maintain both compliance and flexibility.
Building on PDL, we develop a comprehensive framework that empowers LLMs to manage OOW queries effectively.
We present a new evaluation methodology to rigorously assess an LLM agent's ability to handle OOW scenarios.
arXiv Detail & Related papers (2025-02-20T07:59:31Z) - LLM-AutoDiff: Auto-Differentiate Any LLM Workflow [58.56731133392544]
We introduce LLM-AutoDiff: a novel framework for Automatic Prompt Engineering (APE)
LLMs-AutoDiff treats each textual input as a trainable parameter and uses a frozen backward engine to generate feedback-akin to textual gradients.
It consistently outperforms existing textual gradient baselines in both accuracy and training cost.
arXiv Detail & Related papers (2025-01-28T03:18:48Z) - Flow: Modularized Agentic Workflow Automation [53.073598156915615]
Multi-agent frameworks powered by large language models (LLMs) have demonstrated great success in automated planning and task execution.
However, the effective adjustment of agentic during execution has not been well studied.
In this paper, we define an activity-on-vertex (AOV) graph, which allows continuous workflow refinement by agents.
Our proposed multi-agent framework achieves efficient concurrent execution of subtasks, effective goal achievement, and enhanced error tolerance.
arXiv Detail & Related papers (2025-01-14T04:35:37Z) - WorkflowLLM: Enhancing Workflow Orchestration Capability of Large Language Models [105.46456444315693]
We presentLLM, a data-centric framework to enhance the capability of large language models in workflow orchestration.
It first constructs a large-scale fine-tuningBench with 106,763 samples, covering 1,503 APIs from 83 applications across 28 categories.
LlamaLlama demonstrates a strong capacity to orchestrate complex APIs, while also achieving notable generalization performance.
arXiv Detail & Related papers (2024-11-08T09:58:02Z) - AFlow: Automating Agentic Workflow Generation [36.61172223528231]
Large language models (LLMs) have demonstrated remarkable potential in solving complex tasks across diverse domains.
We introduce AFlow, an automated framework that efficiently explores this space using Monte Carlo Tree Search.
Empirical evaluations across six benchmark datasets demonstrate AFlow's efficacy, yielding a 5.7% average improvement over state-of-the-art baselines.
arXiv Detail & Related papers (2024-10-14T17:40:40Z) - Benchmarking Agentic Workflow Generation [80.74757493266057]
We introduce WorFBench, a unified workflow generation benchmark with multi-faceted scenarios and intricate graph workflow structures.
We also present WorFEval, a systemic evaluation protocol utilizing subsequence and subgraph matching algorithms.
We observe that the generated can enhance downstream tasks, enabling them to achieve superior performance with less time during inference.
arXiv Detail & Related papers (2024-10-10T12:41:19Z) - ActionFlow: Equivariant, Accurate, and Efficient Policies with Spatially Symmetric Flow Matching [20.20511152176522]
ActionFlow is a policy class that integrates spatial symmetry inductive biases.
On the representation level, ActionFlow introduces an SE(3) Invariant Transformer architecture.
For action generation, ActionFlow leverages Flow Matching, a state-of-the-art deep generative model.
arXiv Detail & Related papers (2024-09-06T19:30:36Z) - AutoFlow: Automated Workflow Generation for Large Language Model Agents [39.72700864347576]
Large Language Models (LLMs) have shown significant progress in understanding complex natural language.
To make sure LLM Agents follow an effective and reliable procedure to solve the given task, manually designed are usually used.
We propose AutoFlow, a framework designed to automatically generate for agents to solve complex tasks.
arXiv Detail & Related papers (2024-07-01T21:05:02Z) - FlowMind: Automatic Workflow Generation with LLMs [12.848562107014093]
This paper introduces a novel approach, FlowMind, leveraging the capabilities of Large Language Models (LLMs)
We propose a generic prompt recipe for a lecture that helps ground LLM reasoning with reliable Application Programming Interfaces (APIs)
We also introduce NCEN-QA, a new dataset in finance for benchmarking question-answering tasks from N-CEN reports on funds.
arXiv Detail & Related papers (2024-03-17T00:36:37Z) - SCHEMA: State CHangEs MAtter for Procedure Planning in Instructional
Videos [54.01116513202433]
We study the problem of procedure planning in instructional videos, which aims to make a goal-oriented sequence of action steps given partial visual state observations.
Recent works succeeded in sequence modeling of steps with only sequence-level annotations accessible during training, which overlooked the roles of states in the procedures.
We aim to establish a more structured state space by investigating the causal relations between steps and states in procedures.
arXiv Detail & Related papers (2024-03-03T19:53:06Z) - TaskBench: Benchmarking Large Language Models for Task Automation [82.2932794189585]
We introduce TaskBench, a framework to evaluate the capability of large language models (LLMs) in task automation.
Specifically, task decomposition, tool selection, and parameter prediction are assessed.
Our approach combines automated construction with rigorous human verification, ensuring high consistency with human evaluation.
arXiv Detail & Related papers (2023-11-30T18:02:44Z) - Plan, Eliminate, and Track -- Language Models are Good Teachers for
Embodied Agents [99.17668730578586]
Pre-trained large language models (LLMs) capture procedural knowledge about the world.
Plan, Eliminate, and Track (PET) framework translates a task description into a list of high-level sub-tasks.
PET framework leads to a significant 15% improvement over SOTA for generalization to human goal specifications.
arXiv Detail & Related papers (2023-05-03T20:11:22Z) - Learning Robust State Abstractions for Hidden-Parameter Block MDPs [55.31018404591743]
We leverage ideas of common structure from the HiP-MDP setting to enable robust state abstractions inspired by Block MDPs.
We derive instantiations of this new framework for both multi-task reinforcement learning (MTRL) and meta-reinforcement learning (Meta-RL) settings.
arXiv Detail & Related papers (2020-07-14T17:25:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.