Related papers: WISE-Flow: Workflow-Induced Structured Experience for Self-Evolving Conversational Service Agents

URL: http://arxiv.org/abs/2601.08158v1
Date: Tue, 13 Jan 2026 02:43:41 GMT
Title: WISE-Flow: Workflow-Induced Structured Experience for Self-Evolving Conversational Service Agents
Authors: Yuqing Zhou, Zhuoer Wang, Jie Yuan, Hong Wang, Samson Koelle, Ziwei Zhu, Wei Niu,
Abstract summary: Large language model (LLM)-based agents are widely deployed in user-facing services but remain error-prone in new tasks. We propose WISE-Flow, a workflow-centric framework that converts historical service interactions into reusable procedural experience.
Score: 12.014029662322152
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large language model (LLM)-based agents are widely deployed in user-facing services but remain error-prone in new tasks, tend to repeat the same failure patterns, and show substantial run-to-run variability. Fixing failures via environment-specific training or manual patching is costly and hard to scale. To enable self-evolving agents in user-facing service environments, we propose WISE-Flow, a workflow-centric framework that converts historical service interactions into reusable procedural experience by inducing workflows with prerequisite-augmented action blocks. At deployment, WISE-Flow aligns the agent's execution trajectory to retrieved workflows and performs prerequisite-aware feasibility reasoning to achieve state-grounded next actions. Experiments on ToolSandbox and $τ^2$-bench show consistent improvement across base models.
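
The abstract's idea of prerequisite-augmented action blocks and prerequisite-aware feasibility checking can be illustrated with a minimal sketch. This is not the authors' implementation: the ActionBlock class, the next_feasible_action function, the example workflow, and the state facts are all hypothetical names chosen for illustration, and the paper's LLM-based feasibility reasoning is replaced here by a plain set-inclusion check.

```python
from dataclasses import dataclass


@dataclass
class ActionBlock:
    """Hypothetical workflow step: an action plus the state facts it requires."""
    action: str
    prerequisites: frozenset = frozenset()


def next_feasible_action(workflow, executed, state):
    """Return the first pending action whose prerequisites all hold in `state`.

    `executed` is the list of actions already taken (the trajectory);
    `state` is the set of facts currently true in the environment.
    Returns None when no pending action is feasible.
    """
    done = set(executed)
    for block in workflow:
        if block.action in done:
            continue  # crude trajectory alignment: skip steps already performed
        if block.prerequisites <= state:
            return block.action  # every prerequisite is satisfied
    return None


# Illustrative workflow (assumed, not taken from the paper or its benchmarks).
WORKFLOW = [
    ActionBlock("verify_identity"),
    ActionBlock("lookup_order", frozenset({"identity_verified"})),
    ActionBlock("issue_refund", frozenset({"identity_verified", "order_found"})),
]

first_step = next_feasible_action(WORKFLOW, [], set())
second_step = next_feasible_action(
    WORKFLOW, ["verify_identity"], {"identity_verified"}
)
blocked = next_feasible_action(WORKFLOW, ["verify_identity"], set())
```

In this sketch, a step whose prerequisites are missing from the current state is never proposed, which is the state-grounding behavior the abstract describes at a high level.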

Related papers

MiroFlow: Towards High-Performance and Robust Open-Source Agent Framework for General Deep Research Tasks [95.86122998005612]
MiroFlow is an open-source agent framework for large language models (LLMs). It incorporates an agent graph for flexible orchestration, an optional deep reasoning mode to enhance performance, and robust workflow execution to ensure stable and reproducible performance. It consistently achieves state-of-the-art performance across multiple agent benchmarks, including GAIA, BrowseComp-EN/ZH, HLE, xBench-DeepSearch, and FutureX.
arXiv Detail & Related papers (2026-02-26T09:45:04Z)
FlowMind: Execute-Summarize for Structured Workflow Generation from LLM Reasoning [5.153212048436295]
LLMs can solve complex tasks through reasoning and tool use, but accurately translating these solutions into structured workflows remains challenging. We model workflows as sequences of tool use and reformulate the problem as designing a mechanism that can both solve tasks and reliably construct workflows. We propose an Execute-Summarize (ES) framework that decouples task execution from workflow construction.
arXiv Detail & Related papers (2026-02-12T10:04:42Z)
ToolSelf: Unifying Task Execution and Self-Reconfiguration via Tool-Driven Intrinsic Adaptation [60.25542764389203]
Agentic systems powered by Large Language Models (LLMs) have demonstrated remarkable potential in tackling complex, long-horizon tasks. Existing approaches, relying on manual orchestration or runtime-based patches, often struggle with poor generalization and fragmented optimization. We propose ToolSelf, a novel paradigm enabling tool-driven self-readjustment.
arXiv Detail & Related papers (2026-02-08T09:27:18Z)
FlowSteer: Interactive Agentic Workflow Orchestration via End-to-End Reinforcement Learning [49.369614288007334]
FlowSteer is an end-to-end reinforcement learning framework that takes a lightweight policy model as the agent and an executable canvas environment. We show that FlowSteer significantly outperforms baselines across various tasks.
arXiv Detail & Related papers (2026-02-02T05:30:42Z)
DyFlow: Dynamic Workflow Framework for Agentic Reasoning [79.19799197382478]
DyFlow is a dynamic workflow generation framework that adaptively constructs and adjusts reasoning procedures based on task requirements and real-time intermediate feedback. We systematically evaluate DyFlow across diverse domains, including social reasoning, biomedical tasks, mathematical problem solving, and code generation. Results demonstrate that DyFlow significantly outperforms existing baselines, achieving substantial Pass@k improvements and exhibiting robust generalization across diverse domains.
arXiv Detail & Related papers (2025-09-30T10:36:23Z)
Agent WARPP: Workflow Adherence via Runtime Parallel Personalization [0.0]
Large language models (LLMs) are increasingly applied in task-oriented dialogue (TOD) systems. We present Workflow Adherence via Runtime Parallel Personalization, or WARPP, a training-free, modular framework that combines multi-agent runtime with orchestration. By dynamically pruning conditional branches based on user attributes, the framework reduces reasoning overhead and narrows tool selection at runtime.
arXiv Detail & Related papers (2025-07-23T23:27:49Z)
Flow: Modularized Agentic Workflow Automation [53.073598156915615]
Multi-agent frameworks powered by large language models (LLMs) have demonstrated great success in automated planning and task execution. However, the effective adjustment of agentic workflows during execution has not been well studied. In this paper, we define an activity-on-vertex (AOV) graph, which allows continuous workflow refinement by agents. Our proposed multi-agent framework achieves efficient concurrent execution of subtasks, effective goal achievement, and enhanced error tolerance.
arXiv Detail & Related papers (2025-01-14T04:35:37Z)
Action Engine: Automatic Workflow Generation in FaaS [1.4185188982404757]
Action Engine makes use of tool-augmented large language models (LLMs) at its kernel to interpret human language queries. Action Engine seamlessly manages the data dependency between them, ensuring the developer's query is processed and resolved.
arXiv Detail & Related papers (2024-11-29T05:54:41Z)
Benchmarking Agentic Workflow Generation [80.74757493266057]
We introduce WorfBench, a unified workflow generation benchmark with multi-faceted scenarios and intricate graph workflow structures. We also present WorfEval, a systemic evaluation protocol utilizing subsequence and subgraph matching algorithms. We observe that the generated workflows can enhance downstream tasks, enabling them to achieve superior performance with less time during inference.
arXiv Detail & Related papers (2024-10-10T12:41:19Z)

This list is automatically generated from the titles and abstracts of the papers on this site.