StackPlanner: A Centralized Hierarchical Multi-Agent System with Task-Experience Memory Management
- URL: http://arxiv.org/abs/2601.05890v1
- Date: Fri, 09 Jan 2026 16:09:48 GMT
- Title: StackPlanner: A Centralized Hierarchical Multi-Agent System with Task-Experience Memory Management
- Authors: Ruizhe Zhang, Xinke Jiang, Zhibang Yang, Zhixin Zhang, Jiaran Gao, Yuzhen Xiao, Hongbin Lai, Xu Chu, Junfeng Zhao, Yasha Wang
- Abstract summary: Central agents often suffer from unstable long-horizon collaboration due to the lack of memory management. We propose StackPlanner, a hierarchical multi-agent framework with explicit memory control. Experiments on multiple deep-search and agent system benchmarks demonstrate the effectiveness of our approach.
- Score: 25.50119360269554
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-agent systems based on large language models, particularly centralized architectures, have recently shown strong potential for complex and knowledge-intensive tasks. However, central agents often suffer from unstable long-horizon collaboration due to the lack of memory management, leading to context bloat, error accumulation, and poor cross-task generalization. To address both task-level memory inefficiency and the inability to reuse coordination experience, we propose StackPlanner, a hierarchical multi-agent framework with explicit memory control. StackPlanner addresses these challenges by decoupling high-level coordination from subtask execution with active task-level memory control, and by learning to retrieve and exploit reusable coordination experience via structured experience memory and reinforcement learning. Experiments on multiple deep-search and agent system benchmarks demonstrate the effectiveness of our approach in enabling reliable long-horizon multi-agent collaboration.
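The abstract names two kinds of memory: task-level memory that is actively controlled during a run, and an experience memory that is reused across tasks. The abstract does not specify how either is implemented, so the following is only a minimal sketch of the general idea, not the authors' method. The names `TaskFrame`, `TaskStack`, and `ExperienceMemory` are hypothetical, the stack discipline is an assumption suggested by the paper's title, and plain word overlap stands in for the learned, reinforcement-learning-based retrieval the abstract mentions.

```python
from dataclasses import dataclass, field


@dataclass
class TaskFrame:
    """Working memory for one subtask (hypothetical structure)."""
    goal: str
    notes: list = field(default_factory=list)


class TaskStack:
    """Task-level memory: one frame per active subtask (assumed design)."""

    def __init__(self):
        self.frames = []

    def push(self, goal):
        # Open a fresh frame when the coordinator delegates a subtask.
        self.frames.append(TaskFrame(goal))

    def note(self, text):
        # Record an intermediate observation in the current frame.
        self.frames[-1].notes.append(text)

    def pop(self, summary):
        # Close the subtask: discard its detailed notes and pass only
        # a short summary up to the parent frame, limiting context growth.
        frame = self.frames.pop()
        if self.frames:
            self.frames[-1].notes.append(f"{frame.goal}: {summary}")
        return frame

    def context(self):
        # What the coordinator would see: notes of frames still open.
        return [n for f in self.frames for n in f.notes]


class ExperienceMemory:
    """Cross-task store of (task description, lesson) pairs."""

    def __init__(self):
        self.entries = []

    def add(self, task, lesson):
        self.entries.append((task, lesson))

    def retrieve(self, task, k=1):
        # Word overlap is a stand-in for a learned retriever.
        query = set(task.lower().split())
        ranked = sorted(
            self.entries,
            key=lambda e: len(query & set(e[0].lower().split())),
            reverse=True,
        )
        return [lesson for _, lesson in ranked[:k]]
```

In this sketch, popping a frame is the step that limits context growth: the coordinator keeps one summary line per finished subtask instead of every intermediate observation.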
Related papers
- IntentCUA: Learning Intent-level Representations for Skill Abstraction and Multi-Agent Planning in Computer-Use Agents [4.655926959889001]
We present IntentCUA, a computer-use framework designed to stabilize long-horizon execution through intent-aligned plan memory. Intent prototypes retrieve subgroup-aligned skills and inject them into partial plans, reducing redundant re-planning. IntentCUA achieved a 74.83% task success rate with a Step Efficiency Ratio of 0.91, outperforming RL-based and trajectory-centric baselines.
arXiv Detail & Related papers (2026-02-19T03:42:15Z)
- Learning to Share: Selective Memory for Efficient Parallel Agentic Systems [49.78267008828593]
Agentic systems solve complex tasks by coordinating multiple agents that iteratively reason, invoke tools, and exchange intermediate results. Recent approaches deploy multiple agent teams running in parallel to explore diverse reasoning trajectories. We propose Learning to Share (LTS), a learned shared-memory mechanism for parallel agentic frameworks.
arXiv Detail & Related papers (2026-02-05T18:20:21Z)
- AMA: Adaptive Memory via Multi-Agent Collaboration [54.490349689939166]
We propose Adaptive Memory via Multi-Agent Collaboration (AMA), a novel framework that leverages coordinated agents to manage memory across multiple granularities. AMA significantly outperforms state-of-the-art baselines while reducing token consumption by approximately 80% compared to full-context methods.
arXiv Detail & Related papers (2026-01-28T08:09:49Z)
- TALM: Dynamic Tree-Structured Multi-Agent Framework with Long-Term Memory for Scalable Code Generation [0.0]
Agentic code generation requires large language models capable of complex context management and multi-step reasoning. We propose TALM, a dynamic framework that integrates structured task decomposition, localized re-reasoning, and long-term memory mechanisms. Experimental results on HumanEval, BigCodeBench, and ClassEval benchmarks demonstrate that TALM consistently delivers strong reasoning performance and high token efficiency.
arXiv Detail & Related papers (2025-10-27T05:07:36Z)
- ReasoningBank: Scaling Agent Self-Evolving with Reasoning Memory [57.517214479414726]
ReasoningBank is a memory framework that distills generalizable reasoning strategies from an agent's self-judged successful and failed experiences. At test time, an agent retrieves relevant memories from ReasoningBank to inform its interaction and then integrates new learnings back, enabling it to become more capable over time. We introduce memory-aware test-time scaling (MaTTS), which accelerates and diversifies this learning process by scaling up the agent's interaction experience.
arXiv Detail & Related papers (2025-09-29T17:51:03Z)
- H$^2$R: Hierarchical Hindsight Reflection for Multi-Task LLM Agents [3.9054156855794973]
Large language model (LLM)-based agents have shown strong potential in multi-task scenarios. Existing approaches often treat prior experiences and knowledge as monolithic units, leading to inefficient and coarse-grained knowledge transfer. We propose a novel hierarchical memory architecture that enables fine-grained knowledge transfer.
arXiv Detail & Related papers (2025-09-16T08:30:08Z)
- GoalfyMax: A Protocol-Driven Multi-Agent System for Intelligent Experience Entities [8.508965426627887]
We present GoalfyMax, a protocol-driven framework for end-to-end multi-agent collaboration. GoalfyMax introduces a standardized Agent-to-Agent (A2A) communication layer built on the Model Context Protocol (MCP). It incorporates the Experience Pack (XP) architecture, a layered memory system that preserves both task rationales and execution traces.
arXiv Detail & Related papers (2025-07-13T05:13:52Z)
- HiRA: A Hierarchical Reasoning Framework for Decoupled Planning and Execution in Deep Search [85.12447821237045]
HiRA is a hierarchical framework that separates strategic planning from specialized execution. Our approach decomposes complex search tasks into focused subtasks and assigns each subtask to domain-specific agents equipped with external tools and reasoning capabilities. Experiments on four complex, cross-modal deep search benchmarks demonstrate that HiRA significantly outperforms state-of-the-art RAG and agent-based systems.
arXiv Detail & Related papers (2025-07-03T14:18:08Z)
- FindingDory: A Benchmark to Evaluate Memory in Embodied Agents [49.18498389833308]
We introduce a new benchmark for long-range embodied tasks in the Habitat simulator. This benchmark evaluates memory-based capabilities across 60 tasks requiring sustained engagement and contextual awareness.
arXiv Detail & Related papers (2025-06-18T17:06:28Z)
- Variational Offline Multi-agent Skill Discovery [47.924414207796005]
We propose two novel auto-encoder schemes to simultaneously capture subgroup- and temporal-level abstractions and form multi-agent skills. Our method can be applied to offline multi-task data, and the discovered subgroup skills can be transferred across relevant tasks without retraining. Empirical evaluations on StarCraft tasks indicate that our approach significantly outperforms existing hierarchical multi-agent reinforcement learning (MARL) methods.
arXiv Detail & Related papers (2024-05-26T00:24:46Z)
- Learning Task Decomposition with Ordered Memory Policy Network [73.3813423684999]
We propose Ordered Memory Policy Network (OMPN) to discover subtask hierarchy by learning from demonstration.
OMPN can be applied to partially observable environments and still achieve higher task decomposition performance.
Our visualization confirms that the subtask hierarchy can emerge in our model.
arXiv Detail & Related papers (2021-03-19T18:13:35Z)
- Learning Functionally Decomposed Hierarchies for Continuous Control Tasks with Path Planning [36.050432925402845]
We present HiDe, a novel hierarchical reinforcement learning architecture that successfully solves long horizon control tasks.
We experimentally show that our method generalizes across unseen test environments and can scale to 3x horizon length compared to both learning and non-learning based methods.
arXiv Detail & Related papers (2020-02-14T10:19:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.