LSTM-MAS: A Long Short-Term Memory Inspired Multi-Agent System for Long-Context Understanding
- URL: http://arxiv.org/abs/2601.11913v1
- Date: Sat, 17 Jan 2026 05:16:23 GMT
- Title: LSTM-MAS: A Long Short-Term Memory Inspired Multi-Agent System for Long-Context Understanding
- Authors: Yichen Jiang, Peng Ye, Jiakang Yuan, Chongjun Tu, Lei Bai, Tao Chen,
- Abstract summary: Long language models (LLMs) are difficult to process because of the accumulation of errors and the propagation of hallucinations.<n>We design a Multi-Agent System called LSTM-MAS, emulating LSTM's hierarchical information flow and gated memory mechanisms for long-context understanding.<n>Our model achieves improvements of 40.93%, 43.70%,121.57% and 33.12%, on NarrativeQA, Qasper, HotpotQA, and MuSiQue, respectively.
- Score: 24.027208865014064
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Effectively processing long contexts remains a fundamental yet unsolved challenge for large language models (LLMs). Existing single-LLM-based methods primarily reduce the context window or optimize the attention mechanism, but they often encounter additional computational costs or constrained expanded context length. While multi-agent-based frameworks can mitigate these limitations, they remain susceptible to the accumulation of errors and the propagation of hallucinations. In this work, we draw inspiration from the Long Short-Term Memory (LSTM) architecture to design a Multi-Agent System called LSTM-MAS, emulating LSTM's hierarchical information flow and gated memory mechanisms for long-context understanding. Specifically, LSTM-MAS organizes agents in a chained architecture, where each node comprises a worker agent for segment-level comprehension, a filter agent for redundancy reduction, a judge agent for continuous error detection, and a manager agent for globally regulates information propagation and retention, analogous to LSTM and its input gate, forget gate, constant error carousel unit, and output gate. These novel designs enable controlled information transfer and selective long-term dependency modeling across textual segments, which can effectively avoid error accumulation and hallucination propagation. We conducted an extensive evaluation of our method. Compared with the previous best multi-agent approach, CoA, our model achieves improvements of 40.93%, 43.70%,121.57% and 33.12%, on NarrativeQA, Qasper, HotpotQA, and MuSiQue, respectively.
Related papers
- Stacked from One: Multi-Scale Self-Injection for Context Window Extension [69.24689919827817]
modelname is a novel framework based on multi-grained context compression and query-aware information acquisition.<n>modelnameachieves performance superior or comparable to strong baselines.
arXiv Detail & Related papers (2026-03-05T03:16:16Z) - Refer-Agent: A Collaborative Multi-Agent System with Reasoning and Reflection for Referring Video Object Segmentation [50.22481337087162]
Referring Video Object (RVOS) aims to segment objects in videos based on textual queries.<n>Refer-Agent is a collaborative multi-agent system with alternating reasoning-reflection mechanisms.
arXiv Detail & Related papers (2026-02-03T14:48:12Z) - Agentic Memory: Learning Unified Long-Term and Short-Term Memory Management for Large Language Model Agents [57.38404718635204]
Large language model (LLM) agents face fundamental limitations in long-horizon reasoning due to finite context windows.<n>Existing methods typically handle long-term memory (LTM) and short-term memory (STM) as separate components.<n>We propose Agentic Memory (AgeMem), a unified framework that integrates LTM and STM management directly into the agent's policy.
arXiv Detail & Related papers (2026-01-05T08:24:16Z) - QL-LSTM: A Parameter-Efficient LSTM for Stable Long-Sequence Modeling [0.0]
This paper introduces the Quantum-Leap LSTM (QL-LSTM), a recurrent architecture designed to address both challenges through two independent components.<n>We evaluate QL-LSTM on sentiment classification using the IMDB dataset with extended document lengths, comparing it to LSTM, GRU, and BiLSTM reference models.<n>Although the PSUG and HGR-ASC components are more efficient per time step, the current prototype remains limited by the inherent sequential nature of recurrent models and therefore does not yet yield wall-clock speed improvements without further kernel-level optimization.
arXiv Detail & Related papers (2025-12-06T22:29:19Z) - A Comprehensive Study on Visual Token Redundancy for Discrete Diffusion-based Multimodal Large Language Models [85.30893355216486]
We study how visual token redundancy evolves with different dMLLM architectures and tasks.<n>Our study reveals that visual redundancy emerges only in from-scratch dMLLMs while handling long-answer tasks.<n>Layer-skipping is promising for accelerating AR-to-diffusion dMLLMs, whereas progressive or late-step pruning is more effective for from-scratch dMLLMs.
arXiv Detail & Related papers (2025-11-19T04:13:36Z) - URaG: Unified Retrieval and Generation in Multimodal LLMs for Efficient Long Document Understanding [55.45331924836242]
We present URaG, a framework that Unifies Retrieval and Generation within a single MLLM.<n>We show that URaG achieves state-of-the-art performance while reducing computational overhead by 44-56%.
arXiv Detail & Related papers (2025-11-13T17:54:09Z) - Intrinsic Memory Agents: Heterogeneous Multi-Agent LLM Systems through Structured Contextual Memory [3.8482387279540555]
Multi-agent systems built on Large Language Models (LLMs) show exceptional promise for complex collaborative problem-solving.<n>Yet they face fundamental challenges stemming from context window limitations that impair memory consistency, role adherence, and procedural integrity.<n>This paper introduces Intrinsic Memory Agents, a novel framework that addresses these limitations through structured agent-specific memories.
arXiv Detail & Related papers (2025-08-12T15:05:00Z) - AF-MAT: Aspect-aware Flip-and-Fuse xLSTM for Aspect-based Sentiment Analysis [0.6498237940960344]
We introduce Aspect-aware Flip-and-Fuse xLSTM (AF-MAT), a framework that leverages xLSTM's strengths.<n> AF-MAT features an Aspect-aware matrix LSTM mechanism that introduces a dedicated aspect gate, enabling the model to selectively emphasize tokens semantically relevant to the target aspect during memory updates.<n>Experiments on three benchmark datasets that AF-MAT outperforms state-of-the-art baselines, achieving higher accuracy in ABSA tasks.
arXiv Detail & Related papers (2025-07-01T22:21:33Z) - LTL Verification of Memoryful Neural Agents [16.353043979615496]
We present a framework for verifying Memoryful Neural Multi-Agent Systems (MN-MAS) against full Linear Temporal Logic (LTL) specifications.<n>Examples of MN-MAS include multi-agent systems based on feed-forward and recurrent neural networks or state-space models.
arXiv Detail & Related papers (2025-03-04T11:20:19Z) - SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning [63.93193829913252]
We propose an innovative METL strategy called SHERL for resource-limited scenarios.
In the early route, intermediate outputs are consolidated via an anti-redundancy operation.
In the late route, utilizing minimal late pre-trained layers could alleviate the peak demand on memory overhead.
arXiv Detail & Related papers (2024-07-10T10:22:35Z) - Efficient Heterogeneous Large Language Model Decoding with Model-Attention Disaggregation [15.35494431928751]
Transformer-based large language models (LLMs) exhibit impressive performance in generative tasks but also introduce significant challenges in real-world serving.<n>We introduce model-attention disaggregation to enhance the efficiency of LLM decoding.<n>We develop and deploy Lamina, an LLM inference system that incorporates model-attention disaggregation in a distributed heterogeneous cluster.
arXiv Detail & Related papers (2024-05-03T02:15:15Z) - Multi-Agent Reinforcement Learning for Microprocessor Design Space
Exploration [71.95914457415624]
Microprocessor architects are increasingly resorting to domain-specific customization in the quest for high-performance and energy-efficiency.
We propose an alternative formulation that leverages Multi-Agent RL (MARL) to tackle this problem.
Our evaluation shows that the MARL formulation consistently outperforms single-agent RL baselines.
arXiv Detail & Related papers (2022-11-29T17:10:24Z) - Working Memory Connections for LSTM [51.742526187978726]
We show that Working Memory Connections constantly improve the performance of LSTMs on a variety of tasks.
Numerical results suggest that the cell state contains useful information that is worth including in the gate structure.
arXiv Detail & Related papers (2021-08-31T18:01:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.