Related papers: MemAgent: Reshaping Long-Context LLM with Multi-Conv RL-based Memory Agent

URL: http://arxiv.org/abs/2507.02259v1
Date: Thu, 03 Jul 2025 03:11:50 GMT
Title: MemAgent: Reshaping Long-Context LLM with Multi-Conv RL-based Memory Agent
Authors: Hongli Yu, Tinghong Chen, Jiangtao Feng, Jiangjie Chen, Weinan Dai, Qiying Yu, Ya-Qin Zhang, Wei-Ying Ma, Jingjing Liu, Mingxuan Wang, Hao Zhou
Abstract summary: We introduce a novel agent workflow, MemAgent, which reads text in segments and updates the memory using an overwrite strategy. MemAgent has demonstrated superb long-context capabilities, being able to extrapolate from an 8K context trained on 32K text to a 3.5M QA task with performance loss < 5% and achieves 95%+ in 512K RULER test.
Score: 53.82053723030023
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Despite improvements by length extrapolation, efficient attention and memory modules, handling infinitely long documents with linear complexity without performance degradation during extrapolation remains the ultimate challenge in long-text processing. We directly optimize for long-text tasks in an end-to-end fashion and introduce a novel agent workflow, MemAgent, which reads text in segments and updates the memory using an overwrite strategy. We extend the DAPO algorithm to facilitate training via independent-context multi-conversation generation. MemAgent has demonstrated superb long-context capabilities, being able to extrapolate from an 8K context trained on 32K text to a 3.5M QA task with performance loss < 5% and achieves 95%+ in 512K RULER test.
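The segment-by-segment reading with an overwrite memory described in the abstract can be pictured with a minimal sketch. This is an illustration under stated assumptions, not the authors' implementation: the function names, the token-list representation, and the toy `keep_digits` update rule are all hypothetical, and in MemAgent the update step would be performed by the RL-trained LLM rather than a hand-written function.

```python
from typing import Callable, List

# Hypothetical type: takes (old memory, new segment) and returns the new memory.
UpdateFn = Callable[[List[str], List[str]], List[str]]


def split_into_segments(tokens: List[str], segment_len: int) -> List[List[str]]:
    """Cut the document into consecutive fixed-size segments."""
    if segment_len <= 0:
        raise ValueError("segment_len must be positive")
    return [tokens[i:i + segment_len] for i in range(0, len(tokens), segment_len)]


def read_with_overwrite_memory(
    tokens: List[str],
    segment_len: int,
    memory_budget: int,
    update_memory: UpdateFn,
) -> List[str]:
    """Read the document one segment at a time, keeping a bounded memory.

    Each step sees only the current memory and the current segment, so the
    per-step input stays bounded and total work grows linearly with length.
    """
    memory: List[str] = []
    for segment in split_into_segments(tokens, segment_len):
        # Overwrite: the returned memory replaces the old one entirely,
        # and is cut to the fixed budget (an assumption of this sketch).
        memory = update_memory(memory, segment)[:memory_budget]
    return memory


def keep_digits(memory: List[str], segment: List[str]) -> List[str]:
    """Toy stand-in for the model: remember every digit token seen so far."""
    return memory + [tok for tok in segment if tok.isdigit()]


if __name__ == "__main__":
    doc = "the code is 4 2 and the rest of this text is filler words only".split()
    print(read_with_overwrite_memory(doc, 4, 8, keep_digits))
```

A final answer would then be produced from the memory alone, which is why the working context can stay fixed (8K in the paper) regardless of document length.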

Related papers

MEM1: Learning to Synergize Memory and Reasoning for Efficient Long-Horizon Agents [84.62985963113245]
We introduce MEM1, an end-to-end reinforcement learning framework that enables agents to operate with constant memory across long multi-turn tasks. At each turn, MEM1 updates a compact shared internal state that jointly supports memory consolidation and reasoning. We show that MEM1-7B improves performance by 3.5x while reducing memory usage by 3.7x compared to Qwen2.5-14B-Instruct on a 16-objective multi-hop QA task.
arXiv Detail & Related papers (2025-06-18T19:44:46Z)
Long Context Scaling: Divide and Conquer via Multi-Agent Question-driven Collaboration [11.477571238310276]
We propose a novel multi-agent framework for processing long contexts. XpandA (Expand-Agent) is coupled with question-driven workflow and dynamic partitioning. XpandA achieves 20% improvements and 1.5x inference speedup over baselines of full-context, RAG and previous agent-based methods.
arXiv Detail & Related papers (2025-05-27T02:05:42Z)
Scaling Instruction-Tuned LLMs to Million-Token Contexts via Hierarchical Synthetic Data Generation [15.975325252309554]
We introduce a novel post-training synthetic data generation strategy designed to efficiently extend the context window of Large Language Models. Our approach scalably extends to arbitrarily long context lengths, unconstrained by the length of available real-world data. We demonstrate that our model, with a context length of up to 1M tokens, performs well on the RULER benchmark and InfiniteBench.
arXiv Detail & Related papers (2025-04-17T04:46:57Z)
From 128K to 4M: Efficient Training of Ultra-Long Context Large Language Models [54.44375226381814]
Long-context capabilities are essential for a wide range of applications, including document and video understanding, in-context learning, and inference-time scaling. We introduce an efficient training recipe for building ultra-long context LLMs from aligned instruct model, pushing the boundaries of context lengths from 128K to 1M, 2M, and 4M tokens. Our approach achieves state-of-the-art performance across a diverse set of long-context benchmarks.
arXiv Detail & Related papers (2025-04-08T16:58:58Z)
GraphReader: Building Graph-based Agent to Enhance Long-Context Abilities of Large Language Models [58.08177466768262]
Long-context capabilities are essential for large language models (LLMs) to tackle complex and long-input tasks. We introduce GraphReader, a graph-based agent system designed to handle long texts by structuring them into a graph and employing an agent to explore this graph autonomously. Experimental results on the LV-Eval dataset reveal that GraphReader, using a 4k context window, consistently outperforms GPT-4-128k across context lengths from 16k to 256k by a large margin.
arXiv Detail & Related papers (2024-06-20T17:57:51Z)
Recurrent Context Compression: Efficiently Expanding the Context Window of LLM [22.595457889113668]
This work introduces a method called Recurrent Context Compression (RCC), designed to efficiently expand the context window length of Transformer-based large language models (LLMs). We validated our approach on multiple tasks, achieving a compression rate of up to 32x on text reconstruction tasks with a BLEU4 score close to 0.95, and nearly 100% accuracy on a passkey retrieval task with a sequence length of 1M.
arXiv Detail & Related papers (2024-06-10T08:50:59Z)
XL$^2$Bench: A Benchmark for Extremely Long Context Understanding with Long-range Dependencies [45.31042312867939]
Large Language Models (LLMs) have demonstrated remarkable performance across diverse tasks but are constrained by their small context window sizes. Various efforts have been proposed to expand the context window to accommodate even up to 200K input tokens. We introduce a benchmark for extremely long context understanding with long-range dependencies, XL$^2$Bench.
arXiv Detail & Related papers (2024-04-08T12:29:07Z)
A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts [35.68159165639245]
We propose ReadAgent, an agent system that increases effective context length up to 20x in our experiments. Inspired by how humans interactively read long documents, we implement ReadAgent as a simple prompting system. We evaluate ReadAgent against baselines using retrieval methods, using the original long contexts, and using the gist memories.
arXiv Detail & Related papers (2024-02-15T05:40:21Z)
Training With "Paraphrasing the Original Text" Improves Long-Context Performance [19.48556587305737]
As Large Language Models (LLMs) continue to evolve, more are being designed to handle long-context inputs. We propose a novel approach to design training data for long-context tasks, aiming at augmenting LLMs' proficiency in extracting key information from long context. Experimenting on LongBench and NaturalQuestions Multi-document-QA dataset with models of Llama and Qwen series, our method achieves an improvement of up to 8.48% and 4.48% in average scores.
arXiv Detail & Related papers (2023-12-18T13:40:16Z)
Walking Down the Memory Maze: Beyond Context Limit through Interactive Reading [63.93888816206071]
We introduce MemWalker, a method that processes the long context into a tree of summary nodes. Upon receiving a query, the model navigates this tree in search of relevant information, and responds once it gathers sufficient information. We show that, beyond effective reading, MemWalker enhances explainability by highlighting the reasoning steps as it interactively reads the text and pinpointing the relevant text segments related to the query.
arXiv Detail & Related papers (2023-10-08T06:18:14Z)
Effective Long-Context Scaling of Foundation Models [90.57254298730923]
We present a series of long-context LLMs that support effective context windows of up to 32,768 tokens. Our models achieve consistent improvements on most regular tasks and significant improvements on long-context tasks over Llama 2.
arXiv Detail & Related papers (2023-09-27T21:41:49Z)

This list is automatically generated from the titles and abstracts of the papers in this site.