Recursively Summarizing Enables Long-Term Dialogue Memory in Large Language Models
- URL: http://arxiv.org/abs/2308.15022v4
- Date: Mon, 25 Aug 2025 14:43:13 GMT
- Title: Recursively Summarizing Enables Long-Term Dialogue Memory in Large Language Models
- Authors: Qingyue Wang, Yanhe Fu, Yanan Cao, Shuai Wang, Zhiliang Tian, Liang Ding,
- Abstract summary: Given a long conversation, large language models (LLMs) fail to recall past information and tend to generate inconsistent responses. We propose to recursively generate summaries (memory) using LLMs to enhance long-term memory ability.
- Score: 30.48902594738911
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, large language models (LLMs), such as GPT-4, have shown remarkable conversational abilities, enabling them to engage in dynamic and contextually relevant dialogues across a wide range of topics. However, in a long conversation, these chatbots fail to recall past information and tend to generate inconsistent responses. To address this, we propose to recursively generate summaries (memory) using LLMs to enhance their long-term memory ability. Specifically, our method first prompts the LLM to memorize small dialogue contexts and then recursively produces new memory from the previous memory and the following contexts. Finally, the chatbot can easily generate a highly consistent response with the help of the latest memory. We evaluate our method on both open- and closed-source LLMs, and experiments on a widely used public dataset show that it generates more consistent responses in long-context conversations. We also show that our strategy nicely complements both long-context (e.g., 8K and 16K) and retrieval-enhanced LLMs, further improving long-term dialogue performance. Notably, our method is a potential solution for enabling LLMs to model extremely long contexts. The code and scripts are released.
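The recursive scheme described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's released code: `update_memory` is a placeholder for the LLM call that merges the previous memory with a new chunk of dialogue, and the function names and fixed chunk size are assumptions made for clarity.

```python
def recursive_memory(turns, update_memory, chunk_size=4):
    """Fold a dialogue into a running summary (memory).

    Each step merges the previous memory with the next small chunk of
    dialogue context, so the latest memory summarizes the whole history.
    """
    memory = ""
    for i in range(0, len(turns), chunk_size):
        chunk = " ".join(turns[i:i + chunk_size])
        # In the paper's setting this would be an LLM prompt such as:
        # "Given the previous summary and the new dialogue, write an
        #  updated summary." Here it is an injected callable.
        memory = update_memory(memory, chunk)
    return memory
```

At response time, the chatbot would condition on `memory` plus the most recent turns instead of the full transcript, which is how the method sidesteps the context-length limit.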
Related papers
- Evaluating Long-Term Memory for Long-Context Question Answering [100.1267054069757]
We present a systematic evaluation of memory-augmented methods using LoCoMo, a benchmark of synthetic long-context dialogues annotated for question-answering tasks. Our findings show that memory-augmented approaches reduce token usage by over 90% while maintaining competitive accuracy.
arXiv Detail & Related papers (2025-10-27T18:03:50Z) - SGMem: Sentence Graph Memory for Long-Term Conversational Agents [14.89396085814917]
We introduce SGMem (Sentence Graph Memory), which represents dialogue as sentence-level graphs within chunked units. We show that SGMem consistently improves accuracy and outperforms strong baselines in long-term conversational question answering.
arXiv Detail & Related papers (2025-09-25T14:21:44Z) - LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory [68.97819665784442]
We introduce LongMemEval, a benchmark designed to evaluate five core long-term memory abilities of chat assistants.
LongMemEval presents a significant challenge to existing long-term memory systems.
We present a unified framework that breaks down the long-term memory design into three stages: indexing, retrieval, and reading.
arXiv Detail & Related papers (2024-10-14T17:59:44Z) - StreamingDialogue: Prolonged Dialogue Learning via Long Context Compression with Minimal Losses [67.92595110412094]
StreamingDialogue compresses long dialogue history into conv-attn sinks with minimal losses.
Our method outperforms strong baselines in dialogue tasks.
arXiv Detail & Related papers (2024-03-13T07:44:14Z) - Evaluating Very Long-Term Conversational Memory of LLM Agents [95.84027826745609]
We introduce a machine-human pipeline to generate high-quality, very long-term dialogues.
We equip each agent with the capability of sharing and reacting to images.
The generated conversations are verified and edited by human annotators for long-range consistency.
arXiv Detail & Related papers (2024-02-27T18:42:31Z) - MemoChat: Tuning LLMs to Use Memos for Consistent Long-Range Open-Domain Conversation [43.24092422054248]
We propose a pipeline for refining instructions that enables large language models to effectively employ self-composed memos.
We demonstrate a long-range open-domain conversation through iterative "memorization-retrieval-response" cycles.
The instructions are reconstructed from a collection of public datasets to teach the LLMs to memorize and retrieve past dialogues with structured memos.
arXiv Detail & Related papers (2023-08-16T09:15:18Z) - Augmenting Language Models with Long-Term Memory [142.04940250657637]
Existing large language models (LLMs) can only afford fixed-size inputs due to the input length limit.
We propose a framework, Language Models Augmented with Long-Term Memory (LongMem), which enables LLMs to memorize long history.
arXiv Detail & Related papers (2023-06-12T15:13:39Z) - RET-LLM: Towards a General Read-Write Memory for Large Language Models [53.288356721954514]
RET-LLM is a novel framework that equips large language models with a general write-read memory unit.
Inspired by Davidsonian semantics theory, we extract and save knowledge in the form of triplets.
Our framework exhibits robust performance in handling temporal-based question answering tasks.
arXiv Detail & Related papers (2023-05-23T17:53:38Z) - Enhancing Large Language Model with Self-Controlled Memory Framework [56.38025154501917]
Large Language Models (LLMs) are constrained by their inability to process lengthy inputs, resulting in the loss of critical historical information.
We propose the Self-Controlled Memory (SCM) framework to enhance the ability of LLMs to maintain long-term memory and recall relevant information.
arXiv Detail & Related papers (2023-04-26T07:25:31Z) - Long Time No See! Open-Domain Conversation with Long-Term Persona Memory [37.51131984324123]
We present a novel task of Long-term Memory Conversation (LeMon).
We then build a new dialogue dataset DuLeMon and a dialogue generation framework with Long-Term Memory (LTM) mechanism.
Results on DuLeMon indicate that PLATO-LTM can significantly outperform baselines in terms of long-term dialogue consistency.
arXiv Detail & Related papers (2022-03-11T08:41:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.