Recursively Summarizing Enables Long-Term Dialogue Memory in Large
Language Models
- URL: http://arxiv.org/abs/2308.15022v2
- Date: Mon, 19 Feb 2024 02:39:58 GMT
- Title: Recursively Summarizing Enables Long-Term Dialogue Memory in Large
Language Models
- Authors: Qingyue Wang, Liang Ding, Yanan Cao, Zhiliang Tian, Shi Wang, Dacheng
Tao, Li Guo
- Abstract summary: Given a long conversation, large language models (LLMs) fail to recall past information and tend to generate inconsistent responses.
We propose to recursively generate summaries/memory with LLMs to enhance their long-term memory ability.
- Score: 75.98775135321355
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, large language models (LLMs), such as GPT-4, have shown
remarkable conversational abilities, enabling them to engage in dynamic and
contextually relevant dialogues across a wide range of topics. However, given a
long conversation, these chatbots fail to recall past information and tend to
generate inconsistent responses. To address this, we propose to recursively
generate summaries/memory with the LLM itself to enhance its long-term memory
ability. Specifically, our method first prompts the LLM to memorize small
dialogue contexts and then recursively produces new memory from the previous
memory and the following contexts. Finally, the chatbot can easily generate a
highly consistent response with the help of the latest memory. We evaluate our
method on both open and closed LLMs, and experiments on a widely used public
dataset show that it generates more consistent responses in long-context
conversations. We also show that our strategy nicely complements both
long-context (e.g., 8K and 16K) and retrieval-enhanced LLMs, bringing further
gains in long-term dialogue performance. Notably, our method is a potential
solution for enabling LLMs to model extremely long contexts. The code and
scripts will be released later.
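As a rough illustration of the recursive summarization loop described in the abstract, the following is a minimal Python sketch assuming a generic text-in/text-out LLM call; the function names, prompts, and session chunking are illustrative placeholders, not the authors' released code (which the abstract says will be published later).

```python
from typing import Callable, List

# Any text-in/text-out LLM call fits here; the paper uses GPT-style chat
# models, but a stand-in is enough to show the control flow.
LLM = Callable[[str], str]


def update_memory(llm: LLM, memory: str, session: List[str]) -> str:
    """Recursively fold one small chunk of dialogue into the running summary."""
    prompt = (
        "Previous memory:\n" + (memory or "(empty)") + "\n\n"
        "New dialogue turns:\n" + "\n".join(session) + "\n\n"
        "Rewrite the memory so it stays short but keeps every fact needed "
        "to answer future questions consistently."
    )
    return llm(prompt)


def respond(llm: LLM, memory: str, session: List[str], user_msg: str) -> str:
    """Generate the next reply conditioned on the latest memory and recent turns."""
    prompt = (
        "Conversation memory:\n" + memory + "\n\n"
        "Recent turns:\n" + "\n".join(session) + "\n\n"
        "User: " + user_msg + "\nAssistant:"
    )
    return llm(prompt)


if __name__ == "__main__":
    # Dummy LLM so the sketch runs without an API key: it just echoes the tail
    # of the prompt. Swap in a real chat-completion call in practice.
    dummy_llm: LLM = lambda prompt: prompt[-200:]

    sessions = [
        ["User: I'm allergic to peanuts.", "Assistant: Noted, I'll remember that."],
        ["User: Plan a dinner for me.", "Assistant: How about a vegetable stir-fry?"],
    ]
    memory = ""
    for session in sessions:                  # the recursive summarization step
        memory = update_memory(dummy_llm, memory, session)
    print(respond(dummy_llm, memory, sessions[-1], "Remind me of my allergy?"))
```

The key design point the abstract describes is that each memory update sees only the previous summary plus a small new chunk, so the prompt stays short no matter how long the full dialogue grows.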
Related papers
- StreamingDialogue: Prolonged Dialogue Learning via Long Context Compression with Minimal Losses [67.92595110412094]
StreamingDialogue compresses long dialogue history into conv-attn sinks with minimal losses.
Our method outperforms strong baselines in dialogue tasks.
arXiv Detail & Related papers (2024-03-13T07:44:14Z)
- Evaluating Very Long-Term Conversational Memory of LLM Agents [95.84027826745609]
We introduce a machine-human pipeline to generate high-quality, very long-term dialogues.
We equip each agent with the capability of sharing and reacting to images.
The generated conversations are verified and edited by human annotators for long-range consistency.
arXiv Detail & Related papers (2024-02-27T18:42:31Z)
- MemoChat: Tuning LLMs to Use Memos for Consistent Long-Range Open-Domain Conversation [43.24092422054248]
We propose a pipeline for refining instructions that enables large language models to effectively employ self-composed memos.
We demonstrate a long-range open-domain conversation through iterative "memorization-retrieval-response" cycles.
The instructions are reconstructed from a collection of public datasets to teach the LLMs to memorize and retrieve past dialogues with structured memos.
arXiv Detail & Related papers (2023-08-16T09:15:18Z)
- Augmenting Language Models with Long-Term Memory [142.04940250657637]
Existing large language models (LLMs) can only afford fixed-size inputs due to the input length limit.
We propose a framework, Language Models Augmented with Long-Term Memory (LongMem), which enables LLMs to memorize long history.
arXiv Detail & Related papers (2023-06-12T15:13:39Z)
- RET-LLM: Towards a General Read-Write Memory for Large Language Models [53.288356721954514]
RET-LLM is a novel framework that equips large language models with a general write-read memory unit.
Inspired by Davidsonian semantics theory, we extract and save knowledge in the form of triplets (see the sketch after this list).
Our framework exhibits robust performance in handling temporal-based question answering tasks.
arXiv Detail & Related papers (2023-05-23T17:53:38Z)
- Enhancing Large Language Model with Self-Controlled Memory Framework [56.38025154501917]
Large Language Models (LLMs) are constrained by their inability to process lengthy inputs, resulting in the loss of critical historical information.
We propose the Self-Controlled Memory (SCM) framework to enhance the ability of LLMs to maintain long-term memory and recall relevant information.
arXiv Detail & Related papers (2023-04-26T07:25:31Z)
- Long Time No See! Open-Domain Conversation with Long-Term Persona Memory [37.51131984324123]
We present a novel task of Long-term Memory Conversation (LeMon).
We then build a new dialogue dataset DuLeMon and a dialogue generation framework with Long-Term Memory (LTM) mechanism.
Results on DuLeMon indicate that PLATO-LTM can significantly outperform baselines in terms of long-term dialogue consistency.
arXiv Detail & Related papers (2022-03-11T08:41:14Z)
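The RET-LLM entry above mentions saving dialogue knowledge as triplets in a write-read memory unit. The toy class below only illustrates that general idea; it is an assumption-laden sketch with a naive substring lookup standing in for retrieval, not RET-LLM's actual interface.

```python
from collections import defaultdict
from typing import DefaultDict, List, Tuple

Triplet = Tuple[str, str, str]  # (subject, relation, object)


class TripletMemory:
    """Toy write-read memory storing dialogue facts as triplets."""

    def __init__(self) -> None:
        self._facts: DefaultDict[str, List[Triplet]] = defaultdict(list)

    def write(self, subject: str, relation: str, obj: str) -> None:
        # Store each fact under a normalized subject key.
        self._facts[subject.lower()].append((subject, relation, obj))

    def read(self, query: str) -> List[Triplet]:
        # Naive lookup: return every triplet whose subject appears in the query.
        query_l = query.lower()
        return [t for subj, ts in self._facts.items() if subj in query_l for t in ts]


memory = TripletMemory()
memory.write("Alice", "works_at", "Acme Corp")
memory.write("Alice", "birthday", "March 3")
print(memory.read("When is Alice's birthday?"))
# -> [('Alice', 'works_at', 'Acme Corp'), ('Alice', 'birthday', 'March 3')]
```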