Enhancing Large Language Model with Self-Controlled Memory Framework
- URL: http://arxiv.org/abs/2304.13343v2
- Date: Thu, 15 Feb 2024 16:01:39 GMT
- Title: Enhancing Large Language Model with Self-Controlled Memory Framework
- Authors: Bing Wang, Xinnian Liang, Jian Yang, Hui Huang, Shuangzhi Wu, Peihao
Wu, Lu Lu, Zejun Ma, Zhoujun Li
- Abstract summary: Large Language Models (LLMs) are constrained by their inability to process lengthy inputs, resulting in the loss of critical historical information.
We propose the Self-Controlled Memory (SCM) framework to enhance the ability of LLMs to maintain long-term memory and recall relevant information.
- Score: 56.38025154501917
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Large Language Models (LLMs) are constrained by their inability to process
lengthy inputs, resulting in the loss of critical historical information. To
address this limitation, in this paper, we propose the Self-Controlled Memory
(SCM) framework to enhance the ability of LLMs to maintain long-term memory and
recall relevant information. Our SCM framework comprises three key components:
an LLM-based agent serving as the backbone of the framework, a memory stream
storing agent memories, and a memory controller that updates memories and
determines when and how to utilize memories from the memory stream. Additionally,
the proposed SCM can process ultra-long texts without any modification or
fine-tuning and can be integrated with any instruction-following LLM in a
plug-and-play fashion. Furthermore, we annotate a dataset to evaluate the
effectiveness of SCM in handling lengthy inputs. The annotated dataset covers
three tasks: long-term dialogues, book summarization, and meeting
summarization. Experimental results demonstrate that our method achieves better
retrieval recall and generates more informative responses compared to
competitive baselines in long-term dialogues.
(https://github.com/wbbeyourself/SCM4LLMs)
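As a rough, plug-and-play illustration of the three components named in the abstract, the sketch below pairs a memory stream with a controller that decides when retrieval is needed and how to inject memories into the prompt. The names (MemoryStream, MemoryController, llm_generate), the word-overlap retriever, and the keyword activation rule are all illustrative assumptions, not the paper's actual mechanisms (see the repository above for those).

```python
from dataclasses import dataclass

@dataclass
class MemoryItem:
    turn_id: int
    observation: str  # raw user input for this turn
    summary: str      # controller-maintained summary of the turn

class MemoryStream:
    """Append-only store of agent memories."""
    def __init__(self):
        self.items = []

    def add(self, item):
        self.items.append(item)

    def rank(self, query, top_k=3):
        # Stand-in relevance score: word overlap between query and memory.
        q = set(query.lower().split())
        scored = sorted(self.items,
                        key=lambda m: len(q & set(m.observation.lower().split())),
                        reverse=True)
        return scored[:top_k]

class MemoryController:
    """Decides when memories are needed and how to inject them."""
    def needs_memory(self, query):
        # Placeholder activation rule; the paper lets the model itself decide.
        cues = ("earlier", "before", "again", "remember", "last time")
        return any(c in query.lower() for c in cues)

    def build_prompt(self, query, memories):
        context = "\n".join(f"[turn {m.turn_id}] {m.summary}" for m in memories)
        return f"Relevant memories:\n{context}\n\nUser: {query}\nAssistant:"

def llm_generate(prompt):
    # Placeholder for any instruction-following LLM (plug-and-play).
    return f"(response conditioned on a {len(prompt)}-character prompt)"

def scm_turn(turn_id, query, stream, ctrl):
    memories = stream.rank(query) if ctrl.needs_memory(query) else []
    response = llm_generate(ctrl.build_prompt(query, memories))
    # The controller also writes the new turn back into the stream;
    # here the "summary" is a trivial truncation stand-in.
    stream.add(MemoryItem(turn_id, query, query[:80]))
    return response
```

Because the LLM call is isolated behind llm_generate, any instruction-following model can be dropped in without modification, which is the plug-and-play property the abstract claims.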
Related papers
- Assessing Episodic Memory in LLMs with Sequence Order Recall Tasks [42.22616978679253]
We introduce Sequence Order Recall Tasks (SORT), which we adapt from tasks used to study episodic memory in cognitive psychology.
SORT requires LLMs to recall the correct order of text segments, and provides a general framework that is easily extendable and requires no additional annotations.
Based on a human experiment with 155 participants, we show that humans can recall sequence order based on long-term memory of a book.
arXiv Detail & Related papers (2024-10-10T17:17:38Z)
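A minimal sketch of a SORT-style probe, assuming a two-segment "which came first?" item format; the A/B labels and accuracy scoring are illustrative assumptions, not the benchmark's exact protocol.

```python
import random

def make_sort_item(text, seg_len=50, rng=random):
    """Build one item: two segments from the text, presented in random order."""
    words = text.split()
    i, j = sorted(rng.sample(range(len(words) - seg_len), 2))
    first = " ".join(words[i:i + seg_len])
    second = " ".join(words[j:j + seg_len])
    if rng.random() < 0.5:
        return first, second, "A"   # shown order matches the true order
    return second, first, "B"       # swapped: the true-first segment is B

def sort_accuracy(model_answers, labels):
    """Fraction of items where the model named the true-first segment."""
    hits = sum(a.strip().upper().startswith(l) for a, l in zip(model_answers, labels))
    return hits / len(labels)
```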
- MemLLM: Finetuning LLMs to Use An Explicit Read-Write Memory [49.96019697955383]
We introduce MemLLM, a novel method of enhancing knowledge capabilities by integrating a structured and explicit read-and-write memory module.
Our experiments indicate that MemLLM enhances performance and interpretability, in language modeling in general and in knowledge-intensive tasks in particular.
We see MemLLM as an important step towards making LLMs more grounded and factual through memory augmentation.
arXiv Detail & Related papers (2024-04-17T18:13:16Z)
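As a rough sketch of the kind of explicit read-write interface described above, the toy memory below stores relation triples behind write/read calls; the API names and triple format are assumptions, not MemLLM's actual command syntax.

```python
class RelationMemory:
    """Explicit memory exposed to the model through write/read commands."""
    def __init__(self):
        self.facts = set()   # (subject, relation, object) triples

    def write(self, subj, rel, obj):
        self.facts.add((subj, rel, obj))

    def read(self, subj=None, rel=None):
        # Return every stored triple matching the given partial pattern.
        return sorted(f for f in self.facts
                      if (subj is None or f[0] == subj)
                      and (rel is None or f[1] == rel))

mem = RelationMemory()
mem.write("Marie Curie", "born_in", "Warsaw")
assert mem.read(subj="Marie Curie", rel="born_in") == [("Marie Curie", "born_in", "Warsaw")]
```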
- PerLTQA: A Personal Long-Term Memory Dataset for Memory Classification, Retrieval, and Synthesis in Question Answering [27.815507347725344]
This research introduces PerLTQA, an innovative QA dataset that combines semantic and episodic memories.
PerLTQA features two types of memory and a benchmark of 8,593 questions for 30 characters.
We propose a novel framework for memory integration and generation, consisting of three main components: Memory Classification, Memory Retrieval, and Memory Synthesis.
arXiv Detail & Related papers (2024-02-26T04:09:53Z)
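A toy pipeline mirroring the three named stages. The digit-based episodic/semantic rule and the overlap retriever are placeholder heuristics standing in for the framework's learned components.

```python
def classify_memory(entry):
    # Assumed cue: dated or event-like entries are episodic, stable facts semantic.
    return "episodic" if any(ch.isdigit() for ch in entry) else "semantic"

def retrieve(question, memories, top_k=2):
    # Placeholder relevance: word overlap between question and memory.
    q = set(question.lower().split())
    return sorted(memories,
                  key=lambda m: len(q & set(m.lower().split())),
                  reverse=True)[:top_k]

def synthesize(question, retrieved):
    # In the real framework an LLM fuses the retrieved memories into an answer.
    return f"Answer to {question!r}, grounded in: " + " | ".join(retrieved)

memories = ["Ann's sister lives in Lyon.", "Ann moved to Berlin in 2019."]
print([classify_memory(m) for m in memories])   # ['semantic', 'episodic']
print(synthesize("Where does Ann live?", retrieve("Where does Ann live?", memories)))
```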
- MemGPT: Towards LLMs as Operating Systems [50.02623936965231]
Large language models (LLMs) have revolutionized AI, but are constrained by limited context windows.
We propose virtual context management, a technique drawing inspiration from hierarchical memory systems in traditional operating systems.
We release MemGPT code and data for our experiments at https://memgpt.ai.
arXiv Detail & Related papers (2023-10-12T17:51:32Z)
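A minimal sketch of OS-style virtual context management under stated assumptions: a fixed main-context budget (counted in words rather than tokens), eviction of the oldest messages to archival storage, and overlap-based paging back in. The class and method names are hypothetical, not MemGPT's actual implementation.

```python
class VirtualContext:
    def __init__(self, budget_tokens=64):
        self.budget = budget_tokens
        self.main_context = []   # what the LLM actually sees
        self.archival = []       # unbounded external storage

    def _used(self):
        return sum(len(m.split()) for m in self.main_context)

    def append(self, message):
        self.main_context.append(message)
        # "Page out" the oldest messages once the window overflows.
        while self._used() > self.budget and len(self.main_context) > 1:
            self.archival.append(self.main_context.pop(0))

    def page_in(self, query, k=2):
        # Retrieve archived messages back on demand (overlap heuristic).
        q = set(query.lower().split())
        return sorted(self.archival,
                      key=lambda m: len(q & set(m.lower().split())),
                      reverse=True)[:k]
```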
- Recursively Summarizing Enables Long-Term Dialogue Memory in Large Language Models [75.98775135321355]
Given a long conversation, large language models (LLMs) fail to recall past information and tend to generate inconsistent responses.
We propose to recursively generate summaries (memory) using large language models (LLMs) to enhance their long-term memory ability.
arXiv Detail & Related papers (2023-08-29T04:59:53Z)
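A bare-bones version of the recursive loop, with the LLM summarizer stubbed out as truncation; summarize() and the 500-character cap are illustrative placeholders for an actual summarization prompt.

```python
def summarize(previous_memory, new_turns):
    # Placeholder for an LLM call such as:
    #   "Given the memory so far: {previous_memory}
    #    and the new dialogue turns: {new_turns}, write an updated memory."
    merged = (previous_memory + " " + " ".join(new_turns)).strip()
    return merged[-500:]   # crude stand-in for abstractive compression

def run_dialogue(turn_chunks):
    memory = ""
    for chunk in turn_chunks:
        memory = summarize(memory, chunk)   # fold each chunk into the memory
    return memory

final_memory = run_dialogue([["Hi, I'm Sam.", "Nice to meet you, Sam."],
                             ["I adopted a cat named Miso.", "Congrats!"]])
print(final_memory)
```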
- Augmenting Language Models with Long-Term Memory [142.04940250657637]
Existing large language models (LLMs) can only afford fixed-sized inputs due to the input length limit.
We propose a framework, Language Models Augmented with Long-Term Memory (LongMem), which enables LLMs to memorize long history.
arXiv Detail & Related papers (2023-06-12T15:13:39Z)
- RET-LLM: Towards a General Read-Write Memory for Large Language Models [53.288356721954514]
RET-LLM is a novel framework that equips large language models with a general write-read memory unit.
Inspired by Davidsonian semantics theory, we extract and save knowledge in the form of triplets.
Our framework exhibits robust performance in handling temporal-based question answering tasks.
arXiv Detail & Related papers (2023-05-23T17:53:38Z)
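As an illustration of triplet-based read-write memory with the temporal filtering the summary mentions, the sketch below timestamps each write; the schema and the stubbed-out extraction step are assumptions, not RET-LLM's actual design.

```python
from collections import defaultdict

class TripletMemory:
    def __init__(self):
        self.store = defaultdict(list)   # subject -> [(relation, object, time)]

    def write(self, subj, rel, obj, time=None):
        self.store[subj].append((rel, obj, time))

    def read(self, subj, rel, time=None):
        # Return objects matching the relation, optionally filtered by time.
        return [o for (r, o, t) in self.store[subj]
                if r == rel and (time is None or t == time)]

mem = TripletMemory()
mem.write("Alice", "works_for", "Acme", time="2021")
mem.write("Alice", "works_for", "Globex", time="2023")
assert mem.read("Alice", "works_for", time="2021") == ["Acme"]
```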
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.