PerLTQA: A Personal Long-Term Memory Dataset for Memory Classification,
Retrieval, and Synthesis in Question Answering
- URL: http://arxiv.org/abs/2402.16288v1
- Date: Mon, 26 Feb 2024 04:09:53 GMT
- Title: PerLTQA: A Personal Long-Term Memory Dataset for Memory Classification,
Retrieval, and Synthesis in Question Answering
- Authors: Yiming Du, Hongru Wang, Zhengyi Zhao, Bin Liang, Baojun Wang, Wanjun
Zhong, Zezhong Wang, Kam-Fai Wong
- Abstract summary: This research introduces PerLTQA, an innovative QA dataset that combines semantic and episodic memories.
PerLTQA features two types of memory and a benchmark of 8,593 questions for 30 characters.
We propose a novel framework for memory integration and generation, consisting of three main components: Memory Classification, Memory Retrieval, and Memory Synthesis.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Long-term memory plays a critical role in personal interaction, since it
allows dialogue systems to better leverage world knowledge, historical information,
and preferences. Our research introduces PerLTQA, an innovative QA
dataset that combines semantic and episodic memories, including world
knowledge, profiles, social relationships, events, and dialogues. This dataset
is collected to investigate the use of personalized memories, focusing on
social interactions and events in the QA task. PerLTQA features two types of
memory and a comprehensive benchmark of 8,593 questions for 30 characters,
facilitating the exploration and application of personalized memories in Large
Language Models (LLMs). Based on PerLTQA, we propose a novel framework for
memory integration and generation, consisting of three main components: Memory
Classification, Memory Retrieval, and Memory Synthesis. We evaluate this
framework using five LLMs and three retrievers. Experimental results
demonstrate that BERT-based classification models significantly outperform LLMs
such as ChatGLM3 and ChatGPT in the memory classification task. Furthermore,
our study highlights the importance of effective memory integration in the QA
task.
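The abstract describes a three-stage framework but gives no implementation details, so the following is a minimal, self-contained Python sketch of what a Memory Classification -> Memory Retrieval -> Memory Synthesis flow could look like. The Memory dataclass, the keyword heuristic, and the token-overlap retriever are illustrative assumptions, not the authors' released code; the paper itself reports BERT-based classifiers and dedicated dense retrievers for these steps.

```python
# Hypothetical sketch of a PerLTQA-style three-stage pipeline
# (Memory Classification -> Memory Retrieval -> Memory Synthesis).
# All names and the toy scoring logic are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Memory:
    kind: str   # "semantic" (profile, relationships, world knowledge) or "episodic" (events, dialogues)
    text: str

def classify_memory_need(question: str) -> str:
    """Stage 1: decide which memory type the question targets (stub heuristic;
    the paper reports BERT-based classifiers performing best here)."""
    episodic_cues = ("when", "last time", "yesterday", "event", "said")
    return "episodic" if any(cue in question.lower() for cue in episodic_cues) else "semantic"

def retrieve(question: str, memories: list[Memory], kind: str, top_k: int = 2) -> list[Memory]:
    """Stage 2: rank memories of the predicted type by token overlap
    (a real system would use a dense or BM25 retriever)."""
    q_tokens = set(question.lower().split())
    candidates = [m for m in memories if m.kind == kind]
    ranked = sorted(candidates,
                    key=lambda m: len(q_tokens & set(m.text.lower().split())),
                    reverse=True)
    return ranked[:top_k]

def synthesize_prompt(question: str, retrieved: list[Memory]) -> str:
    """Stage 3: fold the retrieved memories into a prompt for an answer-generating LLM."""
    context = "\n".join(f"- {m.text}" for m in retrieved)
    return f"Relevant personal memories:\n{context}\n\nQuestion: {question}\nAnswer:"

if __name__ == "__main__":
    memories = [
        Memory("semantic", "Alice works as a nurse and enjoys hiking."),
        Memory("episodic", "Last Saturday Alice hiked Mount Tai with her brother."),
    ]
    q = "When did Alice last go hiking?"
    print(synthesize_prompt(q, retrieve(q, memories, classify_memory_need(q))))
```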
Related papers
- LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory [68.97819665784442]
This paper introduces LongMemEval, a benchmark designed to evaluate five core long-term memory abilities of chat assistants.
LongMemEval presents a significant challenge to existing long-term memory systems.
We present a unified framework that breaks down long-term memory design into four design choices.
arXiv Detail & Related papers (2024-10-14T17:59:44Z)
- Assessing Episodic Memory in LLMs with Sequence Order Recall Tasks [42.22616978679253]
We introduce Sequence Order Recall Tasks (SORT), which we adapt from tasks used to study episodic memory in cognitive psychology.
SORT requires LLMs to recall the correct order of text segments, and provides a general framework that is both easily extendable and does not require any additional annotations.
Based on a human experiment with 155 participants, we show that humans can recall sequence order based on long-term memory of a book.
arXiv Detail & Related papers (2024-10-10T17:17:38Z)
- Empowering Working Memory for Large Language Model Agents [9.83467478231344]
This paper explores the potential of applying cognitive psychology's working memory frameworks to large language models (LLMs).
An innovative model is proposed incorporating a centralized Working Memory Hub and Episodic Buffer access to retain memories across episodes.
This architecture aims to provide greater continuity for nuanced contextual reasoning during intricate tasks and collaborative scenarios.
arXiv Detail & Related papers (2023-12-22T05:59:00Z)
- A Framework for Inference Inspired by Human Memory Mechanisms [9.408704431898279]
We propose a PMI framework that consists of perception, memory and inference components.
The memory module comprises working and long-term memory, with the latter endowed with a higher-order structure to retain extensive and complex relational knowledge and experience.
We apply our PMI to improve prevailing Transformers and CNN models on question-answering tasks like bAbI-20k and Sort-of-CLEVR datasets.
arXiv Detail & Related papers (2023-10-01T08:12:55Z)
- Recursively Summarizing Enables Long-Term Dialogue Memory in Large Language Models [75.98775135321355]
Given a long conversation, large language models (LLMs) fail to recall past information and tend to generate inconsistent responses.
We propose to generate summaries/memory using large language models (LLMs) to enhance their long-term memory ability.
arXiv Detail & Related papers (2023-08-29T04:59:53Z)
- UniMC: A Unified Framework for Long-Term Memory Conversation via Relevance Representation Learning [15.313416157905685]
We propose a Unified framework for Long-term Memory Conversations (UniMC).
We decompose the main task into three subtasks based on probability graphs.
Each subtask involves learning a representation for calculating the relevance between the query and memory.
arXiv Detail & Related papers (2023-06-18T12:30:50Z)
- RET-LLM: Towards a General Read-Write Memory for Large Language Models [53.288356721954514]
RET-LLM is a novel framework that equips large language models with a general write-read memory unit.
Inspired by Davidsonian semantics theory, we extract and save knowledge in the form of triplets (a toy sketch of such a triplet store appears after this list).
Our framework exhibits robust performance in handling temporal-based question answering tasks.
arXiv Detail & Related papers (2023-05-23T17:53:38Z)
- Enhancing Large Language Model with Self-Controlled Memory Framework [56.38025154501917]
Large Language Models (LLMs) are constrained by their inability to process lengthy inputs, resulting in the loss of critical historical information.
We propose the Self-Controlled Memory (SCM) framework to enhance the ability of LLMs to maintain long-term memory and recall relevant information.
arXiv Detail & Related papers (2023-04-26T07:25:31Z)
- MeMOT: Multi-Object Tracking with Memory [97.48960039220823]
Our model, called MeMOT, consists of three main modules that are all Transformer-based.
MeMOT observes very competitive performance on widely adopted MOT datasets.
arXiv Detail & Related papers (2022-03-31T02:33:20Z)
- Self-Attentive Associative Memory [69.40038844695917]
We propose to separate the storage of individual experiences (item memory) from the relationships among them (relational memory).
Our proposed two-memory model achieves competitive results across a diverse set of machine learning tasks.
arXiv Detail & Related papers (2020-02-10T03:27:48Z)
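The RET-LLM entry above mentions storing extracted knowledge as triplets behind a general write-read memory unit. Below is a minimal, hedged Python sketch of such a triplet store; the TripletMemory class, its write/read methods, and the substring matching are illustrative assumptions rather than the paper's actual interface.

```python
# Hypothetical sketch of a triplet-style read-write memory in the spirit of the
# RET-LLM summary above: facts are written as (subject, relation, object) triplets
# and read back by matching any slot against the query. Names are assumptions.
class TripletMemory:
    def __init__(self) -> None:
        self.triplets: list[tuple[str, str, str]] = []

    def write(self, subject: str, relation: str, obj: str) -> None:
        """Store one extracted fact as a (subject, relation, object) triplet."""
        self.triplets.append((subject, relation, obj))

    def read(self, query: str) -> list[tuple[str, str, str]]:
        """Return every triplet whose subject, relation, or object appears in the query."""
        q = query.lower()
        return [t for t in self.triplets if any(slot.lower() in q for slot in t)]

if __name__ == "__main__":
    memory = TripletMemory()
    memory.write("Alice", "works_as", "nurse")
    memory.write("Alice", "hiked", "Mount Tai")
    # Both Alice triplets match on the subject slot; a real system would rank or filter further.
    print(memory.read("Where does Alice work?"))
```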