MemoChat: Tuning LLMs to Use Memos for Consistent Long-Range Open-Domain
Conversation
- URL: http://arxiv.org/abs/2308.08239v2
- Date: Wed, 23 Aug 2023 03:37:04 GMT
- Title: MemoChat: Tuning LLMs to Use Memos for Consistent Long-Range Open-Domain
Conversation
- Authors: Junru Lu, Siyu An, Mingbao Lin, Gabriele Pergola, Yulan He, Di Yin,
Xing Sun, Yunsheng Wu
- Abstract summary: We propose a pipeline for refining instructions that enables large language models to effectively employ self-composed memos.
We demonstrate a long-range open-domain conversation through iterative "memorization-retrieval-response" cycles.
The instructions are reconstructed from a collection of public datasets to teach the LLMs to memorize and retrieve past dialogues with structured memos.
- Score: 43.24092422054248
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose MemoChat, a pipeline for refining instructions that enables large
language models (LLMs) to effectively employ self-composed memos for
maintaining consistent long-range open-domain conversations. We demonstrate a
long-range open-domain conversation through iterative
"memorization-retrieval-response" cycles. This requires us to carefully design
tailored tuning instructions for each distinct stage. The instructions are
reconstructed from a collection of public datasets to teach the LLMs to
memorize and retrieve past dialogues with structured memos, leading to enhanced
consistency when participating in future conversations. We invite experts to
manually annotate a test set designed to evaluate the consistency of long-range
conversation questions. Experiments on three testing scenarios involving both
open-source and API-accessible chatbots at scale verify the efficacy of
MemoChat, which outperforms strong baselines. Our codes, data and models are
available here: https://github.com/LuJunru/MemoChat.
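The abstract describes the pipeline only in prose; the toy code below shows one way the iterative "memorization-retrieval-response" cycle could be wired together. It is a minimal sketch under stated assumptions, not the released implementation: the `call_llm` stub, the topic/summary memo fields, the keyword-overlap retrieval, and the ten-turn memorization threshold are all placeholders introduced here for illustration, whereas MemoChat itself tunes the LLM to write and select structured memos (the linked repository is authoritative).

```python
from dataclasses import dataclass, field


def call_llm(prompt: str) -> str:
    """Placeholder for any chat LLM; replace with a real API or local model."""
    raise NotImplementedError


@dataclass
class MemoEntry:
    topic: str       # short label for a past dialogue segment
    summary: str     # compressed description of what was discussed
    turns: list[str] = field(default_factory=list)  # raw utterances covered


class MemoChatLoop:
    """Illustrative memorization-retrieval-response cycle (not the paper's code)."""

    def __init__(self) -> None:
        self.memos: list[MemoEntry] = []
        self.recent_turns: list[str] = []

    def memorize(self) -> None:
        # Ask the model to compress recent turns into a structured memo.
        prompt = (
            "Summarize the following dialogue into a topic line and a "
            "one-sentence summary:\n" + "\n".join(self.recent_turns)
        )
        raw = call_llm(prompt)
        topic, _, summary = raw.partition("\n")
        self.memos.append(MemoEntry(topic.strip(), summary.strip(), list(self.recent_turns)))
        self.recent_turns.clear()

    def retrieve(self, user_msg: str) -> list[MemoEntry]:
        # Naive keyword-overlap retrieval; MemoChat instead tunes the LLM to pick memos.
        words = set(user_msg.lower().split())
        return [m for m in self.memos if words & set(m.topic.lower().split())]

    def respond(self, user_msg: str) -> str:
        evidence = self.retrieve(user_msg)
        context = "\n".join(f"[{m.topic}] {m.summary}" for m in evidence)
        reply = call_llm(f"Relevant memos:\n{context}\nUser: {user_msg}\nAssistant:")
        self.recent_turns += [f"User: {user_msg}", f"Assistant: {reply}"]
        if len(self.recent_turns) >= 10:  # arbitrary threshold for this sketch
            self.memorize()
        return reply
```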
Related papers
- Hidden in Plain Sight: Exploring Chat History Tampering in Interactive Language Models [12.920884182101142]
Large Language Models (LLMs) have become prevalent in real-world applications, exhibiting impressive text generation performance.
To behave interactively, LLM-based chat systems must integrate prior chat history as context into their inputs, following a pre-defined structure.
This paper introduces a systematic methodology to inject user-supplied history into LLM conversations without any prior knowledge of the target model.
arXiv Detail & Related papers (2024-05-30T16:36:47Z)
- Evaluating Very Long-Term Conversational Memory of LLM Agents [95.84027826745609]
We introduce a machine-human pipeline to generate high-quality, very long-term dialogues.
We equip each agent with the capability of sharing and reacting to images.
The generated conversations are verified and edited by human annotators for long-range consistency.
arXiv Detail & Related papers (2024-02-27T18:42:31Z)
- Effective and Efficient Conversation Retrieval for Dialogue State Tracking with Implicit Text Summaries [48.243879779374836]
Few-shot dialogue state tracking (DST) with Large Language Models (LLMs) relies on an effective and efficient conversation retriever to find similar in-context examples for prompt learning.
Previous works use raw dialogue context as search keys and queries, and a retriever is fine-tuned with annotated dialogues to achieve superior performance.
We handle the task of conversation retrieval based on text summaries of the conversations.
An LLM-based conversation summarizer is adopted for query and key generation, which enables effective maximum inner product search.
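As a rough illustration of this summarize-then-retrieve idea, the sketch below embeds LLM-written summaries as keys and scores them against a query summary by inner product. The `call_llm_summarize` and `embed` stubs and the 384-dimensional embedding size are assumptions made for illustration, not the paper's actual components.

```python
import numpy as np


def call_llm_summarize(conversation: str) -> str:
    """Placeholder summarizer; the paper uses an LLM to write the summary."""
    raise NotImplementedError


def embed(text: str, dim: int = 384) -> np.ndarray:
    """Placeholder text encoder; swap in any sentence-embedding model."""
    raise NotImplementedError


def build_index(example_dialogues: list[str]) -> np.ndarray:
    # Keys are embeddings of LLM-written summaries, not of the raw dialogues.
    summaries = [call_llm_summarize(d) for d in example_dialogues]
    return np.stack([embed(s) for s in summaries])


def retrieve(query_dialogue: str, index: np.ndarray, k: int = 5) -> list[int]:
    # Maximum inner product search: score every key against the query summary.
    q = embed(call_llm_summarize(query_dialogue))
    scores = index @ q
    return np.argsort(-scores)[:k].tolist()
```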
arXiv Detail & Related papers (2024-02-20T14:31:17Z)
- Recursively Summarizing Enables Long-Term Dialogue Memory in Large Language Models [75.98775135321355]
Given a long conversation, large language models (LLMs) fail to recall past information and tend to generate inconsistent responses.
We propose to recursively generate summaries/memory using large language models (LLMs) to enhance their long-term memory ability.
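A minimal sketch of how a recursively updated summary can serve as long-term memory is given below; the prompts and the `call_llm` stub are assumptions for illustration, not the paper's exact formulation.

```python
def call_llm(prompt: str) -> str:
    """Placeholder for any chat LLM."""
    raise NotImplementedError


def update_memory(memory: str, new_turns: list[str]) -> str:
    # Fold the latest turns into the running summary instead of re-reading the
    # whole history; the summary itself becomes the long-term memory.
    prompt = (
        "Previous summary:\n" + memory + "\n\n"
        "New dialogue turns:\n" + "\n".join(new_turns) + "\n\n"
        "Write an updated summary that keeps all important facts."
    )
    return call_llm(prompt)


def respond(memory: str, user_msg: str) -> str:
    return call_llm(
        f"Conversation summary so far:\n{memory}\nUser: {user_msg}\nAssistant:"
    )
```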
arXiv Detail & Related papers (2023-08-29T04:59:53Z)
- Cue-CoT: Chain-of-thought Prompting for Responding to In-depth Dialogue Questions with LLMs [59.74002011562726]
We propose a novel linguistic cue-based chain-of-thoughts (Cue-CoT) to provide a more personalized and engaging response.
We build a benchmark with in-depth dialogue questions, consisting of 6 datasets in both Chinese and English.
Empirical results demonstrate our proposed Cue-CoT method outperforms standard prompting methods in terms of both helpfulness and acceptability on all datasets.
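The snippet below sketches the two-stage idea suggested by this abstract: first prompt the model to reason about linguistic cues in the user's wording, then condition the reply on that intermediate output. The exact prompts and the `call_llm` stub are assumptions; the paper's prompt templates may differ.

```python
def call_llm(prompt: str) -> str:
    """Placeholder for any chat LLM."""
    raise NotImplementedError


def cue_cot_respond(dialogue: str, user_msg: str) -> str:
    # Step 1: reason about linguistic cues (user status, emotion, personality).
    cues = call_llm(
        "Dialogue so far:\n" + dialogue +
        "\nDescribe the user's current status, emotion, and personality "
        "implied by their wording."
    )
    # Step 2: condition the reply on that intermediate reasoning.
    return call_llm(
        "Dialogue so far:\n" + dialogue +
        "\nInferred user cues:\n" + cues +
        f"\nUser: {user_msg}\nWrite a personalized, engaging reply."
    )
```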
arXiv Detail & Related papers (2023-05-19T16:27:43Z)
- Prompted LLMs as Chatbot Modules for Long Open-domain Conversation [7.511596831927614]
We propose MPC, a new approach for creating high-quality conversational agents without the need for fine-tuning.
Our method utilizes pre-trained large language models (LLMs) as individual modules for long-term consistency and flexibility.
arXiv Detail & Related papers (2023-05-08T08:09:00Z)
- Disentangling Online Chats with DAG-Structured LSTMs [55.33014148383343]
DAG-LSTMs are a generalization of Tree-LSTMs that can handle directed acyclic dependencies.
We show that the proposed model achieves state-of-the-art performance on the task of recovering reply-to relations.
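For intuition only, the sketch below adapts a Child-Sum Tree-LSTM cell so that a node aggregates several DAG predecessors, each with its own forget gate. This is an assumption-laden approximation of the idea named in the abstract; the paper's exact parameterization may differ, and the weights and dimensions here are placeholders.

```python
import numpy as np


def sigmoid(x: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-x))


class DagLSTMCell:
    """Child-Sum-style cell that aggregates multiple DAG predecessors."""

    def __init__(self, in_dim: int, hid_dim: int, rng=None) -> None:
        rng = rng or np.random.default_rng(0)

        def mat(r, c):
            return rng.normal(0.0, 0.1, (r, c))

        self.W = {g: mat(hid_dim, in_dim) for g in "ifou"}   # input weights
        self.U = {g: mat(hid_dim, hid_dim) for g in "ifou"}  # recurrent weights
        self.b = {g: np.zeros(hid_dim) for g in "ifou"}

    def step(self, x, pred_h, pred_c):
        # pred_h, pred_c: lists of hidden/cell states of the node's DAG predecessors.
        h_sum = np.sum(pred_h, axis=0) if pred_h else np.zeros_like(self.b["i"])
        i = sigmoid(self.W["i"] @ x + self.U["i"] @ h_sum + self.b["i"])
        o = sigmoid(self.W["o"] @ x + self.U["o"] @ h_sum + self.b["o"])
        u = np.tanh(self.W["u"] @ x + self.U["u"] @ h_sum + self.b["u"])
        c = i * u
        for h_k, c_k in zip(pred_h, pred_c):
            # One forget gate per predecessor, as in Child-Sum Tree-LSTMs.
            f_k = sigmoid(self.W["f"] @ x + self.U["f"] @ h_k + self.b["f"])
            c = c + f_k * c_k
        h = o * np.tanh(c)
        return h, c
```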
arXiv Detail & Related papers (2021-06-16T18:00:00Z)