MemoChat: Tuning LLMs to Use Memos for Consistent Long-Range Open-Domain
Conversation
- URL: http://arxiv.org/abs/2308.08239v2
- Date: Wed, 23 Aug 2023 03:37:04 GMT
- Title: MemoChat: Tuning LLMs to Use Memos for Consistent Long-Range Open-Domain
Conversation
- Authors: Junru Lu, Siyu An, Mingbao Lin, Gabriele Pergola, Yulan He, Di Yin,
Xing Sun, Yunsheng Wu
- Abstract summary: We propose a pipeline for refining instructions that enables large language models to effectively employ self-composed memos.
We demonstrate a long-range open-domain conversation through iterative "memorization-retrieval-response" cycles.
The instructions are reconstructed from a collection of public datasets to teach the LLMs to memorize and retrieve past dialogues with structured memos.
- Score: 43.24092422054248
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose MemoChat, a pipeline for refining instructions that enables large
language models (LLMs) to effectively employ self-composed memos for
maintaining consistent long-range open-domain conversations. We demonstrate a
long-range open-domain conversation through iterative
"memorization-retrieval-response" cycles. This requires us to carefully design
tailored tuning instructions for each distinct stage. The instructions are
reconstructed from a collection of public datasets to teach the LLMs to
memorize and retrieve past dialogues with structured memos, leading to enhanced
consistency when participating in future conversations. We invite experts to
manually annotate a test set designed to evaluate the consistency of long-range
conversation questions. Experiments on three testing scenarios involving both
open-source and API-accessible chatbots at scale verify the efficacy of
MemoChat, which outperforms strong baselines. Our codes, data and models are
available here: https://github.com/LuJunru/MemoChat.
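The abstract describes the pipeline only in prose; the toy code below shows one way the iterative "memorization-retrieval-response" cycle could be wired together. It is a minimal sketch under stated assumptions, not the released implementation: the `call_llm` stub, the topic/summary memo fields, the keyword-overlap retrieval, and the ten-turn memorization threshold are all placeholders introduced here for illustration, whereas MemoChat itself tunes the LLM to write and select structured memos (the linked repository is authoritative).

```python
from dataclasses import dataclass, field


def call_llm(prompt: str) -> str:
    """Placeholder for any chat LLM; replace with a real API or local model."""
    raise NotImplementedError


@dataclass
class MemoEntry:
    topic: str       # short label for a past dialogue segment
    summary: str     # compressed description of what was discussed
    turns: list[str] = field(default_factory=list)  # raw utterances covered


class MemoChatLoop:
    """Illustrative memorization-retrieval-response cycle (not the paper's code)."""

    def __init__(self) -> None:
        self.memos: list[MemoEntry] = []
        self.recent_turns: list[str] = []

    def memorize(self) -> None:
        # Ask the model to compress recent turns into a structured memo.
        prompt = (
            "Summarize the following dialogue into a topic line and a "
            "one-sentence summary:\n" + "\n".join(self.recent_turns)
        )
        raw = call_llm(prompt)
        topic, _, summary = raw.partition("\n")
        self.memos.append(MemoEntry(topic.strip(), summary.strip(), list(self.recent_turns)))
        self.recent_turns.clear()

    def retrieve(self, user_msg: str) -> list[MemoEntry]:
        # Naive keyword-overlap retrieval; MemoChat instead tunes the LLM to pick memos.
        words = set(user_msg.lower().split())
        return [m for m in self.memos if words & set(m.topic.lower().split())]

    def respond(self, user_msg: str) -> str:
        evidence = self.retrieve(user_msg)
        context = "\n".join(f"[{m.topic}] {m.summary}" for m in evidence)
        reply = call_llm(f"Relevant memos:\n{context}\nUser: {user_msg}\nAssistant:")
        self.recent_turns += [f"User: {user_msg}", f"Assistant: {reply}"]
        if len(self.recent_turns) >= 10:  # arbitrary threshold for this sketch
            self.memorize()
        return reply
```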
Related papers
- Hidden in Plain Sight: Exploring Chat History Tampering in Interactive Language Models [12.920884182101142]
Large Language Models (LLMs) have become prevalent in real-world applications, exhibiting impressive text generation performance.
To behave interactively, LLM-based chat systems must integrate prior chat history as context into their inputs, following a pre-defined structure.
This paper introduces a systematic methodology to inject user-supplied history into LLM conversations without any prior knowledge of the target model.
arXiv Detail & Related papers (2024-05-30T16:36:47Z)
- Evaluating Very Long-Term Conversational Memory of LLM Agents [95.84027826745609]
We introduce a machine-human pipeline to generate high-quality, very long-term dialogues.
We equip each agent with the capability of sharing and reacting to images.
The generated conversations are verified and edited by human annotators for long-range consistency.
arXiv Detail & Related papers (2024-02-27T18:42:31Z)
- Effective and Efficient Conversation Retrieval for Dialogue State Tracking with Implicit Text Summaries [48.243879779374836]
Few-shot dialogue state tracking (DST) with Large Language Models (LLMs) relies on an effective and efficient conversation retriever to find similar in-context examples for prompt learning.
Previous works use raw dialogue context as search keys and queries, and a retriever is fine-tuned with annotated dialogues to achieve superior performance.
We handle the task of conversation retrieval based on text summaries of the conversations.
An LLM-based conversation summarizer is adopted for query and key generation, which enables effective maximum inner product search.
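As a rough illustration of this summarize-then-retrieve idea, the sketch below embeds LLM-written summaries as keys and scores them against a query summary by inner product. The `call_llm_summarize` and `embed` stubs and the 384-dimensional embedding size are assumptions made for illustration, not the paper's actual components.

```python
import numpy as np


def call_llm_summarize(conversation: str) -> str:
    """Placeholder summarizer; the paper uses an LLM to write the summary."""
    raise NotImplementedError


def embed(text: str, dim: int = 384) -> np.ndarray:
    """Placeholder text encoder; swap in any sentence-embedding model."""
    raise NotImplementedError


def build_index(example_dialogues: list[str]) -> np.ndarray:
    # Keys are embeddings of LLM-written summaries, not of the raw dialogues.
    summaries = [call_llm_summarize(d) for d in example_dialogues]
    return np.stack([embed(s) for s in summaries])


def retrieve(query_dialogue: str, index: np.ndarray, k: int = 5) -> list[int]:
    # Maximum inner product search: score every key against the query summary.
    q = embed(call_llm_summarize(query_dialogue))
    scores = index @ q
    return np.argsort(-scores)[:k].tolist()
```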
arXiv Detail & Related papers (2024-02-20T14:31:17Z)
- Recursively Summarizing Enables Long-Term Dialogue Memory in Large Language Models [75.98775135321355]
Given a long conversation, large language models (LLMs) fail to recall past information and tend to generate inconsistent responses.
We propose to recursively generate summaries/memory using large language models (LLMs) to enhance their long-term memory ability.
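A minimal sketch of how a recursively updated summary can serve as long-term memory is given below; the prompts and the `call_llm` stub are assumptions for illustration, not the paper's exact formulation.

```python
def call_llm(prompt: str) -> str:
    """Placeholder for any chat LLM."""
    raise NotImplementedError


def update_memory(memory: str, new_turns: list[str]) -> str:
    # Fold the latest turns into the running summary instead of re-reading the
    # whole history; the summary itself becomes the long-term memory.
    prompt = (
        "Previous summary:\n" + memory + "\n\n"
        "New dialogue turns:\n" + "\n".join(new_turns) + "\n\n"
        "Write an updated summary that keeps all important facts."
    )
    return call_llm(prompt)


def respond(memory: str, user_msg: str) -> str:
    return call_llm(
        f"Conversation summary so far:\n{memory}\nUser: {user_msg}\nAssistant:"
    )
```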
arXiv Detail & Related papers (2023-08-29T04:59:53Z)
- Cue-CoT: Chain-of-thought Prompting for Responding to In-depth Dialogue Questions with LLMs [59.74002011562726]
We propose a novel linguistic cue-based chain-of-thoughts (Cue-CoT) to provide a more personalized and engaging response.
We build a benchmark with in-depth dialogue questions, consisting of 6 datasets in both Chinese and English.
Empirical results demonstrate our proposed Cue-CoT method outperforms standard prompting methods in terms of both helpfulness and acceptability on all datasets.
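The snippet below sketches the two-stage idea suggested by this abstract: first prompt the model to reason about linguistic cues in the user's wording, then condition the reply on that intermediate output. The exact prompts and the `call_llm` stub are assumptions; the paper's prompt templates may differ.

```python
def call_llm(prompt: str) -> str:
    """Placeholder for any chat LLM."""
    raise NotImplementedError


def cue_cot_respond(dialogue: str, user_msg: str) -> str:
    # Step 1: reason about linguistic cues (user status, emotion, personality).
    cues = call_llm(
        "Dialogue so far:\n" + dialogue +
        "\nDescribe the user's current status, emotion, and personality "
        "implied by their wording."
    )
    # Step 2: condition the reply on that intermediate reasoning.
    return call_llm(
        "Dialogue so far:\n" + dialogue +
        "\nInferred user cues:\n" + cues +
        f"\nUser: {user_msg}\nWrite a personalized, engaging reply."
    )
```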
arXiv Detail & Related papers (2023-05-19T16:27:43Z)
- Prompted LLMs as Chatbot Modules for Long Open-domain Conversation [7.511596831927614]
We propose MPC, a new approach for creating high-quality conversational agents without the need for fine-tuning.
Our method utilizes pre-trained large language models (LLMs) as individual modules for long-term consistency and flexibility.
arXiv Detail & Related papers (2023-05-08T08:09:00Z)
- Disentangling Online Chats with DAG-Structured LSTMs [55.33014148383343]
DAG-LSTMs are a generalization of Tree-LSTMs that can handle directed acyclic dependencies.
We show that the proposed model achieves state-of-the-art performance on the task of recovering reply-to relations.
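For intuition only, the sketch below adapts a Child-Sum Tree-LSTM cell so that a node aggregates several DAG predecessors, each with its own forget gate. This is an assumption-laden approximation of the idea named in the abstract; the paper's exact parameterization may differ, and the weights and dimensions here are placeholders.

```python
import numpy as np


def sigmoid(x: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-x))


class DagLSTMCell:
    """Child-Sum-style cell that aggregates multiple DAG predecessors."""

    def __init__(self, in_dim: int, hid_dim: int, rng=None) -> None:
        rng = rng or np.random.default_rng(0)

        def mat(r, c):
            return rng.normal(0.0, 0.1, (r, c))

        self.W = {g: mat(hid_dim, in_dim) for g in "ifou"}   # input weights
        self.U = {g: mat(hid_dim, hid_dim) for g in "ifou"}  # recurrent weights
        self.b = {g: np.zeros(hid_dim) for g in "ifou"}

    def step(self, x, pred_h, pred_c):
        # pred_h, pred_c: lists of hidden/cell states of the node's DAG predecessors.
        h_sum = np.sum(pred_h, axis=0) if pred_h else np.zeros_like(self.b["i"])
        i = sigmoid(self.W["i"] @ x + self.U["i"] @ h_sum + self.b["i"])
        o = sigmoid(self.W["o"] @ x + self.U["o"] @ h_sum + self.b["o"])
        u = np.tanh(self.W["u"] @ x + self.U["u"] @ h_sum + self.b["u"])
        c = i * u
        for h_k, c_k in zip(pred_h, pred_c):
            # One forget gate per predecessor, as in Child-Sum Tree-LSTMs.
            f_k = sigmoid(self.W["f"] @ x + self.U["f"] @ h_k + self.b["f"])
            c = c + f_k * c_k
        h = o * np.tanh(c)
        return h, c
```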
arXiv Detail & Related papers (2021-06-16T18:00:00Z)