MoT: Memory-of-Thought Enables ChatGPT to Self-Improve
- URL: http://arxiv.org/abs/2305.05181v2
- Date: Mon, 9 Oct 2023 02:44:12 GMT
- Title: MoT: Memory-of-Thought Enables ChatGPT to Self-Improve
- Authors: Xiaonan Li, Xipeng Qiu
- Abstract summary: We propose a framework, Memory-of-Thought, that lets Large Language Models self-improve without annotated datasets or parameter updates.
Experimental results show that MoT can help ChatGPT significantly improve its abilities in arithmetic reasoning, commonsense reasoning, factual reasoning, and natural language inference.
- Score: 73.90376920653507
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models (LLMs) have shown impressive abilities in various
tasks. However, fundamentally improving them depends on high-quality datasets
or computationally expensive fine-tuning. In contrast, humans can easily
improve themselves by self-thinking and memory, without external resources. In
this paper, we propose a framework, MoT, to let the LLM self-improve through
Memory-of-Thought, without annotated datasets and parameter updates.
Specifically, MoT is divided into two stages: (1) before the test stage, the LLM
pre-thinks on the unlabeled dataset and saves the high-confidence thoughts as
external memory; (2) during the test stage, given a test question, the LLM
recalls relevant memory to help itself reason and answer it. Experimental
results show that MoT can help ChatGPT significantly improve its abilities in
arithmetic reasoning, commonsense reasoning, factual reasoning, and natural
language inference. Further analyses show that each component contributes
critically to the improvements and MoT can lead to consistent improvements
across various CoT methods and LLMs.
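The two stages described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `sample_fn` is a hypothetical stand-in for an LLM call that returns a (thought, answer) pair, confidence is approximated here by majority-vote agreement across samples, and recall uses simple word overlap. The abstract does not specify either the confidence measure or the retrieval method, so both are assumptions.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

# Hypothetical LLM interface: maps a question to one sampled (thought, answer).
SampleFn = Callable[[str], Tuple[str, str]]


def build_memory(questions: List[str], sample_fn: SampleFn,
                 n_samples: int = 5, min_agreement: float = 0.6) -> List[Dict[str, str]]:
    """Stage 1 (pre-thinking): keep one thought per question whose answers agree often enough."""
    memory = []
    for question in questions:
        samples = [sample_fn(question) for _ in range(n_samples)]
        # Assumption: agreement among sampled answers serves as the confidence signal.
        answer, votes = Counter(a for _, a in samples).most_common(1)[0]
        if votes / n_samples >= min_agreement:
            thought = next(t for t, a in samples if a == answer)
            memory.append({"question": question, "thought": thought, "answer": answer})
    return memory


def _tokens(text: str) -> set:
    return set(text.lower().split())


def recall(memory: List[Dict[str, str]], question: str, k: int = 2) -> List[Dict[str, str]]:
    """Stage 2 (recall): return the k stored entries most similar to the test question."""
    query = _tokens(question)

    def similarity(entry: Dict[str, str]) -> float:
        # Assumption: Jaccard word overlap stands in for a real retriever.
        other = _tokens(entry["question"])
        union = query | other
        return len(query & other) / len(union) if union else 0.0

    return sorted(memory, key=similarity, reverse=True)[:k]


def build_prompt(memory: List[Dict[str, str]], question: str, k: int = 2) -> str:
    """Prepend recalled thoughts as demonstrations before the test question."""
    parts = [
        f"Q: {e['question']}\nA: {e['thought']} The answer is {e['answer']}."
        for e in recall(memory, question, k)
    ]
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)
```

In use, `build_memory` would run once over the unlabeled questions, and `build_prompt` would be called per test question, with its output sent to the same LLM. No labels are read and no parameters are updated at any point, which matches the setting the abstract describes.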
Related papers
- $\text{Memory}^3$: Language Modeling with Explicit Memory [22.572376536612015]
We equip large language models (LLMs) with explicit memory, a memory format cheaper than model parameters and text retrieval-augmented generation (RAG).
As a preliminary proof of concept, we train from scratch a 2.4B LLM, which achieves better performance than much larger LLMs and RAG models.
We introduce a memory circuitry theory to support the externalization of knowledge, and present novel techniques including a memory sparsification mechanism that makes storage tractable.
arXiv Detail & Related papers (2024-07-01T11:07:23Z)
- SirLLM: Streaming Infinite Retentive LLM [74.40196814292426]
Large Language Models (LLMs) process inputs of any length and maintain a degree of memory.
Recent efforts have employed streaming inputs to alleviate the pressure of excessively long text inputs.
We introduce Streaming Infinite Retentive LLM (SirLLM), which allows LLMs to maintain longer memory during infinite-length dialogues.
arXiv Detail & Related papers (2024-05-21T06:37:03Z)
- MemLLM: Finetuning LLMs to Use An Explicit Read-Write Memory [49.96019697955383]
We introduce MemLLM, a novel method of enhancing knowledge capabilities by integrating a structured and explicit read-and-write memory module.
Our experiments indicate that MemLLM enhances performance and interpretability in language modeling.
We see MemLLM as an important step towards making LLMs more grounded and factual through memory augmentation.
arXiv Detail & Related papers (2024-04-17T18:13:16Z)
- Think-in-Memory: Recalling and Post-thinking Enable LLMs with Long-Term Memory [24.464945401037056]
We propose TiM (Think-in-Memory) that enables Large Language Models to maintain an evolved memory for storing historical thoughts.
We conduct qualitative and quantitative experiments on real-world and simulated dialogues covering a wide range of topics.
arXiv Detail & Related papers (2023-11-15T06:08:35Z)
- Rephrase and Respond: Let Large Language Models Ask Better Questions for Themselves [57.974103113675795]
We present a method named 'Rephrase and Respond' (RaR), which allows Large Language Models to rephrase and expand questions posed by humans.
RaR serves as a simple yet effective prompting method for improving performance.
We show that RaR is complementary to the popular Chain-of-Thought (CoT) methods, both theoretically and empirically.
arXiv Detail & Related papers (2023-11-07T18:43:34Z)
- Recursively Summarizing Enables Long-Term Dialogue Memory in Large Language Models [75.98775135321355]
Given a long conversation, large language models (LLMs) fail to recall past information and tend to generate inconsistent responses.
We propose to generate summaries/memory using LLMs to enhance long-term memory ability.
arXiv Detail & Related papers (2023-08-29T04:59:53Z)
- RET-LLM: Towards a General Read-Write Memory for Large Language Models [4.997673761305335]
RET-LLM is a novel framework that equips large language models with a general write-read memory unit.
Inspired by Davidsonian semantics theory, we extract and save knowledge in the form of triplets.
Our framework exhibits robust performance in handling temporal-based question answering tasks.
arXiv Detail & Related papers (2023-05-23T17:53:38Z)
- Neural Machine Translation with Monolingual Translation Memory [58.98657907678992]
We propose a new framework that uses monolingual memory and performs learnable memory retrieval in a cross-lingual manner.
Experiments show that the proposed method obtains substantial improvements.
arXiv Detail & Related papers (2021-05-24T13:35:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.