Lift Yourself Up: Retrieval-augmented Text Generation with Self Memory
- URL: http://arxiv.org/abs/2305.02437v3
- Date: Sat, 23 Dec 2023 11:11:01 GMT
- Title: Lift Yourself Up: Retrieval-augmented Text Generation with Self Memory
- Authors: Xin Cheng, Di Luo, Xiuying Chen, Lemao Liu, Dongyan Zhao, Rui Yan
- Abstract summary: We propose a novel framework, selfmem, for improving retrieval-augmented generation models.
Selfmem iteratively employs a retrieval-augmented generator to create an unbounded memory pool and a memory selector to choose one output as memory for the subsequent generation round.
We evaluate the effectiveness of selfmem on three distinct text generation tasks.
- Score: 72.36736686941671
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With direct access to human-written reference as memory, retrieval-augmented
generation has achieved much progress in a wide range of text generation tasks.
Since better memory typically prompts better generation (we define this as the
primal problem), the traditional approach for memory retrieval involves
selecting memory that exhibits the highest similarity to the input. However,
this method is constrained by the quality of the fixed corpus from which memory
is retrieved. In this paper, by exploring the duality of the primal problem:
better generation also prompts better memory, we propose a novel framework,
selfmem, which addresses this limitation by iteratively employing a
retrieval-augmented generator to create an unbounded memory pool and using a
memory selector to choose one output as memory for the subsequent generation
round. This enables the model to leverage its own output, referred to as
self-memory, for improved generation. We evaluate the effectiveness of selfmem
on three distinct text generation tasks: neural machine translation,
abstractive text summarization, and dialogue generation, under two generation
paradigms: fine-tuned small model and few-shot LLM. Our approach achieves
state-of-the-art results in four directions in JRC-Acquis, XSum (50.3 ROUGE-1),
and BigPatent (62.9 ROUGE-1), demonstrating the potential of self-memory in
enhancing retrieval-augmented generation models. Furthermore, we conduct
thorough analyses of each component in the selfmem framework to identify
bottlenecks and provide insights for future research.
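As a rough illustration of the loop described in the abstract, here is a minimal Python sketch. The `generate` and `select` callables are hypothetical stand-ins for the paper's retrieval-augmented generator and memory selector; this is not the authors' implementation.

```python
# Minimal sketch of the selfmem loop: a retrieval-augmented generator produces
# a pool of candidate outputs, and a memory selector promotes one candidate to
# serve as the memory ("self-memory") for the next generation round.

from typing import Callable, List


def selfmem_loop(
    source: str,
    retrieved_memory: str,
    generate: Callable[[str, str, int], List[str]],  # (source, memory, k) -> k candidates
    select: Callable[[str, List[str]], str],          # (source, candidates) -> chosen memory
    rounds: int = 3,
    num_candidates: int = 5,
) -> str:
    """Iteratively replace retrieved memory with the model's own output."""
    memory = retrieved_memory  # round 0: memory retrieved from a fixed corpus
    for _ in range(rounds):
        # The generator conditions on the source and the current memory.
        candidates = generate(source, memory, num_candidates)
        # The selector picks one candidate as the next round's self-memory.
        memory = select(source, candidates)
    # Final generation conditioned on the last self-memory.
    return generate(source, memory, 1)[0]
```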
Related papers
- $\text{Memory}^3$: Language Modeling with Explicit Memory [22.572376536612015]
We equip large language models (LLMs) with explicit memory, a memory format cheaper than model parameters and text retrieval-augmented generation (RAG).
As a preliminary proof of concept, we train from scratch a 2.4B LLM, which achieves better performance than much larger LLMs and RAG models.
We introduce a memory circuitry theory to support the externalization of knowledge, and present novel techniques including a memory sparsification mechanism that makes storage tractable.
arXiv Detail & Related papers (2024-07-01T11:07:23Z) - Re2G: Retrieve, Rerank, Generate [14.848179433828252]
We propose Re2G, which combines neural initial retrieval and reranking into a BART-based sequence-to-sequence generation (a minimal pipeline sketch appears after this list).
To train our system end-to-end, we introduce a novel variation of knowledge distillation to train the initial retrieval, reranker, and generation using only ground truth on the target sequence output.
We find large gains in four diverse tasks: zero-shot slot filling, question answering, fact-checking, and dialog, with relative gains of 9% to 34% over the previous state-of-the-art on the KILT leaderboard.
arXiv Detail & Related papers (2022-07-13T15:51:40Z) - Classification and Generation of real-world data with an Associative Memory Model [0.0]
We extend the capabilities of the basic Associative Memory Model by using a Multiple-Modality framework.
By storing both the images and labels as modalities, a single Memory can be used to retrieve and complete patterns.
arXiv Detail & Related papers (2022-07-11T12:51:27Z) - Training Language Models with Memory Augmentation [28.4608705738799]
We present a novel approach for training language models with memory augmentation.
Our approach uses a training objective that directly takes in-batch examples as accessible memory.
We demonstrate significant gains over previous memory-augmented approaches.
arXiv Detail & Related papers (2022-05-25T11:37:29Z) - LaMemo: Language Modeling with Look-Ahead Memory [50.6248714811912]
We propose Look-Ahead Memory (LaMemo) that enhances the recurrence memory by incrementally attending to the right-side tokens.
LaMemo embraces bi-directional attention and segment recurrence with an additional overhead only linearly proportional to the memory length.
Experiments on widely used language modeling benchmarks demonstrate its superiority over the baselines equipped with different types of memory.
arXiv Detail & Related papers (2022-04-15T06:11:25Z) - Memory-Based Semantic Parsing [79.48882899104997]
We present a memory-based model for context-dependent semantic parsing.
We learn a context memory controller that manages the memory by maintaining the cumulative meaning of sequential user utterances.
arXiv Detail & Related papers (2021-09-07T16:15:13Z) - Kanerva++: extending The Kanerva Machine with differentiable, locally block allocated latent memory [75.65949969000596]
Episodic and semantic memory are critical components of the human memory model.
We develop a new principled Bayesian memory allocation scheme that bridges the gap between episodic and semantic memory.
We demonstrate that this allocation scheme improves performance in memory conditional image generation.
arXiv Detail & Related papers (2021-02-20T18:40:40Z) - Memformer: A Memory-Augmented Transformer for Sequence Modeling [55.780849185884996]
We present Memformer, an efficient neural network for sequence modeling.
Our model achieves linear time complexity and constant memory space complexity when processing long sequences.
arXiv Detail & Related papers (2020-10-14T09:03:36Z) - Self-Attentive Associative Memory [69.40038844695917]
We propose to separate the storage of individual experiences (item memory) and their occurring relationships (relational memory).
We achieve competitive results with our proposed two-memory model in a diversity of machine learning tasks.
arXiv Detail & Related papers (2020-02-10T03:27:48Z)
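For illustration, here is a rough sketch of the retrieve-rerank-generate pattern summarized in the Re2G entry above. The `dense_retrieve`, `rerank`, and `seq2seq_generate` callables are hypothetical stand-ins; this is not the Re2G implementation and omits its end-to-end knowledge-distillation training.

```python
# Sketch of a retrieve-rerank-generate pipeline: dense retrieval proposes
# candidate passages, a reranker reorders them, and a seq2seq model generates
# the output from the query plus the retained passages.

from typing import Callable, List


def retrieve_rerank_generate(
    query: str,
    corpus: List[str],
    dense_retrieve: Callable[[str, List[str], int], List[str]],  # top-k by dense similarity
    rerank: Callable[[str, List[str]], List[str]],               # reorder, e.g. cross-encoder scores
    seq2seq_generate: Callable[[str], str],                      # BART-style generator
    k_retrieve: int = 20,
    k_keep: int = 5,
) -> str:
    # 1) Initial neural retrieval over the corpus.
    candidates = dense_retrieve(query, corpus, k_retrieve)
    # 2) Rerank the candidates and keep only the best few passages.
    passages = rerank(query, candidates)[:k_keep]
    # 3) Feed the query plus the retained passages to the generator.
    return seq2seq_generate(query + " " + " ".join(passages))
```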