An Efficient Memory-Augmented Transformer for Knowledge-Intensive NLP Tasks
- URL: http://arxiv.org/abs/2210.16773v1
- Date: Sun, 30 Oct 2022 08:34:49 GMT
- Title: An Efficient Memory-Augmented Transformer for Knowledge-Intensive NLP Tasks
- Authors: Yuxiang Wu, Yu Zhao, Baotian Hu, Pasquale Minervini, Pontus Stenetorp and Sebastian Riedel
- Abstract summary: Parametric and retrieval-augmented models have complementary strengths in terms of computational efficiency and predictive accuracy.
We propose the Efficient Memory-Augmented Transformer (EMAT), which encodes external knowledge into a key-value memory and exploits fast maximum inner product search for memory querying.
- Score: 40.81306982129298
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Access to external knowledge is essential for many natural language
processing tasks, such as question answering and dialogue. Existing methods
often rely on a parametric model that stores knowledge in its parameters, or
use a retrieval-augmented model that has access to an external knowledge
source. Parametric and retrieval-augmented models have complementary strengths
in terms of computational efficiency and predictive accuracy. To combine the
strengths of both approaches, we propose the Efficient Memory-Augmented
Transformer (EMAT) -- it encodes external knowledge into a key-value memory and
exploits fast maximum inner product search for memory querying. We also
introduce pre-training tasks that allow EMAT to encode informative key-value
representations, and to learn an implicit strategy to integrate multiple memory
slots into the transformer. Experiments on various knowledge-intensive tasks
such as question answering and dialogue datasets show that simply augmenting
parametric models (T5-base) using our method produces more accurate results
(e.g., 25.8 -> 44.3 EM on NQ) while retaining a high throughput (e.g., 1000
queries/s on NQ). Compared to retrieval-augmented models, EMAT runs
substantially faster across the board and produces more accurate results on WoW
and ELI5. Our code and datasets are available at https://github.com/uclnlp/EMAT.
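The abstract above describes querying a key-value memory with fast maximum inner product search (MIPS). Below is a minimal sketch of that lookup step, assuming a brute-force NumPy search; the `KeyValueMemory` class, array shapes, and dimensions are illustrative assumptions, not EMAT's actual implementation (see the paper and repository for that).

```python
# Minimal sketch of a key-value memory queried via maximum inner product search (MIPS).
# All names, shapes, and the use of exact (brute-force) search are illustrative
# assumptions; EMAT's real encoders, index, and T5 integration are in the paper/repo.
import numpy as np


class KeyValueMemory:
    def __init__(self, keys: np.ndarray, values: np.ndarray):
        # keys:   (num_slots, d_key)   encoded from external knowledge (e.g., questions)
        # values: (num_slots, d_value) encoded answers / knowledge representations
        assert keys.shape[0] == values.shape[0]
        self.keys = keys
        self.values = values

    def query(self, q: np.ndarray, k: int = 4) -> np.ndarray:
        # Exact MIPS: score every memory key against the query, keep the top-k.
        scores = self.keys @ q                 # (num_slots,)
        top = np.argpartition(-scores, k)[:k]  # indices of the k largest scores
        top = top[np.argsort(-scores[top])]    # sort those k by descending score
        return self.values[top]                # (k, d_value) memory slots for the model


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    memory = KeyValueMemory(keys=rng.normal(size=(10_000, 128)),
                            values=rng.normal(size=(10_000, 128)))
    query_vec = rng.normal(size=128)
    retrieved = memory.query(query_vec, k=4)
    print(retrieved.shape)  # (4, 128): slots the transformer would integrate
```

In practice an approximate MIPS index (e.g., FAISS) would replace the brute-force dot product so the memory can scale to millions of key-value pairs, and the retrieved value vectors would be integrated into the transformer via the pre-training tasks described above.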
Related papers
- MATTER: Memory-Augmented Transformer Using Heterogeneous Knowledge Sources [12.783393023641505]
We introduce an efficient memory-augmented transformer called MATTER.
MATTER retrieves and reads from both unstructured sources (paragraphs) and semi-structured sources (QA pairs) in the form of fixed-length neural memories.
We demonstrate that our model outperforms existing efficient retrieval-augmented models on popular QA benchmarks in terms of both accuracy and speed.
arXiv Detail & Related papers (2024-06-07T06:35:37Z)
- PARMESAN: Parameter-Free Memory Search and Transduction for Dense Prediction Tasks [5.5127111704068374]
This work addresses flexibility in deep learning by means of transductive reasoning.
We propose PARMESAN, a scalable method which leverages a memory module for solving dense prediction tasks.
Our method is compatible with commonly used architectures and canonically transfers to 1D, 2D, and 3D grid-based data.
arXiv Detail & Related papers (2024-03-18T12:55:40Z)
- In-context Autoencoder for Context Compression in a Large Language Model [70.7621953091318]
We propose the In-context Autoencoder (ICAE) to compress a long context into short compact memory slots.
ICAE is first pretrained using both autoencoding and language modeling objectives on massive text data.
arXiv Detail & Related papers (2023-07-13T17:59:21Z)
- MASTER: Multi-task Pre-trained Bottlenecked Masked Autoencoders are Better Dense Retrievers [140.0479479231558]
In this work, we aim to unify a variety of pre-training tasks into a multi-task pre-trained model, namely MASTER.
MASTER utilizes a shared-encoder multi-decoder architecture that can construct a representation bottleneck to compress the abundant semantic information across tasks into dense vectors.
arXiv Detail & Related papers (2022-12-15T13:57:07Z)
- Knowledge-in-Context: Towards Knowledgeable Semi-Parametric Language Models [58.42146641102329]
We develop a novel semi-parametric language model architecture, Knowledge-in-Context (KiC).
KiC empowers a parametric text-to-text language model with a knowledge-rich external memory.
As a knowledge-rich semi-parametric language model, KiC needs only a much smaller parametric component to achieve superior zero-shot performance on unseen tasks.
arXiv Detail & Related papers (2022-10-28T23:18:43Z)
- Re2G: Retrieve, Rerank, Generate [14.848179433828252]
We propose Re2G, which combines neural initial retrieval and reranking into a BART-based sequence-to-sequence generation.
To train our system end-to-end, we introduce a novel variation of knowledge distillation to train the initial retrieval, reranker, and generation using only ground truth on the target sequence output.
We find substantial gains in four diverse tasks: zero-shot slot filling, question answering, fact-checking, and dialog, with relative gains of 9% to 34% over the previous state-of-the-art on the KILT leaderboard.
arXiv Detail & Related papers (2022-07-13T15:51:40Z)
- Logical Reasoning for Task Oriented Dialogue Systems [57.440956636333325]
We propose a novel method to fine-tune transformer models such as RoBERTa and T5 to reason over a set of facts in a given dialogue context.
Our method includes a synthetic data generation mechanism which helps the model learn logical relations.
We show that the transformer based model can perform logical reasoning to answer questions when the dialogue context contains all the required information.
arXiv Detail & Related papers (2022-02-08T21:46:27Z)
- Mention Memory: incorporating textual knowledge into Transformers through entity mention attention [21.361822569279003]
We propose to integrate a semi-parametric representation of a large text corpus into a Transformer model as a source of factual knowledge.
The proposed model - TOME - is a Transformer that accesses the information through internal memory layers in which each entity mention in the input passage attends to the mention memory.
In experiments using a memory of 150 million Wikipedia mentions, TOME achieves strong performance on several open-domain knowledge-intensive tasks.
arXiv Detail & Related papers (2021-10-12T17:19:05Z)
- Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks [133.93803565077337]
Retrieval-augmented generation (RAG) models combine pre-trained parametric and non-parametric memory for language generation.
We show that RAG models generate more specific, diverse and factual language than a state-of-the-art parametric-only seq2seq baseline (a minimal retrieve-then-generate sketch follows this list).
arXiv Detail & Related papers (2020-05-22T21:34:34Z)
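The Re2G and RAG entries above both describe retrieve-then-generate pipelines: embed the query, pull the top passages from a non-parametric memory by inner product, then condition a pre-trained seq2seq model on the query plus the retrieved text. The sketch below illustrates that pattern under stated assumptions; the toy hashing "encoder", the passage list, and the choice of t5-small are illustrative only, whereas the real systems use learned dense retrievers (e.g., DPR), rerankers, and task-specific training.

```python
# Rough sketch of the retrieve-then-generate pattern (assumptions noted above);
# not the RAG or Re2G implementation.
import numpy as np
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM


def embed(text: str, dim: int = 256) -> np.ndarray:
    # Toy bag-of-words hashing embedding standing in for a learned dense encoder.
    vec = np.zeros(dim)
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0
    return vec / (np.linalg.norm(vec) + 1e-8)


def retrieve(query: str, passages: list[str], k: int = 2) -> list[str]:
    # Inner-product search over passage embeddings (the non-parametric memory).
    P = np.stack([embed(p) for p in passages])
    scores = P @ embed(query)
    return [passages[i] for i in np.argsort(-scores)[:k]]


if __name__ == "__main__":
    passages = [
        "The Eiffel Tower is located in Paris, France.",
        "Mount Everest is the highest mountain above sea level.",
        "The Great Wall of China is a series of fortifications in northern China.",
    ]
    query = "Where is the Eiffel Tower?"
    context = " ".join(retrieve(query, passages))

    # Condition a pre-trained seq2seq model on the query plus retrieved context.
    tok = AutoTokenizer.from_pretrained("t5-small")
    model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
    inputs = tok(f"question: {query} context: {context}", return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=32)
    print(tok.decode(output[0], skip_special_tokens=True))
```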