Memory and Knowledge Augmented Language Models for Inferring Salience in
Long-Form Stories
- URL: http://arxiv.org/abs/2109.03754v1
- Date: Wed, 8 Sep 2021 16:15:50 GMT
- Title: Memory and Knowledge Augmented Language Models for Inferring Salience in
Long-Form Stories
- Authors: David Wilmot, Frank Keller
- Abstract summary: This paper takes a recent unsupervised method for salience detection, derived from Barthes' Cardinal Functions and theories of surprise, and applies it to longer narrative forms.
We improve the standard transformer language model by incorporating an external knowledgebase and adding a memory mechanism.
Our evaluation against this data demonstrates that our salience detection model outperforms a language model without the knowledge base and memory augmentation.
- Score: 21.99104738567138
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Measuring event salience is essential in the understanding of stories. This
paper takes a recent unsupervised method for salience detection derived from
Barthes' Cardinal Functions and theories of surprise and applies it to longer
narrative forms. We improve the standard transformer language model by
incorporating an external knowledgebase (derived from Retrieval Augmented
Generation) and adding a memory mechanism to enhance performance on longer
works. We use a novel approach to derive salience annotation using
chapter-aligned summaries from the Shmoop corpus for classic literary works.
Our evaluation against this data demonstrates that our salience detection model
improves performance over and above a language model without the knowledge base
and memory mechanism, both of which are crucial to this improvement.
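As a rough illustration of the surprise-based salience idea described above, the sketch below scores a sentence by how much retrieved knowledge changes a language model's surprise. This is a minimal sketch, not the authors' implementation: GPT-2 stands in for their memory- and knowledge-augmented transformer, `surprise` and `salience` are hypothetical helper names, and retrieved knowledge is simply prepended as text.

```python
# Minimal sketch (not the authors' implementation): salience as the change in
# a language model's surprise when retrieved knowledge is available. GPT-2
# stands in for the paper's memory/knowledge-augmented transformer.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def surprise(text: str, context: str = "") -> float:
    """Mean per-token negative log-likelihood of `text` given `context`."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    if context:
        ctx_ids = tokenizer(context, return_tensors="pt").input_ids
        input_ids = torch.cat([ctx_ids, ids], dim=1)
        labels = input_ids.clone()
        labels[:, : ctx_ids.size(1)] = -100  # context tokens don't count toward the loss
    else:
        input_ids, labels = ids, ids.clone()
    with torch.no_grad():
        return model(input_ids, labels=labels).loss.item()

def salience(sentence: str, prior_story: str, knowledge: str) -> float:
    # Score a sentence by how much conditioning on retrieved knowledge
    # (here naively prepended as text) reduces the model's surprise at it.
    with_knowledge = surprise(sentence, knowledge + "\n" + prior_story)
    without_knowledge = surprise(sentence, prior_story)
    return without_knowledge - with_knowledge
```

In the paper itself, the knowledge base is queried via Retrieval Augmented Generation and a memory mechanism carries state across long works; prepending context here only mimics that conditioning.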
Related papers
- Quantifying Memory Utilization with Effective State-Size [73.52115209375343]
We develop a measure of 'memory utilization'.
This metric is tailored to the fundamental class of systems with input-invariant and input-varying linear operators.
arXiv Detail & Related papers (2025-04-28T08:12:30Z)
- Long Context Modeling with Ranked Memory-Augmented Retrieval [18.4248685578126]
We introduce a novel framework that dynamically ranks memory entries based on relevance.
Our model introduces a novel relevance scoring and a pointwise re-ranking model for key-value embeddings, inspired by learning-to-rank techniques in information retrieval.
arXiv Detail & Related papers (2025-03-19T00:24:01Z)
- Analysis of Plan-based Retrieval for Grounded Text Generation [78.89478272104739]
Hallucinations occur when a language model is given a generation task outside its parametric knowledge.
A common strategy to address this limitation is to infuse the language models with retrieval mechanisms.
We analyze how planning can be used to guide retrieval to further reduce the frequency of hallucinations.
arXiv Detail & Related papers (2024-08-20T02:19:35Z)
- Causal Estimation of Memorisation Profiles [58.20086589761273]
Understanding memorisation in language models has practical and societal implications.
Memorisation is the causal effect of training with an instance on the model's ability to predict that instance.
This paper proposes a new, principled, and efficient method to estimate memorisation based on the difference-in-differences design from econometrics.
arXiv Detail & Related papers (2024-06-06T17:59:09Z)
- Towards Retrieval-Augmented Architectures for Image Captioning [81.11529834508424]
This work presents a novel approach towards developing image captioning models that utilize an external kNN memory to improve the generation process.
Specifically, we propose two model variants that incorporate a knowledge retriever component that is based on visual similarities.
We experimentally validate our approach on COCO and nocaps datasets and demonstrate that incorporating an explicit external memory can significantly enhance the quality of captions.
arXiv Detail & Related papers (2024-05-21T18:02:07Z)
- BRENT: Bidirectional Retrieval Enhanced Norwegian Transformer [1.911678487931003]
Retrieval-based language models are increasingly employed in question-answering tasks.
We develop the first Norwegian retrieval-based model by adapting the REALM framework.
We show that this type of training improves the reader's performance on extractive question-answering.
arXiv Detail & Related papers (2023-04-19T13:40:47Z)
- Training Language Models with Memory Augmentation [28.4608705738799]
We present a novel training approach designed for training language models with memory augmentation.
Our approach uses a training objective that directly takes in-batch examples as accessible memory.
We demonstrate significant gains over previous memory-augmented approaches (a minimal sketch of such an in-batch-memory objective appears after this list).
arXiv Detail & Related papers (2022-05-25T11:37:29Z)
- LaMemo: Language Modeling with Look-Ahead Memory [50.6248714811912]
We propose Look-Ahead Memory (LaMemo) that enhances the recurrence memory by incrementally attending to the right-side tokens.
LaMemo embraces bi-directional attention and segment recurrence with an additional overhead only linearly proportional to the memory length.
Experiments on widely used language modeling benchmarks demonstrate its superiority over the baselines equipped with different types of memory.
arXiv Detail & Related papers (2022-04-15T06:11:25Z)
- Modeling Event Salience in Narratives via Barthes' Cardinal Functions [38.44885682996472]
Estimating event salience is useful for tasks such as story generation and text analysis in narratology and folkloristics.
To compute event salience without any annotations, we propose several unsupervised methods that require only a pre-trained language model.
We show that the proposed methods outperform baseline methods and find fine-tuning a language model on narrative texts is a key factor in improving the proposed methods.
arXiv Detail & Related papers (2020-11-03T15:28:07Z)
- Learning to Learn Variational Semantic Memory [132.39737669936125]
We introduce variational semantic memory into meta-learning to acquire long-term knowledge for few-shot learning.
The semantic memory is grown from scratch and gradually consolidated by absorbing information from tasks it experiences.
We formulate memory recall as the variational inference of a latent memory variable from addressed contents.
arXiv Detail & Related papers (2020-10-20T15:05:26Z)
- Semantic Role Labeling Guided Multi-turn Dialogue ReWriter [63.07073750355096]
We propose to use semantic role labeling (SRL) to highlight the core semantic information of who did what to whom.
Experiments show that this information significantly improves a RoBERTa-based model that already outperforms previous state-of-the-art systems.
arXiv Detail & Related papers (2020-10-03T19:50:04Z)
- Linguistic Features for Readability Assessment [0.0]
It is unknown whether augmenting deep learning models with linguistically motivated features would improve performance further.
We find that, given sufficient training data, augmenting deep learning models with linguistically motivated features does not improve state-of-the-art performance.
Our results provide preliminary evidence for the hypothesis that the state-of-the-art deep learning models represent linguistic features of the text related to readability.
arXiv Detail & Related papers (2020-05-30T22:14:46Z)
- REALM: Retrieval-Augmented Language Model Pre-Training [37.3178586179607]
We augment language model pre-training with a latent knowledge retriever, which allows the model to retrieve and attend over documents from a large corpus such as Wikipedia.
For the first time, we show how to pre-train such a knowledge retriever in an unsupervised manner.
We demonstrate the effectiveness of Retrieval-Augmented Language Model pre-training (REALM) by fine-tuning on the challenging task of Open-domain Question Answering (Open-QA).
arXiv Detail & Related papers (2020-02-10T18:40:59Z)
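As a companion to the REALM entry directly above, here is a hedged sketch of the retrieve-then-marginalise pattern it popularised. The function name and tensor shapes are illustrative assumptions, not the released API: real REALM trains the retriever and reader jointly with a masked language modelling objective, whereas this only shows how a reader's predictions can be weighted by retrieval probabilities.

```python
# Hedged sketch of REALM-style retrieve-then-marginalise (names and shapes
# are illustrative). The reader's prediction is marginalised over the top-k
# retrieved documents, each weighted by its retrieval probability.
import torch

def retrieve_and_marginalise(query_emb, doc_embs, reader_logprobs, k=5):
    """
    query_emb:       (d,)    embedding of the masked query
    doc_embs:        (N, d)  embeddings of the corpus documents
    reader_logprobs: (N, V)  log P(answer | query, doc_i) for each document
    Returns log P(answer | query), marginalised over the top-k documents.
    """
    scores = doc_embs @ query_emb                      # dense relevance scores, (N,)
    topk = torch.topk(scores, k)
    log_prior = torch.log_softmax(topk.values, dim=0)  # log P(doc | query) over top-k
    # log sum_i P(doc_i | q) * P(answer | q, doc_i), computed stably in log space
    return torch.logsumexp(log_prior.unsqueeze(1) + reader_logprobs[topk.indices], dim=0)
```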
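And for the "Training Language Models with Memory Augmentation" entry, a minimal sketch of what an objective that treats in-batch examples as accessible memory can look like. This is an assumption-laden illustration, not the paper's implementation: `in_batch_memory_loss` is a hypothetical name, and defining the positive memories as in-batch positions sharing the gold next token is one plausible reading of the approach.

```python
# Hedged sketch: a contrastive LM objective whose "memory" is the rest of the
# batch. The next-token probability mixes the usual output-embedding score
# with similarity to in-batch hidden states that predict the same gold token.
import torch

def in_batch_memory_loss(hidden, targets, output_emb):
    """
    hidden:     (B, d)  hidden states, each predicting a next token
    targets:    (B,)    gold next-token ids
    output_emb: (V, d)  output embedding matrix
    """
    vocab_logits = hidden @ output_emb.T              # (B, V) standard LM logits
    mem_logits = hidden @ hidden.T                    # (B, B) similarity to in-batch memories
    mem_logits.fill_diagonal_(float("-inf"))          # a position is not its own memory
    # A memory counts as positive iff it shares the anchor's gold target token.
    same_target = targets.unsqueeze(0) == targets.unsqueeze(1)
    pos_vocab = vocab_logits.gather(1, targets.unsqueeze(1)).squeeze(1)  # (B,)
    pos_mem = torch.where(same_target, mem_logits,
                          torch.full_like(mem_logits, float("-inf")))
    numerator = torch.logsumexp(
        torch.cat([pos_vocab.unsqueeze(1), pos_mem], dim=1), dim=1)
    denominator = torch.logsumexp(
        torch.cat([vocab_logits, mem_logits], dim=1), dim=1)
    return -(numerator - denominator).mean()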
This list is automatically generated from the titles and abstracts of the papers on this site.