A Memory Model for Question Answering from Streaming Data Supported by
Rehearsal and Anticipation of Coreference Information
- URL: http://arxiv.org/abs/2305.07565v1
- Date: Fri, 12 May 2023 15:46:36 GMT
- Title: A Memory Model for Question Answering from Streaming Data Supported by
Rehearsal and Anticipation of Coreference Information
- Authors: Vladimir Araujo, Alvaro Soto, Marie-Francine Moens
- Abstract summary: We propose a memory model that performs rehearsal and anticipation while processing inputs to important information for solving question answering tasks from streaming data.
We validate our model on a short-sequence (bAbI) dataset as well as large-sequence textual (NarrativeQA) and video (ActivityNet-QA) question answering datasets.
- Score: 19.559853775982386
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing question answering methods often assume that the input content
(e.g., documents or videos) is always accessible to solve the task.
Alternatively, memory networks were introduced to mimic the human process of
incremental comprehension and compression of the information in a
fixed-capacity memory. However, these models only learn how to maintain memory
by backpropagating errors in the answers through the entire network. Instead,
it has been suggested that humans have effective mechanisms to boost their
memorization capacities, such as rehearsal and anticipation. Drawing
inspiration from these, we propose a memory model that performs rehearsal and
anticipation while processing inputs to memorize important information for
solving question answering tasks from streaming data. The proposed mechanisms
are applied self-supervised during training through masked modeling tasks
focused on coreference information. We validate our model on a short-sequence
(bAbI) dataset as well as large-sequence textual (NarrativeQA) and video
(ActivityNet-QA) question answering datasets, where it achieves substantial
improvements over previous memory network approaches. Furthermore, our ablation
study confirms the proposed mechanisms' importance for memory models.
Related papers
- Generalization v.s. Memorization: Tracing Language Models' Capabilities Back to Pretraining Data [76.90128359866462]
Large language models (LLMs) have sparked debate over whether they genuinely generalize to unseen tasks or rely on memorizing vast amounts of pretraining data.
We introduce an extended concept of memorization, distributional memorization, which measures the correlation between the LLM output probabilities and the pretraining data frequency.
This study demonstrates that memorization plays a larger role in simpler, knowledge-intensive tasks, while generalization is the key for harder, reasoning-based tasks.
arXiv Detail & Related papers (2024-07-20T21:24:40Z) - A Framework for Inference Inspired by Human Memory Mechanisms [9.408704431898279]
We propose a PMI framework that consists of perception, memory and inference components.
The memory module comprises working and long-term memory, with the latter endowed with a higher-order structure to retain extensive and complex relational knowledge and experience.
We apply our PMI to improve prevailing Transformers and CNN models on question-answering tasks like bAbI-20k and Sort-of-CLEVR datasets.
arXiv Detail & Related papers (2023-10-01T08:12:55Z) - Think Before You Act: Decision Transformers with Working Memory [44.18926449252084]
Decision Transformer-based decision-making agents have shown the ability to generalize across multiple tasks.
We argue that this inefficiency stems from the forgetting phenomenon, in which a model memorizes its behaviors in parameters throughout training.
We propose a working memory module to store, blend, and retrieve information for different downstream tasks.
arXiv Detail & Related papers (2023-05-24T01:20:22Z) - On the Relationship Between Variational Inference and Auto-Associative
Memory [68.8204255655161]
We study how different neural network approaches to variational inference can be applied in this framework.
We evaluate the obtained algorithms on the CIFAR10 and CLEVR image datasets and compare them with other associative memory models.
arXiv Detail & Related papers (2022-10-14T14:18:47Z) - A Memory Transformer Network for Incremental Learning [64.0410375349852]
We study class-incremental learning, a training setup in which new classes of data are observed over time for the model to learn from.
Despite the straightforward problem formulation, the naive application of classification models to class-incremental learning results in the "catastrophic forgetting" of previously seen classes.
One of the most successful existing methods has been the use of a memory of exemplars, which overcomes the issue of catastrophic forgetting by saving a subset of past data into a memory bank and utilizing it to prevent forgetting when training future tasks.
arXiv Detail & Related papers (2022-10-10T08:27:28Z) - Learning to Rehearse in Long Sequence Memorization [107.14601197043308]
Existing reasoning tasks often have an important assumption that the input contents can be always accessed while reasoning.
Memory augmented neural networks introduce a human-like write-read memory to compress and memorize the long input sequence in one pass.
But they have two serious drawbacks: 1) they continually update the memory from current information and inevitably forget the early contents; 2) they do not distinguish what information is important and treat all contents equally.
We propose the Rehearsal Memory to enhance long-sequence memorization by self-supervised rehearsal with a history sampler.
arXiv Detail & Related papers (2021-06-02T11:58:30Z) - Distributed Associative Memory Network with Memory Refreshing Loss [5.5792083698526405]
We introduce a novel Distributed Associative Memory architecture (DAM) with Memory Refreshing Loss (MRL)
Inspired by how the human brain works, our framework encodes data with distributed representation across multiple memory blocks.
MRL enables MANN to reinforce an association between input data and task objective by reproducing input data from stored memory contents.
arXiv Detail & Related papers (2020-07-21T07:34:33Z) - Automatic Recall Machines: Internal Replay, Continual Learning and the
Brain [104.38824285741248]
Replay in neural networks involves training on sequential data with memorized samples, which counteracts forgetting of previous behavior caused by non-stationarity.
We present a method where these auxiliary samples are generated on the fly, given only the model that is being trained for the assessed objective.
Instead the implicit memory of learned samples within the assessed model itself is exploited.
arXiv Detail & Related papers (2020-06-22T15:07:06Z) - Visual Question Answering with Prior Class Semantics [50.845003775809836]
We show how to exploit additional information pertaining to the semantics of candidate answers.
We extend the answer prediction process with a regression objective in a semantic space.
Our method brings improvements in consistency and accuracy over a range of question types.
arXiv Detail & Related papers (2020-05-04T02:46:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.