Memory in humans and deep language models: Linking hypotheses for model augmentation
- URL: http://arxiv.org/abs/2210.01869v1
- Date: Tue, 4 Oct 2022 19:35:11 GMT
- Title: Memory in humans and deep language models: Linking hypotheses for model augmentation
- Authors: Omri Raccah, Phoebe Chen, Ted L. Willke, David Poeppel, and Vy A. Vo
- Abstract summary: We argue that memory-augmented Transformers can benefit substantially from considering insights from the memory literature in humans.
We detail an approach to integrating evidence from the human memory system through the specification of cross-domain linking hypotheses.
- Score: 1.0485739694839669
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The computational complexity of the self-attention mechanism in Transformer
models significantly limits their ability to generalize over long temporal
durations. Memory-augmentation, or the explicit storing of past information in
external memory for subsequent predictions, has become a constructive avenue
for mitigating this limitation. We argue that memory-augmented Transformers can
benefit substantially from considering insights from the memory literature in
humans. We detail an approach to integrating evidence from the human memory
system through the specification of cross-domain linking hypotheses. We then
provide an empirical demonstration to evaluate the use of surprisal as a
linking hypothesis, and further identify the limitations of this approach to
inform future research.
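As a rough illustration of how surprisal might serve as a linking hypothesis for memory selection, the sketch below scores each token by its negative log-probability under a causal language model and keeps the most surprising positions as candidates for external memory. The function names, the top-k selection rule, and the toy inputs are assumptions for illustration, not the procedure evaluated in the paper.

```python
import torch

def surprisal_from_logits(logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    """Per-token surprisal, -log p(x_t | context), from a causal LM's output logits.

    logits:  (seq_len, vocab_size) scores for the token at each position given its context
    targets: (seq_len,) the tokens that actually occurred
    """
    log_probs = torch.log_softmax(logits, dim=-1)
    return -log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)

def select_memory_tokens(surprisal: torch.Tensor, k: int) -> torch.Tensor:
    """Hypothetical selection rule: keep the k most surprising positions
    as candidates for the external memory store."""
    return torch.topk(surprisal, k=min(k, surprisal.numel())).indices

# Toy usage with random logits standing in for a real language model.
vocab, seq_len = 100, 12
logits = torch.randn(seq_len, vocab)
targets = torch.randint(vocab, (seq_len,))
scores = surprisal_from_logits(logits, targets)
memory_candidates = select_memory_tokens(scores, k=3)
print(scores, memory_candidates)
```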
Related papers
- AHMF: Adaptive Hybrid-Memory-Fusion Model for Driver Attention Prediction [14.609639142688035]
This paper proposes an Adaptive Hybrid-Memory-Fusion (AHMF) driver attention prediction model to achieve more human-like predictions.
The model first encodes information about specific hazardous stimuli in the current scene to form working memories. Then, it adaptively retrieves similar situational experiences from the long-term memory for final prediction.
arXiv Detail & Related papers (2024-07-24T17:19:58Z)
- Spatially-Aware Transformer for Embodied Agents [20.498778205143477]
This paper explores the use of Spatially-Aware Transformer models that incorporate spatial information.
We demonstrate that memory utilization efficiency can be improved, leading to enhanced accuracy in various place-centric downstream tasks.
We also propose the Adaptive Memory Allocator, a memory management method based on reinforcement learning.
arXiv Detail & Related papers (2024-02-23T07:46:30Z)
- A Framework for Inference Inspired by Human Memory Mechanisms [9.408704431898279]
We propose a PMI framework that consists of perception, memory and inference components.
The memory module comprises working and long-term memory, with the latter endowed with a higher-order structure to retain extensive and complex relational knowledge and experience.
We apply our PMI to improve prevailing Transformers and CNN models on question-answering tasks such as the bAbI-20k and Sort-of-CLEVR datasets.
arXiv Detail & Related papers (2023-10-01T08:12:55Z)
- Memory-and-Anticipation Transformer for Online Action Understanding [52.24561192781971]
We propose a novel memory-anticipation-based paradigm to model an entire temporal structure, including the past, present, and future.
We present Memory-and-Anticipation Transformer (MAT), a memory-anticipation-based approach, to address the online action detection and anticipation tasks.
arXiv Detail & Related papers (2023-08-15T17:34:54Z)
- A Memory Model for Question Answering from Streaming Data Supported by Rehearsal and Anticipation of Coreference Information [19.559853775982386]
We propose a memory model that performs rehearsal and anticipation while processing inputs to memorize important information for solving question answering tasks from streaming data.
We validate our model on a short-sequence (bAbI) dataset as well as large-sequence textual (NarrativeQA) and video (ActivityNet-QA) question answering datasets.
arXiv Detail & Related papers (2023-05-12T15:46:36Z)
- On the Relationship Between Variational Inference and Auto-Associative Memory [68.8204255655161]
We study how different neural network approaches to variational inference can be applied in this framework.
We evaluate the obtained algorithms on the CIFAR10 and CLEVR image datasets and compare them with other associative memory models.
arXiv Detail & Related papers (2022-10-14T14:18:47Z)
- LaMemo: Language Modeling with Look-Ahead Memory [50.6248714811912]
We propose Look-Ahead Memory (LaMemo) that enhances the recurrence memory by incrementally attending to the right-side tokens.
LaMemo embraces bi-directional attention and segment recurrence with an additional overhead only linearly proportional to the memory length.
Experiments on widely used language modeling benchmarks demonstrate its superiority over the baselines equipped with different types of memory (a conceptual sketch of the look-ahead update follows this entry).
arXiv Detail & Related papers (2022-04-15T06:11:25Z)
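The look-ahead idea can be caricatured as letting cached memory states re-attend to tokens that arrive on their right and blending the result with the old cache. The sketch below is a conceptual illustration under that reading, not LaMemo's actual formulation: the single-head attention, the fixed blending gate, and the tensor shapes are assumptions.

```python
import torch

def lookahead_memory_update(memory: torch.Tensor,
                            segment: torch.Tensor,
                            gate: float = 0.5) -> torch.Tensor:
    """Refresh cached memory states by attending to the current (right-side) segment.

    memory:  (n_mem, d) hidden states cached from earlier segments
    segment: (n_seg, d) hidden states of the newly processed segment
    Cost is O(n_mem * n_seg) per segment, i.e. linear in the memory length.
    """
    d = memory.size(-1)
    attn = torch.softmax(memory @ segment.T / d ** 0.5, dim=-1)  # (n_mem, n_seg)
    refreshed = attn @ segment                                    # (n_mem, d)
    # Blend the stale cache with its look-ahead refresh (a fixed gate,
    # purely for illustration).
    return (1.0 - gate) * memory + gate * refreshed

memory = torch.randn(4, 16)
segment = torch.randn(8, 16)
new_memory = lookahead_memory_update(memory, segment)
print(new_memory.shape)  # torch.Size([4, 16])
```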
- Kanerva++: extending The Kanerva Machine with differentiable, locally block allocated latent memory [75.65949969000596]
Episodic and semantic memory are critical components of the human memory model.
We develop a new principled Bayesian memory allocation scheme that bridges the gap between episodic and semantic memory.
We demonstrate that this allocation scheme improves performance in memory conditional image generation.
arXiv Detail & Related papers (2021-02-20T18:40:40Z)
- Learning to Learn Variational Semantic Memory [132.39737669936125]
We introduce variational semantic memory into meta-learning to acquire long-term knowledge for few-shot learning.
The semantic memory is grown from scratch and gradually consolidated by absorbing information from tasks it experiences.
We formulate memory recall as the variational inference of a latent memory variable from addressed contents.
arXiv Detail & Related papers (2020-10-20T15:05:26Z)
- Memformer: A Memory-Augmented Transformer for Sequence Modeling [55.780849185884996]
We present Memformer, an efficient neural network for sequence modeling.
Our model achieves linear time complexity and constant memory space complexity when processing long sequences; a minimal sketch of this fixed-slot read/write pattern follows this entry.
arXiv Detail & Related papers (2020-10-14T09:03:36Z)
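Many of the papers above share the pattern the Memformer summary describes: a fixed number of external memory slots that each segment reads from and writes to, so per-segment cost is independent of how much has been processed. The sketch below is a minimal, generic illustration of that read/write pattern under assumed shapes and a simple update rule; it is not any specific paper's architecture.

```python
import torch

class SlotMemory:
    """Fixed-size external memory: tokens read by attending over the slots,
    slots write by attending over the current segment. Per-segment cost is
    O(seg_len * n_slots), so total time grows linearly with sequence length
    while memory stays constant."""

    def __init__(self, n_slots: int, d: int):
        self.slots = torch.zeros(n_slots, d)
        self.d = d

    def read(self, segment: torch.Tensor) -> torch.Tensor:
        # segment: (seg_len, d); each token gathers information from memory.
        attn = torch.softmax(segment @ self.slots.T / self.d ** 0.5, dim=-1)
        return attn @ self.slots                      # (seg_len, d)

    def write(self, segment: torch.Tensor) -> None:
        # Each slot summarizes the segment and is updated in place
        # (equal-weight blending chosen only for illustration).
        attn = torch.softmax(self.slots @ segment.T / self.d ** 0.5, dim=-1)
        self.slots = 0.5 * self.slots + 0.5 * (attn @ segment)

memory = SlotMemory(n_slots=8, d=16)
for _ in range(4):                                    # a stream of segments
    segment = torch.randn(32, 16)
    context = memory.read(segment)                    # used alongside self-attention
    memory.write(segment)
print(memory.slots.shape)                             # torch.Size([8, 16])
```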
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.