Memory in humans and deep language models: Linking hypotheses for model
augmentation
- URL: http://arxiv.org/abs/2210.01869v1
- Date: Tue, 4 Oct 2022 19:35:11 GMT
- Title: Memory in humans and deep language models: Linking hypotheses for model
augmentation
- Authors: Omri Raccah, Phoebe Chen, Ted L. Willke, David Poeppel, and Vy A. Vo
- Abstract summary: We argue that memory-augmented Transformers can benefit substantially from considering insights from the memory literature in humans.
We detail an approach to integrating evidence from the human memory system through the specification of cross-domain linking hypotheses.
- Score: 1.0485739694839669
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The computational complexity of the self-attention mechanism in Transformer
models significantly limits their ability to generalize over long temporal
durations. Memory-augmentation, or the explicit storing of past information in
external memory for subsequent predictions, has become a constructive avenue
for mitigating this limitation. We argue that memory-augmented Transformers can
benefit substantially from considering insights from the memory literature in
humans. We detail an approach to integrating evidence from the human memory
system through the specification of cross-domain linking hypotheses. We then
provide an empirical demonstration to evaluate the use of surprisal as a
linking hypothesis, and further identify the limitations of this approach to
inform future research.
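As a concrete illustration of the empirical demonstration described in the abstract, the sketch below computes per-token surprisal (negative log probability under a causal language model) and uses a simple threshold to decide which tokens are written to an external memory store. This is a minimal sketch under stated assumptions: the model choice (gpt2), the threshold value, and the write rule are illustrative, not the authors' implementation.

```python
# Hedged sketch: per-token surprisal from a causal LM, used to gate writes
# to an external memory. Model, threshold, and write rule are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

text = "The hippocampus supports the formation of long-term episodic memories."
ids = tokenizer(text, return_tensors="pt").input_ids

with torch.no_grad():
    logits = model(ids).logits  # shape: (1, seq_len, vocab_size)

# Surprisal of token t given its left context: -log p(x_t | x_<t).
log_probs = torch.log_softmax(logits[:, :-1], dim=-1)
surprisal = -log_probs.gather(-1, ids[:, 1:].unsqueeze(-1)).squeeze(-1)

# Toy linking hypothesis: only highly surprising tokens enter external memory.
THRESHOLD = 6.0  # nats; arbitrary illustrative value
memory = [
    (tokenizer.decode([int(tok)]), float(s))
    for tok, s in zip(ids[0, 1:], surprisal[0])
    if float(s) > THRESHOLD
]
print(memory)
```

In practice, a memory-augmented Transformer would store hidden states or key-value pairs rather than raw tokens; the thresholding here only illustrates how a surprisal-based linking hypothesis could gate what enters external memory.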
Related papers
- Rethinking Memory Mechanisms of Foundation Agents in the Second Half: A Survey [211.01908189012184]
Memory, with hundreds of papers released this year, emerges as the critical solution to fill the utility gap. We provide a unified view of foundation agent memory along three dimensions. We then analyze how memory is instantiated and operated under different agent topologies.
arXiv Detail & Related papers (2026-01-14T07:38:38Z) - Forgetting as a Feature: Cognitive Alignment of Large Language Models [39.146761527401424]
We show that Large Language Models (LLMs) exhibit systematic forgetting of past information. Drawing inspiration from human memory dynamics, we model LLM inference as a probabilistic memory process governed by exponential decay (a minimal decay sketch appears after this list). Building on these observations, we propose probabilistic memory prompting, a lightweight strategy that shapes evidence integration to mimic human-like memory decay.
arXiv Detail & Related papers (2025-12-28T10:43:00Z) - Beyond Heuristics: A Decision-Theoretic Framework for Agent Memory Management [49.71055327567513]
We argue that memory management should be viewed as a sequential decision-making problem under uncertainty. Our contribution is not a new algorithm, but a principled reframing that clarifies the limitations of approaches.
arXiv Detail & Related papers (2025-12-25T08:23:03Z) - Memory in the Age of AI Agents [217.9368190980982]
This work aims to provide an up-to-date landscape of current agent memory research. We identify three dominant realizations of agent memory, namely token-level, parametric, and latent memory. To support practical development, we compile a comprehensive summary of memory benchmarks and open-source frameworks.
arXiv Detail & Related papers (2025-12-15T17:22:34Z) - On Memory: A comparison of memory mechanisms in world models [0.0]
We investigate the effective memory span of transformer-based world models through an analysis of several memory augmentation mechanisms. We introduce a taxonomy that distinguishes between memory encoding and memory injection mechanisms, motivating their roles in extending the world model's memory. Our findings show that memory mechanisms improve the effective memory span in vision transformers and provide a path to completing loop closures within a world model's imagination.
arXiv Detail & Related papers (2025-12-07T20:29:20Z) - MemGen: Weaving Generative Latent Memory for Self-Evolving Agents [57.1835920227202]
We propose MemGen, a dynamic generative memory framework that equips agents with a human-esque cognitive faculty. MemGen enables agents to recall and augment latent memory throughout reasoning, producing a tightly interwoven cycle of memory and cognition.
arXiv Detail & Related papers (2025-09-29T12:33:13Z) - Pre-Storage Reasoning for Episodic Memory: Shifting Inference Burden to Memory for Personalized Dialogue [13.558061425427688]
PREMem is a novel approach that shifts complex reasoning processes from inference to memory construction. It creates enriched representations while reducing computational demands during interactions. Experiments show significant performance improvements across all model sizes.
arXiv Detail & Related papers (2025-09-13T15:18:08Z) - Predictive Attractor Models [9.947717243638289]
We propose Predictive Attractor Models (PAM), a novel sequence memory architecture with desirable generative properties.
PAM avoids catastrophic forgetting by uniquely representing past context through lateral inhibition in cortical minicolumns.
We show that PAM is trained with local computations through Hebbian plasticity rules in a biologically plausible framework.
arXiv Detail & Related papers (2024-10-03T12:25:01Z) - AHMF: Adaptive Hybrid-Memory-Fusion Model for Driver Attention Prediction [14.609639142688035]
This paper proposes an Adaptive Hybrid-Memory-Fusion (AHMF) driver attention prediction model to achieve more human-like predictions.
The model first encodes information about specific hazardous stimuli in the current scene to form working memories. Then, it adaptively retrieves similar situational experiences from the long-term memory for final prediction.
arXiv Detail & Related papers (2024-07-24T17:19:58Z) - Spatially-Aware Transformer for Embodied Agents [20.498778205143477]
This paper explores the use of Spatially-Aware Transformer models that incorporate spatial information.
We demonstrate that memory utilization efficiency can be improved, leading to enhanced accuracy in various place-centric downstream tasks.
We also propose the Adaptive Memory Allocator, a memory management method based on reinforcement learning.
arXiv Detail & Related papers (2024-02-23T07:46:30Z) - Memory-and-Anticipation Transformer for Online Action Understanding [52.24561192781971]
We propose a novel memory-anticipation-based paradigm to model an entire temporal structure, including the past, present, and future.
We present Memory-and-Anticipation Transformer (MAT), a memory-anticipation-based approach, to address the online action detection and anticipation tasks.
arXiv Detail & Related papers (2023-08-15T17:34:54Z) - A Memory Model for Question Answering from Streaming Data Supported by
Rehearsal and Anticipation of Coreference Information [19.559853775982386]
We propose a memory model that performs rehearsal and anticipation while processing inputs to memorize important information for solving question answering tasks from streaming data.
We validate our model on a short-sequence (bAbI) dataset as well as large-sequence textual (NarrativeQA) and video (ActivityNet-QA) question answering datasets.
arXiv Detail & Related papers (2023-05-12T15:46:36Z) - On the Relationship Between Variational Inference and Auto-Associative
Memory [68.8204255655161]
We study how different neural network approaches to variational inference can be applied in this framework.
We evaluate the obtained algorithms on the CIFAR10 and CLEVR image datasets and compare them with other associative memory models.
arXiv Detail & Related papers (2022-10-14T14:18:47Z) - LaMemo: Language Modeling with Look-Ahead Memory [50.6248714811912]
We propose Look-Ahead Memory (LaMemo) that enhances the recurrence memory by incrementally attending to the right-side tokens.
LaMemo embraces bi-directional attention and segment recurrence with an additional overhead only linearly proportional to the memory length.
Experiments on widely used language modeling benchmarks demonstrate its superiority over the baselines equipped with different types of memory.
arXiv Detail & Related papers (2022-04-15T06:11:25Z) - Kanerva++: extending The Kanerva Machine with differentiable, locally
block allocated latent memory [75.65949969000596]
Episodic and semantic memory are critical components of the human memory model.
We develop a new principled Bayesian memory allocation scheme that bridges the gap between episodic and semantic memory.
We demonstrate that this allocation scheme improves performance in memory conditional image generation.
arXiv Detail & Related papers (2021-02-20T18:40:40Z) - Learning to Learn Variational Semantic Memory [132.39737669936125]
We introduce variational semantic memory into meta-learning to acquire long-term knowledge for few-shot learning.
The semantic memory is grown from scratch and gradually consolidated by absorbing information from tasks it experiences.
We formulate memory recall as the variational inference of a latent memory variable from addressed contents.
arXiv Detail & Related papers (2020-10-20T15:05:26Z) - Memformer: A Memory-Augmented Transformer for Sequence Modeling [55.780849185884996]
We present Memformer, an efficient neural network for sequence modeling.
Our model achieves linear time complexity and constant memory space complexity when processing long sequences.
arXiv Detail & Related papers (2020-10-14T09:03:36Z)
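The exponential-decay account of forgetting referenced in the "Forgetting as a Feature" entry above can be made concrete with a small sketch: each past dialogue turn is weighted by exp(-rate x lag) and the weights are normalized, so older evidence contributes less. The decay rate, the per-turn granularity, and the normalization are assumptions for illustration, not that paper's probabilistic memory prompting procedure.

```python
# Minimal sketch of exponential memory decay over past dialogue turns.
# The decay rate and weighting scheme are illustrative assumptions.
import math

def decay_weights(num_turns: int, rate: float = 0.5) -> list[float]:
    """Return a normalized weight per past turn; lag 0 is the most recent."""
    raw = [math.exp(-rate * lag) for lag in range(num_turns)]
    total = sum(raw)
    return [w / total for w in raw]

# Example: five past turns, most recent first.
turns = ["turn t", "turn t-1", "turn t-2", "turn t-3", "turn t-4"]
for turn, weight in zip(turns, decay_weights(len(turns))):
    print(f"{turn}: weight {weight:.3f}")
```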
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.