Related papers: Aspects of human memory and Large Language Models

MemOS: A Memory OS for AI System [116.87568350346537]
Large Language Models (LLMs) have become an essential infrastructure for Artificial General Intelligence (AGI). Existing models mainly rely on static parameters and short-lived contextual states, limiting their ability to track user preferences or update knowledge over extended periods. MemOS is a memory operating system that treats memory as a manageable system resource.
arXiv Detail & Related papers (2025-07-04T17:21:46Z)
Sequence-to-Sequence Models with Attention Mechanistically Map to the Architecture of Human Memory Search [13.961239165301315]
We show that foundational architectures in neural machine translation exhibit mechanisms that directly correspond to those specified in the Context Maintenance and Retrieval model of human memory. We implement a neural machine translation model as a cognitive model of human memory search that is both interpretable and capable of capturing complex dynamics of learning.
arXiv Detail & Related papers (2025-06-20T18:43:15Z)
Improve Language Model and Brain Alignment via Associative Memory [24.566858101771842]
Associative memory engages in the integration of relevant information for comprehension in the human cognition system. In this work, we seek to improve alignment between language models and the human brain while processing speech information by integrating associative memory.
arXiv Detail & Related papers (2025-05-20T02:39:09Z)
Quantifying Memory Utilization with Effective State-Size [73.52115209375343]
We develop a measure of memory utilization. This metric is tailored to the fundamental class of systems with input-invariant and input-varying linear operators.
arXiv Detail & Related papers (2025-04-28T08:12:30Z)
Building, Reusing, and Generalizing Abstract Representations from Concrete Sequences [51.965994405124455]
Humans excel at learning abstract patterns across different sequences, filtering out irrelevant details. Many sequence learning models lack the ability to abstract, which leads to memory inefficiency and poor transfer. We introduce a non-parametric hierarchical variable learning model (HVM) that learns chunks from sequences and abstracts contextually similar chunks as variables.
arXiv Detail & Related papers (2024-10-27T18:13:07Z)
Brain-Like Language Processing via a Shallow Untrained Multihead Attention Network [16.317199232071232]
Large Language Models (LLMs) have been shown to be effective models of the human language system. In this work, we investigate the key architectural components driving the surprising alignment of untrained models.
arXiv Detail & Related papers (2024-06-21T12:54:03Z)
HMT: Hierarchical Memory Transformer for Efficient Long Context Language Processing [33.720656946186885]
Hierarchical Memory Transformer (HMT) is a novel framework that facilitates a model's long-context processing ability. HMT consistently improves the long-context processing ability of existing models.
arXiv Detail & Related papers (2024-05-09T19:32:49Z)
Empowering Working Memory for Large Language Model Agents [9.83467478231344]
This paper explores the potential of applying cognitive psychology's working memory frameworks to large language models (LLMs). An innovative model is proposed incorporating a centralized Working Memory Hub and Episodic Buffer access to retain memories across episodes. This architecture aims to provide greater continuity for nuanced contextual reasoning during intricate tasks and collaborative scenarios.
arXiv Detail & Related papers (2023-12-22T05:59:00Z)
Quantifying and Analyzing Entity-level Memorization in Large Language Models [4.59914731734176]
Large language models (LLMs) have been proven capable of memorizing their training data. Privacy risks arising from memorization have attracted increasing attention. We propose a fine-grained, entity-level definition to quantify memorization with conditions and metrics closer to real-world scenarios.
arXiv Detail & Related papers (2023-08-30T03:06:47Z)
RET-LLM: Towards a General Read-Write Memory for Large Language Models [53.288356721954514]
RET-LLM is a novel framework that equips large language models with a general write-read memory unit. Inspired by Davidsonian semantics theory, we extract and save knowledge in the form of triplets. Our framework exhibits robust performance in handling temporal-based question answering tasks.
arXiv Detail & Related papers (2023-05-23T17:53:38Z)
Extending Memory for Language Modelling [0.0]
We introduce the Long Term Memory network (LTM) to learn from infinitely long sequences. LTM gives priority to the current inputs to allow them to have a high impact. We compare LTM with other language models which require long term memory.
arXiv Detail & Related papers (2023-05-19T06:30:19Z)
Retentive or Forgetful? Diving into the Knowledge Memorizing Mechanism of Language Models [49.39276272693035]
Large-scale pre-trained language models have shown remarkable memorizing ability. Vanilla neural networks without pre-training have long been observed to suffer from the catastrophic forgetting problem. We find that 1) Vanilla language models are forgetful; 2) Pre-training leads to retentive language models; 3) Knowledge relevance and diversification significantly influence the memory formation.
arXiv Detail & Related papers (2023-05-16T03:50:38Z)
Memorization Without Overfitting: Analyzing the Training Dynamics of Large Language Models [64.22311189896888]
We study exact memorization in causal and masked language modeling, across model sizes and throughout the training process. Surprisingly, we show that larger models can memorize a larger portion of the data before over-fitting and tend to forget less throughout the training process.
arXiv Detail & Related papers (2022-05-22T07:43:50Z)
Relational Memory Augmented Language Models [40.626389607433936]
We present a memory-augmented approach to condition an autoregressive language model on a knowledge graph. Our approach produces a better language model in terms of perplexity and bits per character.
arXiv Detail & Related papers (2022-01-24T13:25:41Z)
Low-Dimensional Structure in the Space of Language Representations is Reflected in Brain Responses [62.197912623223964]
We show a low-dimensional structure where language models and translation models smoothly interpolate between word embeddings, syntactic and semantic tasks, and future word embeddings. We find that this representation embedding can predict how well each individual feature space maps to human brain responses to natural language stimuli recorded using fMRI. This suggests that the embedding captures some part of the brain's natural language representation structure.
arXiv Detail & Related papers (2021-06-09T22:59:12Z)
Neural Machine Translation with Monolingual Translation Memory [58.98657907678992]
We propose a new framework that uses monolingual memory and performs learnable memory retrieval in a cross-lingual manner. Experiments show that the proposed method obtains substantial improvements.
arXiv Detail & Related papers (2021-05-24T13:35:19Z)

This list is automatically generated from the titles and abstracts of the papers on this site.