Aspects of human memory and Large Language Models
- URL: http://arxiv.org/abs/2311.03839v3
- Date: Mon, 8 Apr 2024 13:47:49 GMT
- Title: Aspects of human memory and Large Language Models
- Authors: Romuald A. Janik,
- Abstract summary: Large Language Models (LLMs) are huge artificial neural networks which primarily serve to generate text.
We find surprising similarities with key characteristics of human memory.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models (LLMs) are huge artificial neural networks which primarily serve to generate text, but also provide a very sophisticated probabilistic model of language use. Since generating a semantically consistent text requires a form of effective memory, we investigate the memory properties of LLMs and find surprising similarities with key characteristics of human memory. We argue that the human-like memory properties of the Large Language Model do not follow automatically from the LLM architecture but are rather learned from the statistics of the training textual data. These results strongly suggest that the biological features of human memory leave an imprint on the way that we structure our textual narratives.
Related papers
- Brain-Like Language Processing via a Shallow Untrained Multihead Attention Network [16.317199232071232]
Large Language Models (LLMs) have been shown to be effective models of the human language system.
In this work, we investigate the key architectural components driving the surprising alignment of untrained models.
arXiv Detail & Related papers (2024-06-21T12:54:03Z) - SpeechGPT-Gen: Scaling Chain-of-Information Speech Generation [56.913182262166316]
Chain-of-Information Generation (CoIG) is a method for decoupling semantic and perceptual information in large-scale speech generation.
SpeechGPT-Gen is efficient in semantic and perceptual information modeling.
It markedly excels in zero-shot text-to-speech, zero-shot voice conversion, and speech-to-speech dialogue.
arXiv Detail & Related papers (2024-01-24T15:25:01Z) - Empowering Working Memory for Large Language Model Agents [9.83467478231344]
This paper explores the potential of applying cognitive psychology's working memory frameworks to large language models (LLMs)
An innovative model is proposed incorporating a centralized Working Memory Hub and Episodic Buffer access to retain memories across episodes.
This architecture aims to provide greater continuity for nuanced contextual reasoning during intricate tasks and collaborative scenarios.
arXiv Detail & Related papers (2023-12-22T05:59:00Z) - Using large language models to study human memory for meaningful
narratives [0.0]
We show that language models can be used as a scientific instrument for studying human memory for meaningful material.
We performed online memory experiments with a large number of participants and collected recognition and recall data for narratives of different lengths.
In order to investigate the role of narrative comprehension in memory, we repeated these experiments using scrambled versions of the presented stories.
arXiv Detail & Related papers (2023-11-08T15:11:57Z) - Extending Memory for Language Modelling [0.0]
We introduce Long Term Memory network (LTM) to learn from infinitely long sequences.
LTM gives priority to the current inputs to allow it to have a high impact.
We compare LTM with other language models which require long term memory.
arXiv Detail & Related papers (2023-05-19T06:30:19Z) - Retentive or Forgetful? Diving into the Knowledge Memorizing Mechanism
of Language Models [49.39276272693035]
Large-scale pre-trained language models have shown remarkable memorizing ability.
Vanilla neural networks without pre-training have been long observed suffering from the catastrophic forgetting problem.
We find that 1) Vanilla language models are forgetful; 2) Pre-training leads to retentive language models; 3) Knowledge relevance and diversification significantly influence the memory formation.
arXiv Detail & Related papers (2023-05-16T03:50:38Z) - Memorization Without Overfitting: Analyzing the Training Dynamics of
Large Language Models [64.22311189896888]
We study exact memorization in causal and masked language modeling, across model sizes and throughout the training process.
Surprisingly, we show that larger models can memorize a larger portion of the data before over-fitting and tend to forget less throughout the training process.
arXiv Detail & Related papers (2022-05-22T07:43:50Z) - Relational Memory Augmented Language Models [40.626389607433936]
We present a memory-augmented approach to condition an autoregressive language model on a knowledge graph.
Our approach produces a better language model in terms of perplexity and bits per character.
arXiv Detail & Related papers (2022-01-24T13:25:41Z) - Low-Dimensional Structure in the Space of Language Representations is
Reflected in Brain Responses [62.197912623223964]
We show a low-dimensional structure where language models and translation models smoothly interpolate between word embeddings, syntactic and semantic tasks, and future word embeddings.
We find that this representation embedding can predict how well each individual feature space maps to human brain responses to natural language stimuli recorded using fMRI.
This suggests that the embedding captures some part of the brain's natural language representation structure.
arXiv Detail & Related papers (2021-06-09T22:59:12Z) - Neural Machine Translation with Monolingual Translation Memory [58.98657907678992]
We propose a new framework that uses monolingual memory and performs learnable memory retrieval in a cross-lingual manner.
Experiments show that the proposed method obtains substantial improvements.
arXiv Detail & Related papers (2021-05-24T13:35:19Z) - Learning to Learn Variational Semantic Memory [132.39737669936125]
We introduce variational semantic memory into meta-learning to acquire long-term knowledge for few-shot learning.
The semantic memory is grown from scratch and gradually consolidated by absorbing information from tasks it experiences.
We formulate memory recall as the variational inference of a latent memory variable from addressed contents.
arXiv Detail & Related papers (2020-10-20T15:05:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.