Human-like Episodic Memory for Infinite Context LLMs
- URL: http://arxiv.org/abs/2407.09450v1
- Date: Fri, 12 Jul 2024 17:34:03 GMT
- Title: Human-like Episodic Memory for Infinite Context LLMs
- Authors: Zafeirios Fountas, Martin A Benfeghoul, Adnan Oomerjee, Fenia Christopoulou, Gerasimos Lampouras, Haitham Bou-Ammar, Jun Wang,
- Abstract summary: We introduce EM-LLM, a novel approach that integrates key aspects of human episodic memory and event cognition into large language models.
EM-LLM organises sequences of tokens into coherent episodic events using a combination of Bayesian surprise and graph-theoretic boundary refinement.
Experiments on the LongBench dataset demonstrate EM-LLM's superior performance, outperforming the state-of-the-art InfLLM model.
- Score: 13.211261438927798
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) have shown remarkable capabilities, but still struggle with processing extensive contexts, limiting their ability to maintain coherence and accuracy over long sequences. In contrast, the human brain excels at organising and retrieving episodic experiences across vast temporal scales, spanning a lifetime. In this work, we introduce EM-LLM, a novel approach that integrates key aspects of human episodic memory and event cognition into LLMs, enabling them to effectively handle practically infinite context lengths while maintaining computational efficiency. EM-LLM organises sequences of tokens into coherent episodic events using a combination of Bayesian surprise and graph-theoretic boundary refinement in an on-line fashion. When needed, these events are retrieved through a two-stage memory process, combining similarity-based and temporally contiguous retrieval for efficient and human-like access to relevant information. Experiments on the LongBench dataset demonstrate EM-LLM's superior performance, outperforming the state-of-the-art InfLLM model with an overall relative improvement of 4.3% across various tasks, including a 33% improvement on the PassageRetrieval task. Furthermore, our analysis reveals strong correlations between EM-LLM's event segmentation and human-perceived events, suggesting a bridge between this artificial system and its biological counterpart. This work not only advances LLM capabilities in processing extended contexts but also provides a computational framework for exploring human memory mechanisms, opening new avenues for interdisciplinary research in AI and cognitive science.
Related papers
- SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning [63.93193829913252]
We propose an innovative METL strategy called SHERL for resource-limited scenarios.
In the early route, intermediate outputs are consolidated via an anti-redundancy operation.
In the late route, utilizing minimal late pre-trained layers could alleviate the peak demand on memory overhead.
arXiv Detail & Related papers (2024-07-10T10:22:35Z) - Memory-Inspired Temporal Prompt Interaction for Text-Image
Classification [13.449375069856684]
We propose a novel prompt-based multimodal interaction strategy inspired by human memory strategy, namely Memory-Inspired Temporal Prompt Interaction (MITP)
We utilize temporal prompts on intermediate layers to imitate the acquiring stage, leverage similarity-based prompt interaction to imitate memory consolidation, and employ prompt generation strategy to imitate memory activation.
We achieve competitive results on several datasets with relatively small memory usage and 2.0M of trainable parameters.
arXiv Detail & Related papers (2024-01-26T13:36:12Z) - GATGPT: A Pre-trained Large Language Model with Graph Attention Network
for Spatiotemporal Imputation [19.371155159744934]
In real-world settings, such data often contain missing elements due to issues like sensor malfunctions and data transmission errors.
The objective oftemporal imputation is to estimate these missing values by understanding the inherent spatial and temporal relationships in the observed time series.
Traditionally, intricatetemporal imputation has relied on specific architectures, which suffer from limited applicability and high computational complexity.
In contrast our approach integrates pre-trained large language models (LLMs) into intricatetemporal imputation, introducing a groundbreaking framework, GATGPT.
arXiv Detail & Related papers (2023-11-24T08:15:11Z) - Self-Supervised Neuron Segmentation with Multi-Agent Reinforcement
Learning [53.00683059396803]
Mask image model (MIM) has been widely used due to its simplicity and effectiveness in recovering original information from masked images.
We propose a decision-based MIM that utilizes reinforcement learning (RL) to automatically search for optimal image masking ratio and masking strategy.
Our approach has a significant advantage over alternative self-supervised methods on the task of neuron segmentation.
arXiv Detail & Related papers (2023-10-06T10:40:46Z) - A Low-rank Matching Attention based Cross-modal Feature Fusion Method
for Conversational Emotion Recognition [56.20144064187554]
This paper develops a novel cross-modal feature fusion method for the Conversational emotion recognition (CER) task.
By setting a matching weight and calculating attention scores between modal features row by row, LMAM contains fewer parameters than the self-attention method.
We show that LMAM can be embedded into any existing state-of-the-art DL-based CER methods and help boost their performance in a plug-and-play manner.
arXiv Detail & Related papers (2023-06-16T16:02:44Z) - RLIP: Relational Language-Image Pre-training for Human-Object
Interaction Detection [32.20132357830726]
Language-Image Pre-training (LIPR) is a strategy for contrastive pre-training that leverages both entity and relation descriptions.
We show the benefits of these contributions, collectively termed RLIP-ParSe, for improved zero-shot, few-shot and fine-tuning HOI detection as well as increased robustness from noisy annotations.
arXiv Detail & Related papers (2022-09-05T07:50:54Z) - Efficient Global-Local Memory for Real-time Instrument Segmentation of
Robotic Surgical Video [53.14186293442669]
We identify two important clues for surgical instrument perception, including local temporal dependency from adjacent frames and global semantic correlation in long-range duration.
We propose a novel dual-memory network (DMNet) to relate both global and local-temporal knowledge.
Our method largely outperforms the state-of-the-art works on segmentation accuracy while maintaining a real-time speed.
arXiv Detail & Related papers (2021-09-28T10:10:14Z) - Schematic Memory Persistence and Transience for Efficient and Robust
Continual Learning [8.030924531643532]
Continual learning is considered a promising step towards next-generation Artificial Intelligence (AI)
It is still quite primitive, with existing works focusing primarily on avoiding (catastrophic) forgetting.
We propose a novel framework for continual learning with external memory that builds on recent advances in neuroscience.
arXiv Detail & Related papers (2021-05-05T14:32:47Z) - Temporal Memory Relation Network for Workflow Recognition from Surgical
Video [53.20825496640025]
We propose a novel end-to-end temporal memory relation network (TMNet) for relating long-range and multi-scale temporal patterns.
We have extensively validated our approach on two benchmark surgical video datasets.
arXiv Detail & Related papers (2021-03-30T13:20:26Z) - A Multi-Task Deep Learning Framework to Localize the Eloquent Cortex in
Brain Tumor Patients Using Dynamic Functional Connectivity [7.04584289867204]
We present a novel deep learning framework that uses dynamic functional connectivity to simultaneously localize the language and motor areas of the eloquent cortex in brain tumor patients.
Our model achieves higher localization accuracies than conventional deep learning approaches and can identify bilateral language areas even when trained on left-hemisphere lateralized cases.
arXiv Detail & Related papers (2020-11-17T18:18:09Z) - Joint Constrained Learning for Event-Event Relation Extraction [94.3499255880101]
We propose a joint constrained learning framework for modeling event-event relations.
Specifically, the framework enforces logical constraints within and across multiple temporal and subevent relations.
We show that our joint constrained learning approach effectively compensates for the lack of jointly labeled data.
arXiv Detail & Related papers (2020-10-13T22:45:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.