Larimar: Large Language Models with Episodic Memory Control
- URL: http://arxiv.org/abs/2403.11901v4
- Date: Wed, 21 Aug 2024 22:54:47 GMT
- Title: Larimar: Large Language Models with Episodic Memory Control
- Authors: Payel Das, Subhajit Chaudhury, Elliot Nelson, Igor Melnyk, Sarath Swaminathan, Sihui Dai, Aurélie Lozano, Georgios Kollias, Vijil Chenthamarakshan, Jiří, Navrátil, Soham Dan, Pin-Yu Chen,
- Abstract summary: Larimar is a brain-inspired architecture for enhancing Large Language Models with a distributed episodic memory.
Experimental results on multiple fact editing benchmarks demonstrate that Larimar attains accuracy comparable to most competitive baselines.
We provide mechanisms for selective fact forgetting, information leakage prevention, and input context length generalization with Larimar.
- Score: 62.70727449128647
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Efficient and accurate updating of knowledge stored in Large Language Models (LLMs) is one of the most pressing research challenges today. This paper presents Larimar - a novel, brain-inspired architecture for enhancing LLMs with a distributed episodic memory. Larimar's memory allows for dynamic, one-shot updates of knowledge without the need for computationally expensive re-training or fine-tuning. Experimental results on multiple fact editing benchmarks demonstrate that Larimar attains accuracy comparable to most competitive baselines, even in the challenging sequential editing setup, but also excels in speed - yielding speed-ups of 8-10x depending on the base LLM - as well as flexibility due to the proposed architecture being simple, LLM-agnostic, and hence general. We further provide mechanisms for selective fact forgetting, information leakage prevention, and input context length generalization with Larimar and show their effectiveness. Our code is available at https://github.com/IBM/larimar
Related papers
- Needle in the Haystack for Memory Based Large Language Models [31.885539843977472]
Current large language models (LLMs) often perform poorly on simple fact retrieval tasks.
We investigate if coupling a dynamically adaptable external memory to a LLM can alleviate this problem.
We demonstrate that the external memory of Larimar, which allows fast write and read of an episode of text samples, can be used at test time to handle contexts much longer than those seen during training.
arXiv Detail & Related papers (2024-07-01T16:32:16Z) - $\text{Memory}^3$: Language Modeling with Explicit Memory [22.572376536612015]
We equip large language models (LLMs) with explicit memory, a memory format cheaper than model parameters and text retrieval-augmented generation (RAG)
As a preliminary proof of concept, we train from scratch a 2.4B LLM, which achieves better performance than much larger LLMs and RAG models.
We introduce a memory circuitry theory to support the externalization of knowledge, and present novel techniques including a memory sparsification mechanism that makes storage tractable.
arXiv Detail & Related papers (2024-07-01T11:07:23Z) - WISE: Rethinking the Knowledge Memory for Lifelong Model Editing of Large Language Models [78.22291694903659]
Large language models (LLMs) need knowledge updates to meet the ever-growing world facts and correct the hallucinated responses.
Where the updated knowledge resides in memories is a fundamental question for model editing.
We propose WISE to bridge the gap between memories.
arXiv Detail & Related papers (2024-05-23T16:35:52Z) - MemLLM: Finetuning LLMs to Use An Explicit Read-Write Memory [49.96019697955383]
We introduce MemLLM, a novel method of enhancing knowledge capabilities by integrating a structured and explicit read-and-write memory module.
Our experiments indicate that MemLLM enhances performance and interpretability, in language modeling general and in particular.
We see MemLLM as an important step towards making LLMs more grounded and factual through memory augmentation.
arXiv Detail & Related papers (2024-04-17T18:13:16Z) - Online Adaptation of Language Models with a Memory of Amortized Contexts [82.02369596879817]
Memory of Amortized Contexts (MAC) is an efficient and effective online adaptation framework for large language models.
We show how MAC can be combined with and improve the performance of popular alternatives such as retrieval augmented generations.
arXiv Detail & Related papers (2024-03-07T08:34:57Z) - LoRAMoE: Alleviate World Knowledge Forgetting in Large Language Models
via MoE-Style Plugin [85.16356890023582]
We propose LoRAMoE, a novelty framework that introduces several low-rank adapters (LoRA) and integrates them by using a router network.
It freezes the backbone model and forces a portion of LoRAs to focus on leveraging world knowledge to solve downstream tasks.
Experimental results show that, as the instruction data increases, LoRAMoE can significantly improve the ability to process downstream tasks.
arXiv Detail & Related papers (2023-12-15T17:45:06Z) - In-context Autoencoder for Context Compression in a Large Language Model [70.7621953091318]
We propose the In-context Autoencoder (ICAE) to compress a long context into short compact memory slots.
ICAE is first pretrained using both autoencoding and language modeling objectives on massive text data.
arXiv Detail & Related papers (2023-07-13T17:59:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.