Memory Bank Compression for Continual Adaptation of Large Language Models
- URL: http://arxiv.org/abs/2601.00756v1
- Date: Fri, 02 Jan 2026 17:22:34 GMT
- Title: Memory Bank Compression for Continual Adaptation of Large Language Models
- Authors: Thomas Katraouras, Dimitrios Rafailidis,
- Abstract summary: Continual learning aims to update Large Language Models with new information without erasing previously acquired knowledge.<n>Memory-augmented approaches address this by equipping LLMs with a memory bank, that is an external memory module which stores information for future use.<n>We propose MBC, a model that compresses the memory bank through a codebook optimization strategy during online adaptation learning.
- Score: 1.9336815376402718
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) have become a mainstay for many everyday applications. However, as data evolve their knowledge quickly becomes outdated. Continual learning aims to update LLMs with new information without erasing previously acquired knowledge. Although methods such as full fine-tuning can incorporate new data, they are computationally expensive and prone to catastrophic forgetting, where prior knowledge is overwritten. Memory-augmented approaches address this by equipping LLMs with a memory bank, that is an external memory module which stores information for future use. However, these methods face a critical limitation, in particular, the memory bank constantly grows in the real-world scenario when large-scale data streams arrive. In this paper, we propose MBC, a model that compresses the memory bank through a codebook optimization strategy during online adaptation learning. To ensure stable learning, we also introduce an online resetting mechanism that prevents codebook collapse. In addition, we employ Key-Value Low-Rank Adaptation in the attention layers of the LLM, enabling efficient utilization of the compressed memory representations. Experiments with benchmark question-answering datasets demonstrate that MBC reduces the memory bank size to 0.3% when compared against the most competitive baseline, while maintaining high retention accuracy during online adaptation learning. Our code is publicly available at https://github.com/Thomkat/MBC.
Related papers
- MemCtrl: Using MLLMs as Active Memory Controllers on Embodied Agents [53.44122827359892]
We propose MemCtrl, a framework that uses Multimodal Large Language Models (MLLMs) for pruning memory online.<n>-augmented MLLMs show an improvement of around 16% on average, with over 20% on specific instruction subsets.
arXiv Detail & Related papers (2026-01-28T18:31:17Z) - MemOS: A Memory OS for AI System [116.87568350346537]
Large Language Models (LLMs) have become an essential infrastructure for Artificial General Intelligence (AGI)<n>Existing models mainly rely on static parameters and short-lived contextual states, limiting their ability to track user preferences or update knowledge over extended periods.<n>MemOS is a memory operating system that treats memory as a manageable system resource.
arXiv Detail & Related papers (2025-07-04T17:21:46Z) - CMT: A Memory Compression Method for Continual Knowledge Learning of Large Language Models [22.93893181000535]
Large Language Models (LLMs) need to adapt to the continuous changes in data, tasks, and user preferences.<n>To address these challenges, this paper proposes the Compression Memory Training (CMT) method.<n>CMT compresses and extracts information from new documents to be stored in a memory bank.<n>When answering to queries related to these new documents, the model aggregates these document memories from the memory bank to better answer user questions.
arXiv Detail & Related papers (2024-12-10T10:35:19Z) - MEMO: Fine-grained Tensor Management For Ultra-long Context LLM Training [24.066283519769968]
Large Language Models (LLMs) have been trained using extended context lengths to foster more creative applications.<n>We propose MEMO, a novel framework for fine-grained activation memory management.<n>MeMO achieves an average of 1.97x and 1.80x MFU compared to Megatron-LM and DeepSpeed.
arXiv Detail & Related papers (2024-07-16T18:59:49Z) - MemLLM: Finetuning LLMs to Use An Explicit Read-Write Memory [49.96019697955383]
We introduce MemLLM, a novel method of enhancing large language models (LLMs) by integrating a structured and explicit read-and-write memory module.<n>Our experiments indicate that MemLLM enhances the LLM's performance and interpretability, in language modeling in general and knowledge-intensive tasks in particular.
arXiv Detail & Related papers (2024-04-17T18:13:16Z) - Online Adaptation of Language Models with a Memory of Amortized Contexts [82.02369596879817]
Memory of Amortized Contexts (MAC) is an efficient and effective online adaptation framework for large language models.
We show how MAC can be combined with and improve the performance of popular alternatives such as retrieval augmented generations.
arXiv Detail & Related papers (2024-03-07T08:34:57Z) - Improving information retention in large scale online continual learning [99.73847522194549]
Online continual learning aims to adapt efficiently to new data while retaining existing knowledge.
Recent work suggests that information retention remains a problem in large scale OCL even when the replay buffer is unlimited.
We propose using a moving average family of methods to improve optimization for non-stationary objectives.
arXiv Detail & Related papers (2022-10-12T16:59:43Z) - Recurrent Dynamic Embedding for Video Object Segmentation [54.52527157232795]
We propose a Recurrent Dynamic Embedding (RDE) to build a memory bank of constant size.
We propose an unbiased guidance loss during the training stage, which makes SAM more robust in long videos.
We also design a novel self-correction strategy so that the network can repair the embeddings of masks with different qualities in the memory bank.
arXiv Detail & Related papers (2022-05-08T02:24:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.