MemOS: A Memory OS for AI System
- URL: http://arxiv.org/abs/2507.03724v3
- Date: Tue, 05 Aug 2025 07:11:37 GMT
- Title: MemOS: A Memory OS for AI System
- Authors: Zhiyu Li, Shichao Song, Chenyang Xi, Hanyu Wang, Chen Tang, Simin Niu, Ding Chen, Jiawei Yang, Chunyu Li, Qingchen Yu, Jihao Zhao, Yezhaohui Wang, Peng Liu, Zehao Lin, Pengyuan Wang, Jiahao Huo, Tianyi Chen, Kai Chen, Kehang Li, Zhen Tao, Huayi Lai, Hao Wu, Bo Tang, Zhenren Wang, Zhaoxin Fan, Ningyu Zhang, Linfeng Zhang, Junchi Yan, Mingchuan Yang, Tong Xu, Wei Xu, Huajun Chen, Haofen Wang, Hongkang Yang, Wentao Zhang, Zhi-Qin John Xu, Siheng Chen, Feiyu Xiong
- Abstract summary: Large Language Models (LLMs) have become an essential infrastructure for Artificial General Intelligence (AGI). Existing models mainly rely on static parameters and short-lived contextual states, limiting their ability to track user preferences or update knowledge over extended periods. MemOS is a memory operating system that treats memory as a manageable system resource.
- Score: 116.87568350346537
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) have become an essential infrastructure for Artificial General Intelligence (AGI), yet their lack of well-defined memory management systems hinders the development of long-context reasoning, continual personalization, and knowledge consistency. Existing models mainly rely on static parameters and short-lived contextual states, limiting their ability to track user preferences or update knowledge over extended periods. While Retrieval-Augmented Generation (RAG) introduces external knowledge in plain text, it remains a stateless workaround without lifecycle control or integration with persistent representations. Recent work has modeled the training and inference cost of LLMs from a memory hierarchy perspective, showing that introducing an explicit memory layer between parameter memory and external retrieval can substantially reduce these costs by externalizing specific knowledge. Beyond computational efficiency, LLMs face broader challenges arising from how information is distributed over time and context, requiring systems capable of managing heterogeneous knowledge spanning different temporal scales and sources. To address this challenge, we propose MemOS, a memory operating system that treats memory as a manageable system resource. It unifies the representation, scheduling, and evolution of plaintext, activation-based, and parameter-level memories, enabling cost-efficient storage and retrieval. As the basic unit, a MemCube encapsulates both memory content and metadata such as provenance and versioning. MemCubes can be composed, migrated, and fused over time, enabling flexible transitions between memory types and bridging retrieval with parameter-based learning. MemOS establishes a memory-centric system framework that brings controllability, plasticity, and evolvability to LLMs, laying the foundation for continual learning and personalized modeling.
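The MemCube idea in the abstract (memory content plus metadata such as provenance and versioning, with fuse and migrate operations) can be illustrated with a small sketch. This is not the MemOS implementation or its API: the names `MemoryType`, `MemCube`, `fuse`, and `migrate`, the use of a plain string as the payload for every memory type, and the version-bumping rule are all assumptions made here for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class MemoryType(Enum):
    # The three memory kinds named in the abstract.
    PLAINTEXT = "plaintext"
    ACTIVATION = "activation"
    PARAMETER = "parameter"


@dataclass(frozen=True)
class MemCube:
    """Hypothetical memory unit: a payload plus provenance and version metadata."""
    content: str                 # assumption: a string stands in for any payload
    memory_type: MemoryType
    provenance: Tuple[str, ...]  # identifiers of the sources this memory came from
    version: int = 1


def fuse(a: MemCube, b: MemCube) -> MemCube:
    """Combine two cubes of the same type, merging provenance and bumping the version."""
    if a.memory_type != b.memory_type:
        raise ValueError("this sketch only fuses cubes of the same memory type")
    return MemCube(
        content=a.content + "\n" + b.content,
        memory_type=a.memory_type,
        provenance=a.provenance + b.provenance,
        version=max(a.version, b.version) + 1,
    )


def migrate(cube: MemCube, target: MemoryType) -> MemCube:
    """Relabel a cube as another memory type; a real system would also convert the payload."""
    return MemCube(cube.content, target, cube.provenance, cube.version + 1)


if __name__ == "__main__":
    a = MemCube("User prefers metric units.", MemoryType.PLAINTEXT, ("chat:1",))
    b = MemCube("User lives in Berlin.", MemoryType.PLAINTEXT, ("chat:2",))
    fused = fuse(a, b)
    moved = migrate(fused, MemoryType.ACTIVATION)
    print(moved.memory_type.value, moved.version, moved.provenance)
```

Keeping the cube immutable means every fuse or migrate step yields a new version while the earlier ones remain intact, which is one simple way to make versioning and provenance auditable.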
Related papers
- MemOS: An Operating System for Memory-Augmented Generation (MAG) in Large Language Models [31.944531660401722]
We introduce MemOS, a memory operating system designed for Large Language Models (LLMs). At its core is the MemCube, a standardized memory abstraction that enables tracking, fusion, and migration of heterogeneous memory. MemOS establishes a memory-centric execution framework with strong controllability, adaptability, and evolvability.
arXiv Detail & Related papers (2025-05-28T08:27:12Z)
- A-MEM: Agentic Memory for LLM Agents [42.50876509391843]
Large language model (LLM) agents require memory systems to leverage historical experiences. Current memory systems enable basic storage and retrieval but lack sophisticated memory organization. This paper proposes a novel agentic memory system for LLM agents that can dynamically organize memories in an agentic way.
arXiv Detail & Related papers (2025-02-17T18:36:14Z)
- SR-CIS: Self-Reflective Incremental System with Decoupled Memory and Reasoning [32.18013657468068]
We propose the Self-Reflective Complementary Incremental System (SR-CIS).
It consists of the Complementary Inference Module (CIM) and Complementary Memory Module (CMM).
CMM consists of a task-specific Short-Term Memory (STM) region and a universal Long-Term Memory (LTM) region.
By storing textual descriptions of images during training and combining them with the Scenario Replay Module (SRM) post-training for memory combination, SR-CIS achieves stable incremental memory with limited storage requirements.
arXiv Detail & Related papers (2024-08-04T09:09:35Z)
- MemLLM: Finetuning LLMs to Use An Explicit Read-Write Memory [49.96019697955383]
We introduce MemLLM, a novel method of enhancing large language models (LLMs) by integrating a structured and explicit read-and-write memory module. Our experiments indicate that MemLLM enhances the LLM's performance and interpretability, in language modeling in general and knowledge-intensive tasks in particular.
arXiv Detail & Related papers (2024-04-17T18:13:16Z)
- Online Adaptation of Language Models with a Memory of Amortized Contexts [82.02369596879817]
Memory of Amortized Contexts (MAC) is an efficient and effective online adaptation framework for large language models.
We show how MAC can be combined with, and improve the performance of, popular alternatives such as retrieval-augmented generation.
arXiv Detail & Related papers (2024-03-07T08:34:57Z)
- Empowering Working Memory for Large Language Model Agents [9.83467478231344]
This paper explores the potential of applying cognitive psychology's working memory frameworks to large language models (LLMs).
An innovative model is proposed incorporating a centralized Working Memory Hub and Episodic Buffer access to retain memories across episodes.
This architecture aims to provide greater continuity for nuanced contextual reasoning during intricate tasks and collaborative scenarios.
arXiv Detail & Related papers (2023-12-22T05:59:00Z)
- MemGPT: Towards LLMs as Operating Systems [50.02623936965231]
Large language models (LLMs) have revolutionized AI, but are constrained by limited context windows.
We propose virtual context management, a technique drawing inspiration from hierarchical memory systems in traditional operating systems.
We release MemGPT code and data for our experiments at https://memgpt.ai.
arXiv Detail & Related papers (2023-10-12T17:51:32Z)
- RET-LLM: Towards a General Read-Write Memory for Large Language Models [53.288356721954514]
RET-LLM is a novel framework that equips large language models with a general write-read memory unit.
Inspired by Davidsonian semantics theory, we extract and save knowledge in the form of triplets.
Our framework exhibits robust performance in handling temporal-based question answering tasks.
arXiv Detail & Related papers (2023-05-23T17:53:38Z)
- SCM: Enhancing Large Language Model with Self-Controlled Memory Framework [54.33686574304374]
Large Language Models (LLMs) are constrained by their inability to process lengthy inputs, resulting in the loss of critical historical information. We propose the Self-Controlled Memory (SCM) framework to enhance the ability of LLMs to maintain long-term memory and recall relevant information.
arXiv Detail & Related papers (2023-04-26T07:25:31Z)
- Neural Storage: A New Paradigm of Elastic Memory [4.307341575886927]
Storage and retrieval of data in a computer memory plays a major role in system performance.
We introduce Neural Storage (NS), a brain-inspired learning memory paradigm that organizes the memory as a flexible neural memory network.
NS achieves an order of magnitude improvement in memory access performance for two representative applications.
arXiv Detail & Related papers (2021-01-07T19:19:25Z)