The Compressor-Retriever Architecture for Language Model OS
- URL: http://arxiv.org/abs/2409.01495v1
- Date: Mon, 2 Sep 2024 23:28:15 GMT
- Title: The Compressor-Retriever Architecture for Language Model OS
- Authors: Yuan Yang, Siheng Xiong, Ehsan Shareghi, Faramarz Fekri,
- Abstract summary: This paper explores the concept of using a language model as the core component of an operating system (OS)
A key challenge in realizing such an LM OS is managing the life-long context and ensuring statefulness across sessions.
We introduce compressor-retriever, a model-agnostic architecture designed for life-long context management.
- Score: 20.56093501980724
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advancements in large language models (LLMs) have significantly enhanced their capacity to aggregate and process information across multiple modalities, enabling them to perform a wide range of tasks such as multimodal data querying, tool usage, web interactions, and handling long documents. These capabilities pave the way for transforming LLMs from mere chatbots into general-purpose agents capable of interacting with the real world. This paper explores the concept of using a language model as the core component of an operating system (OS), effectively acting as a CPU that processes data stored in a context window, which functions as RAM. A key challenge in realizing such an LM OS is managing the life-long context and ensuring statefulness across sessions, a feature limited by the current session-based interaction paradigm due to context window size limit. To address this, we introduce compressor-retriever, a model-agnostic architecture designed for life-long context management. Unlike other long-context solutions such as retrieval-augmented generation, our approach exclusively uses the base model's forward function to compress and retrieve context, ensuring end-to-end differentiability. Preliminary experiments demonstrate the effectiveness of this architecture in in-context learning tasks, marking a step towards the development of a fully stateful LLM OS. Project repo available at: https://github.com/gblackout/LM-OS
Related papers
- SEGMENT+: Long Text Processing with Short-Context Language Models [53.40059130780192]
SEGMENT+ is a framework that enables LMs to handle extended inputs within limited context windows efficiently.
SEGMENT+ utilizes structured notes and a filtering module to manage information flow, resulting in a system that is both controllable and interpretable.
arXiv Detail & Related papers (2024-10-09T03:40:22Z) - Beyond Prompts: Dynamic Conversational Benchmarking of Large Language Models [0.0]
We introduce a dynamic benchmarking system for conversational agents that evaluates their performance through a single, simulated, and lengthy user interaction.
We context switch regularly to interleave the tasks, which constructs a realistic testing scenario in which we assess the Long-Term Memory, Continual Learning, and Information Integration capabilities of the agents.
arXiv Detail & Related papers (2024-09-30T12:01:29Z) - Towards Enhancing Linked Data Retrieval in Conversational UIs using Large Language Models [1.3980986259786221]
This paper examines the integration of Large Language Models (LLMs) within existing systems.
By leveraging the advanced natural language understanding capabilities of LLMs, our method improves RDF entity extraction within web systems.
The evaluation of this methodology shows a marked enhancement in system expressivity and the accuracy of responses to user queries.
arXiv Detail & Related papers (2024-09-24T16:31:33Z) - Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More? [54.667202878390526]
Long-context language models (LCLMs) have the potential to revolutionize our approach to tasks traditionally reliant on external tools like retrieval systems or databases.
We introduce LOFT, a benchmark of real-world tasks requiring context up to millions of tokens designed to evaluate LCLMs' performance on in-context retrieval and reasoning.
Our findings reveal LCLMs' surprising ability to rival state-of-the-art retrieval and RAG systems, despite never having been explicitly trained for these tasks.
arXiv Detail & Related papers (2024-06-19T00:28:58Z) - On the Multi-turn Instruction Following for Conversational Web Agents [83.51251174629084]
We introduce a new task of Conversational Web Navigation, which necessitates sophisticated interactions that span multiple turns with both the users and the environment.
We propose a novel framework, named self-reflective memory-augmented planning (Self-MAP), which employs memory utilization and self-reflection techniques.
arXiv Detail & Related papers (2024-02-23T02:18:12Z) - Towards More Unified In-context Visual Understanding [74.55332581979292]
We present a new ICL framework for visual understanding with multi-modal output enabled.
First, we quantize and embed both text and visual prompt into a unified representational space.
Then a decoder-only sparse transformer architecture is employed to perform generative modeling on them.
arXiv Detail & Related papers (2023-12-05T06:02:21Z) - MemGPT: Towards LLMs as Operating Systems [50.02623936965231]
Large language models (LLMs) have revolutionized AI, but are constrained by limited context windows.
We propose virtual context management, a technique drawing inspiration from hierarchical memory systems in traditional operating systems.
We release MemGPT code and data for our experiments at https://memgpt.ai.
arXiv Detail & Related papers (2023-10-12T17:51:32Z) - SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
In this paper we present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using basic cross-platform tensor framework and script language engines.
This approach however does not supply the needed procedures and pipelines for the actual deployment of machine learning capabilities in real production grade systems.
arXiv Detail & Related papers (2021-12-22T14:45:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.