AI-native Memory: A Pathway from LLMs Towards AGI
- URL: http://arxiv.org/abs/2406.18312v4
- Date: Wed, 28 Aug 2024 08:07:49 GMT
- Title: AI-native Memory: A Pathway from LLMs Towards AGI
- Authors: Jingbo Shang, Zai Zheng, Jiale Wei, Xiang Ying, Felix Tao, Mindverse Team
- Abstract summary: Large language models (LLMs) have shown the world sparks of artificial general intelligence (AGI).
We envision a pathway from LLMs to AGI through the integration of memory.
As an intermediate stage, the memory will likely be in the form of natural language descriptions.
- Score: 25.19572633670963
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) have shown the world sparks of artificial general intelligence (AGI). One opinion, especially from some startups working on LLMs, argues that an LLM with nearly unlimited context length can realize AGI. However, they might be too optimistic about the long-context capability of (existing) LLMs: (1) recent literature has shown that their effective context length is significantly smaller than their claimed context length; and (2) our reasoning-in-a-haystack experiments further demonstrate that simultaneously finding the relevant information in a long context and conducting (simple) reasoning is nearly impossible. In this paper, we envision a pathway from LLMs to AGI through the integration of memory. We believe that AGI should be a system where LLMs serve as core processors. In addition to raw data, the memory in this system would store a large number of important conclusions derived from reasoning processes. Compared with retrieval-augmented generation (RAG), which merely processes raw data, this approach not only connects semantically related information more closely but also simplifies complex inferences at query time. As an intermediate stage, the memory will likely be in the form of natural language descriptions, which can also be consumed directly by users. Ultimately, every agent/person should have its own large personal model, a deep neural network model (thus AI-native) that parameterizes and compresses all types of memory, even those that cannot be described in natural language. Finally, we discuss the significant potential of AI-native memory as the transformative infrastructure for (proactive) engagement, personalization, distribution, and social interaction in the AGI era, as well as the resulting privacy and security challenges, with preliminary solutions.
Related papers
- Reinforcement Learning for Long-Horizon Interactive LLM Agents [56.9860859585028]
Interactive digital agents (IDAs) leverage APIs of stateful digital environments to perform tasks in response to user requests.
We present a reinforcement learning (RL) approach that trains IDAs directly in their target environments.
We derive LOOP, a data- and memory-efficient variant of proximal policy optimization.
arXiv Detail & Related papers (2025-02-03T18:35:42Z)
- Parametric Retrieval Augmented Generation [32.29608109539912]
Parametric RAG is a new RAG paradigm that integrates external knowledge directly into the parameters of feed-forward networks.
It substantially enhances both the effectiveness and efficiency of knowledge augmentation in large language models.
arXiv Detail & Related papers (2025-01-27T10:04:49Z)
- Virgo: A Preliminary Exploration on Reproducing o1-like MLLM [89.50691075011429]
Slow-thinking reasoning systems have garnered widespread attention by scaling the thinking time during inference.
There is also growing interest in adapting this capability to multimodal large language models (MLLMs).
In this paper, we explore a straightforward approach by fine-tuning a capable MLLM with a small amount of textual long-form thought data.
We find that these long-form reasoning processes, expressed in natural language, can be effectively transferred to MLLMs.
arXiv Detail & Related papers (2025-01-03T17:14:16Z)
- Large Language Models and the Extended Church-Turing Thesis [0.0]
We investigate the computational power of large language models (LLMs) by the classical means of computability and computational complexity theory.
We show that any fixed (non-adaptive) LLM is computationally equivalent to a, possibly very large, deterministic finite-state transducer.
We discuss the merits of our findings in the broader context of several related disciplines and philosophies.
arXiv Detail & Related papers (2024-09-11T03:09:55Z)
- Rethinking Interpretability in the Era of Large Language Models [76.1947554386879]
Large language models (LLMs) have demonstrated remarkable capabilities across a wide array of tasks.
The capability to explain in natural language allows LLMs to expand the scale and complexity of patterns that can be given to a human.
These new capabilities raise new challenges, such as hallucinated explanations and immense computational costs.
arXiv Detail & Related papers (2024-01-30T17:38:54Z)
- MuSR: Testing the Limits of Chain-of-thought with Multistep Soft Reasoning [63.80739044622555]
We introduce MuSR, a dataset for evaluating language models on soft reasoning tasks specified in a natural language narrative.
This dataset has two crucial features. First, it is created through a novel neurosymbolic synthetic-to-natural generation algorithm.
Second, our dataset instances are free text narratives corresponding to real-world domains of reasoning.
arXiv Detail & Related papers (2023-10-24T17:59:20Z)
- RRAML: Reinforced Retrieval Augmented Machine Learning [10.94680155282906]
We propose a novel framework called Reinforced Retrieval Augmented Machine Learning (RRAML).
RRAML integrates the reasoning capabilities of large language models with supporting information retrieved by a purpose-built retriever from a vast user-provided database.
We believe that the research agenda outlined in this paper has the potential to profoundly impact the field of AI.
arXiv Detail & Related papers (2023-07-24T13:51:19Z)
- In-context Autoencoder for Context Compression in a Large Language Model [70.7621953091318]
We propose the In-context Autoencoder (ICAE) to compress a long context into short compact memory slots.
ICAE is first pretrained using both autoencoding and language modeling objectives on massive text data.
arXiv Detail & Related papers (2023-07-13T17:59:21Z)
- RET-LLM: Towards a General Read-Write Memory for Large Language Models [53.288356721954514]
RET-LLM is a novel framework that equips large language models with a general write-read memory unit.
Inspired by Davidsonian semantics theory, we extract and save knowledge in the form of triplets.
Our framework exhibits robust performance in handling temporal-based question answering tasks.
arXiv Detail & Related papers (2023-05-23T17:53:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.