MaxMind: A Memory Loop Network to Enhance Software Productivity based on Large Language Models
- URL: http://arxiv.org/abs/2408.03841v1
- Date: Wed, 7 Aug 2024 15:27:22 GMT
- Title: MaxMind: A Memory Loop Network to Enhance Software Productivity based on Large Language Models
- Authors: Yuchen Dong, XiaoXiang Fang, Yuchen Hu, Renshuang Jiang, Zhe Jiang
- Abstract summary: This paper addresses the importance of converting real-time task experiences into system memory.
We show that the accumulation and recycling of task memories lead to a steady enhancement in task success rate.
The inclusion of memory recycling can also boost the system's task execution efficiency by up to 25%.
- Score: 13.839564855350295
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The application of large language models to facilitate automated software operations and tool generation (SOTG), thus augmenting software productivity, mirrors the early stages of human evolution, when the ability to create and use tools accelerated the progress of civilization. These complex tasks require AI to continuously summarize and improve. Current research often overlooks the importance of converting real-time task experiences into system memory and of differentiating the value of existing knowledge for future reference. This paper addresses these issues by evolving external memory models into Memory-Loop Networks for timely memorization and experience referencing. We also enhance a RAG mechanism with knowledge precision segmentation to utilize memory based on value differentiation, and design the MaxMind model for SOTG accordingly. To demonstrate our approach, we developed MaxMind4Sheet, an electronic spreadsheet processing system aligned with the MaxMind philosophy. Comparative experiments with SheetCopilot demonstrate that the accumulation and recycling of task memories lead to a steady enhancement in task success rate, with an improvement rate of approximately 3%-6% per round in this implementation example. Note that as the memories continue to grow, this cumulative improvement may become substantial. The inclusion of memory recycling can also boost the system's task execution efficiency by up to 25%, and it can address the retraining issue faced by LLMs when handling specialized tasks through memory transfer. These results suggest that MaxMind has significant potential to enhance the capabilities and productivity of LLM systems in SOTG.
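As a rough illustration of the memory-loop idea described in the abstract (not the authors' implementation; the similarity measure, value weighting, and API names below are assumptions), a system might record each finished task together with a value tag and weight later retrievals by that value:

```python
# Illustrative sketch only: a minimal "memory loop" in the spirit of the abstract.
from collections import Counter
from math import sqrt

def _bow(text):
    return Counter(text.lower().split())

def _cosine(a, b):
    shared = set(a) & set(b)
    num = sum(a[t] * b[t] for t in shared)
    den = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

class MemoryLoop:
    """Stores finished task traces and feeds them back into later retrievals."""
    def __init__(self):
        self.entries = []  # each entry: (bag-of-words, task, solution, value)

    def record(self, task, solution, succeeded):
        # Convert a real-time task experience into a memory entry with a value tag.
        value = 1.0 if succeeded else 0.3   # hypothetical value differentiation
        self.entries.append((_bow(task), task, solution, value))

    def retrieve(self, query, k=3):
        # Rank stored experiences by similarity, weighted by their recorded value.
        q = _bow(query)
        scored = [(_cosine(q, bow) * value, task, solution)
                  for bow, task, solution, value in self.entries]
        return sorted(scored, reverse=True)[:k]

mem = MemoryLoop()
mem.record("sum column B of sales.xlsx", "=SUM(B:B)", succeeded=True)
mem.record("average column C", "=AVERAGE(C:C)", succeeded=False)
print(mem.retrieve("sum the values in column B"))
```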
Related papers
- From RAG to Memory: Non-Parametric Continual Learning for Large Language Models [6.380729797938521]
Retrieval-augmented generation (RAG) has become the dominant way to introduce new information.
Recent RAG approaches augment vector embeddings with various structures like knowledge graphs to address some gaps, namely sense-making and associativity.
We propose HippoRAG 2, a framework that outperforms standard RAG comprehensively on factual, sense-making, and associative memory tasks.
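A toy sketch of the general pattern this summary describes, combining dense retrieval with one hop of association over a graph; the embeddings, graph, and propagation rule are placeholders, not the HippoRAG 2 algorithm:

```python
# Toy sketch of "dense retrieval + graph association"; everything here is made up.
import numpy as np

passages = ["Paris is the capital of France.",
            "The Louvre is a museum in Paris.",
            "Mount Fuji is in Japan."]
emb = np.random.default_rng(0).normal(size=(3, 8))      # stand-in passage embeddings
emb /= np.linalg.norm(emb, axis=1, keepdims=True)
adj = np.array([[0, 1, 0],                              # passages 0 and 1 share "Paris"
                [1, 0, 0],
                [0, 0, 0]], dtype=float)

def retrieve(query_vec, alpha=0.7, k=2):
    query_vec = query_vec / np.linalg.norm(query_vec)
    dense = emb @ query_vec                  # similarity to each passage
    assoc = adj @ dense                      # one hop of association over the graph
    score = alpha * dense + (1 - alpha) * assoc
    return [passages[i] for i in np.argsort(-score)[:k]]

print(retrieve(emb[0]))
```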
arXiv Detail & Related papers (2025-02-20T18:26:02Z)
- Improving Factuality with Explicit Working Memory [68.39261790277615]
Large language models can generate factually inaccurate content, a problem known as hallucination.
We introduce EWE (Explicit Working Memory), a novel approach that enhances factuality in long-form text generation by integrating a working memory that receives real-time feedback from external resources.
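A schematic sketch of the loop this summary describes, in which a working memory is refreshed with external feedback between generation steps; the drafting and checking functions are hypothetical stand-ins, not the EWE components:

```python
# Schematic only: long-form generation with a working memory updated by feedback.
def draft_next_chunk(prompt, memory):
    return f"[chunk conditioned on {len(memory)} memory items]"

def check_against_sources(chunk):
    # Pretend retrieval/fact-checking feedback; a real system queries external resources.
    return [f"evidence for: {chunk[:20]}..."]

def generate_long_form(prompt, n_chunks=3):
    memory, output = [], []
    for _ in range(n_chunks):
        chunk = draft_next_chunk(prompt, memory)
        memory.extend(check_against_sources(chunk))  # real-time feedback updates memory
        output.append(chunk)
    return " ".join(output)

print(generate_long_form("Write a report on memory-augmented LLMs."))
```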
arXiv Detail & Related papers (2024-12-24T00:55:59Z)
- CMT: A Memory Compression Method for Continual Knowledge Learning of Large Language Models [22.93893181000535]
Large Language Models (LLMs) need to adapt to the continuous changes in data, tasks, and user preferences.
To address these challenges, this paper proposes the Compression Memory Training (CMT) method.
CMT compresses and extracts information from new documents to be stored in a memory bank.
When answering queries related to these new documents, the model aggregates these document memories from the memory bank to better answer user questions.
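A minimal sketch of the compress-store-aggregate pattern described above; the hashed bag-of-words "compression" below is only a toy stand-in for the compression the paper describes:

```python
# Minimal sketch: compress documents into vectors, store them, aggregate at query time.
import numpy as np

DIM = 64

def compress(document):
    vec = np.zeros(DIM)
    for token in document.lower().split():
        vec[hash(token) % DIM] += 1.0
    return vec / (np.linalg.norm(vec) or 1.0)

memory_bank = []          # list of (vector, document) pairs

def store(document):
    memory_bank.append((compress(document), document))

def aggregate(query, k=2):
    q = compress(query)
    ranked = sorted(memory_bank, key=lambda m: -float(m[0] @ q))
    return [doc for _, doc in ranked[:k]]   # memories handed to the model as context

store("The 2024 report introduces a new budgeting workflow.")
store("Quarterly sales rose 12% after the workflow change.")
print(aggregate("What changed in the budgeting workflow?"))
```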
arXiv Detail & Related papers (2024-12-10T10:35:19Z)
- DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution [114.61347672265076]
Development of MLLMs for real-world robots is challenging due to the typically limited computation and memory capacities available on robotic platforms.
We propose a Dynamic Early-Exit Framework for Robotic Vision-Language-Action Model (DeeR) that automatically adjusts the size of the activated MLLM.
DeeR demonstrates significant reductions in the computational cost of the LLM by 5.2-6.5x and in its GPU memory by 2-6x without compromising performance.
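A toy early-exit loop in the spirit of this summary: stop running further layers once an exit criterion is met, so easier inputs use a smaller model; the convergence test and layers below are placeholders, not the DeeR criterion or an MLLM:

```python
# Toy early-exit loop: stop once intermediate features stop changing much.
import numpy as np

rng = np.random.default_rng(0)
layers = [0.5 * np.eye(16) + rng.normal(scale=0.02, size=(16, 16)) for _ in range(12)]

def forward_with_early_exit(x, tol=1e-2):
    for depth, w in enumerate(layers, start=1):
        new_x = np.tanh(w @ x)
        if np.linalg.norm(new_x - x) < tol:   # exit criterion: features have converged
            return new_x, depth               # fewer layers => less compute and memory
        x = new_x
    return x, len(layers)

features, used_depth = forward_with_early_exit(rng.normal(size=16))
print(f"exited after {used_depth} of {len(layers)} layers")
```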
arXiv Detail & Related papers (2024-11-04T18:26:08Z)
- Stable Hadamard Memory: Revitalizing Memory-Augmented Agents for Reinforcement Learning [64.93848182403116]
Current deep-learning memory models struggle in reinforcement learning environments that are partially observable and long-term.
We introduce the Stable Hadamard Memory, a novel memory model for reinforcement learning agents.
Our approach significantly outperforms state-of-the-art memory-based methods on challenging partially observable benchmarks.
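The summary does not spell out the update rule, but the name suggests an elementwise (Hadamard-product) memory update; the generic sketch below, with assumed calibration and write terms, is included only to picture that kind of mechanism, not the paper's equations:

```python
# Generic elementwise ("Hadamard") memory update; calibration and write terms are assumed.
import numpy as np

rng = np.random.default_rng(0)
memory = np.zeros((8, 8))

def step(memory, observation):
    calibration = 1.0 / (1.0 + np.exp(-rng.normal(size=memory.shape)))  # gate in (0, 1)
    write = np.outer(observation, observation)                          # new content
    return memory * calibration + write      # elementwise decay plus elementwise write

for _ in range(5):
    memory = step(memory, rng.normal(size=8))
print(memory.shape)
```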
arXiv Detail & Related papers (2024-10-14T03:50:17Z)
- Robust Implementation of Retrieval-Augmented Generation on Edge-based Computing-in-Memory Architectures [26.183960625493807]
Large Language Models (LLMs) deployed on edge devices learn through fine-tuning and updating a certain portion of their parameters.
Retrieval-Augmented Generation (RAG) is a resource-efficient LLM learning method.
We propose a novel framework to accelerate RAG via Computing-in-Memory (CiM) architectures.
arXiv Detail & Related papers (2024-05-07T22:31:50Z)
- Bullion: A Column Store for Machine Learning [4.096087402737292]
This paper presents Bullion, a columnar storage system tailored for machine learning workloads.
Bullion addresses the complexities of data compliance, optimizes the encoding of long-sequence sparse features, efficiently manages wide-table projections, introduces feature quantization in storage, and provides a comprehensive cascading encoding framework.
Preliminary experimental results and theoretical analysis demonstrate Bullion's improved ability to deliver strong performance in the face of the unique demands of machine learning workloads.
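One listed ingredient, feature quantization in storage, can be pictured as storing each float column as low-precision integer codes plus a per-column scale; the affine 8-bit scheme below is a generic example, not Bullion's actual encoding:

```python
# Generic per-column affine quantization for columnar storage (illustrative only).
import numpy as np

def quantize_column(values):
    lo, hi = float(values.min()), float(values.max())
    scale = (hi - lo) / 255.0 or 1.0
    codes = np.round((values - lo) / scale).astype(np.uint8)   # 1 byte per value
    return codes, lo, scale

def dequantize_column(codes, lo, scale):
    return codes.astype(np.float32) * scale + lo

col = np.random.default_rng(0).normal(size=1000).astype(np.float32)
codes, lo, scale = quantize_column(col)
print(codes.nbytes, "bytes vs", col.nbytes, "bytes; max error",
      float(np.abs(dequantize_column(codes, lo, scale) - col).max()))
```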
arXiv Detail & Related papers (2024-04-13T05:01:54Z)
- Online Adaptation of Language Models with a Memory of Amortized Contexts [82.02369596879817]
Memory of Amortized Contexts (MAC) is an efficient and effective online adaptation framework for large language models.
We show how MAC can be combined with and improve the performance of popular alternatives such as retrieval-augmented generation.
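A rough sketch of adapting online by keeping a memory of compressed document representations rather than updating model weights; the encoder and aggregation rule below are stand-ins, not the MAC architecture:

```python
# Rough sketch: compress incoming documents into a memory and aggregate at query time.
import numpy as np

rng = np.random.default_rng(0)
ENC = rng.normal(size=(64, 16))            # frozen "amortization" encoder (placeholder)

def compress(features):
    v = features @ ENC                     # compact fixed-size context
    return v / np.linalg.norm(v)

memory = []                                # grows online as new documents arrive

def adapt_online(doc_features):
    memory.append(compress(doc_features))  # no gradient update of the base model

def aggregated_context(query_features):
    q = compress(query_features)
    scores = np.array([float(q @ m) for m in memory])
    weights = np.exp(scores) / np.exp(scores).sum()
    return sum(w * m for w, m in zip(weights, memory))   # context handed to the LLM

for _ in range(3):
    adapt_online(rng.normal(size=64))
print(aggregated_context(rng.normal(size=64)).shape)
```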
arXiv Detail & Related papers (2024-03-07T08:34:57Z)
- Think Before You Act: Decision Transformers with Working Memory [44.18926449252084]
Decision Transformer-based decision-making agents have shown the ability to generalize across multiple tasks.
We argue that this inefficiency stems from the forgetting phenomenon, in which a model memorizes its behaviors in its parameters throughout training.
We propose a working memory module to store, blend, and retrieve information for different downstream tasks.
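A schematic store/blend/retrieve module illustrating the idea above; the slot layout, blending rule, and attention-style read are assumptions, not the paper's design:

```python
# Schematic store/blend/retrieve working memory with a fixed number of slots.
import numpy as np

class WorkingMemory:
    def __init__(self, slots=4, dim=8, blend=0.9):
        self.slots = np.zeros((slots, dim))
        self.blend = blend

    def store(self, vec):
        # Blend new information into the most similar slot instead of overwriting.
        idx = int(np.argmax(self.slots @ vec))
        self.slots[idx] = self.blend * self.slots[idx] + (1 - self.blend) * vec

    def retrieve(self, query):
        # Attention-style read: softmax over slot similarities.
        scores = self.slots @ query
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()
        return weights @ self.slots

mem = WorkingMemory()
rng = np.random.default_rng(0)
for _ in range(10):
    mem.store(rng.normal(size=8))
print(mem.retrieve(rng.normal(size=8)).shape)
```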
arXiv Detail & Related papers (2023-05-24T01:20:22Z)
- Mesa: A Memory-saving Training Framework for Transformers [58.78933015299703]
We present Mesa, a memory-saving training framework for Transformers.
Mesa uses exact activations during the forward pass while storing a low-precision version of activations to reduce memory consumption during training.
Experiments on ImageNet, CIFAR-100 and ADE20K demonstrate that Mesa can roughly halve the memory footprint during training.
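A simplified illustration of the mechanism in this summary: compute the forward pass with exact activations but keep only a low-precision copy of them for the backward pass. Mesa works inside Transformer layers with its own quantization; the float16 linear layer below is only a stand-in:

```python
# Simplified sketch: exact forward pass, low-precision saved activations for backward.
import numpy as np

class LowMemLinear:
    def __init__(self, d_in, d_out, rng):
        self.w = rng.normal(scale=0.1, size=(d_in, d_out))
        self.saved_x = None

    def forward(self, x):
        out = x @ self.w                       # exact activations used in the forward pass
        self.saved_x = x.astype(np.float16)    # store a low-precision copy for backward
        return out

    def backward(self, grad_out):
        x = self.saved_x.astype(np.float64)    # restore precision when gradients are needed
        grad_w = x.T @ grad_out
        grad_x = grad_out @ self.w.T
        return grad_x, grad_w

rng = np.random.default_rng(0)
layer = LowMemLinear(32, 16, rng)
y = layer.forward(rng.normal(size=(4, 32)))
grad_x, grad_w = layer.backward(np.ones_like(y))
print(grad_w.shape, layer.saved_x.dtype)
```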
arXiv Detail & Related papers (2021-11-22T11:23:01Z)
- Improving Computational Efficiency in Visual Reinforcement Learning via Stored Embeddings [89.63764845984076]
We present Stored Embeddings for Efficient Reinforcement Learning (SEER).
SEER is a simple modification of existing off-policy deep reinforcement learning methods.
We show that SEER does not degrade the performance of RL agents while significantly saving computation and memory.
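A toy illustration of the stored-embeddings idea: once the encoder is frozen, the replay buffer keeps compact latent vectors instead of raw observations, saving both compute and memory; the encoder, sizes, and freezing point are placeholders, not the SEER setup:

```python
# Toy sketch: store compact embeddings in the replay buffer instead of raw frames.
import numpy as np

rng = np.random.default_rng(0)
ENCODER = rng.normal(scale=0.01, size=(84 * 84, 64))   # frozen stand-in encoder

def encode(frame):
    return (frame.reshape(-1) @ ENCODER).astype(np.float32)

replay_buffer = []
for step in range(100):
    frame = rng.integers(0, 256, size=(84, 84)).astype(np.float32)
    replay_buffer.append(encode(frame))     # 64 floats instead of 7056 pixels
    # Later updates reuse the stored embeddings, skipping repeated encoder passes.

print(len(replay_buffer), replay_buffer[0].shape)
```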
arXiv Detail & Related papers (2021-03-04T08:14:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.