Memory-Driven Self-Improvement for Decision Making with Large Language Models
- URL: http://arxiv.org/abs/2509.26340v1
- Date: Tue, 30 Sep 2025 14:46:06 GMT
- Title: Memory-Driven Self-Improvement for Decision Making with Large Language Models
- Authors: Xue Yan, Zijing Ou, Mengyue Yang, Yan Song, Haifeng Zhang, Yingzhen Li, Jun Wang
- Abstract summary: Large language models (LLMs) have emerged as effective action policies for sequential decision-making tasks. We propose a memory-driven self-improvement framework that combines LLM general prior knowledge with a compact memory of domain-specific experiences.
- Score: 26.996248662693997
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) have emerged as effective action policies for sequential decision-making (SDM) tasks due to their extensive prior knowledge. However, this broad yet general knowledge is often insufficient for specific decision-making tasks with limited task-related data, making it challenging to efficiently adapt LLMs to specific SDM tasks. To address this challenge, we propose a memory-driven self-improvement framework that combines LLM general prior knowledge with a compact memory of domain-specific experiences. Memory retains past interactions and associated Q-values, thereby capturing decision-relevant knowledge that facilitates accurate value estimation and informs the LLM prior refinement. The refined LLM prior, in turn, generates higher-reward trajectories that further enrich memory, forming a natural self-improvement framework where memory and LLM prior mutually reinforce each other. Experiments show that our memory-driven approach significantly outperforms both traditional RL and LLM-based baselines, e.g., improving performance by over 40% on in-distribution tasks and over 75% when generalized to unseen tasks in ALFWorld.
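The abstract describes a closed loop: roll out the LLM prior, store interactions with Q-value estimates in memory, then refine the prior from those memory-informed values. Below is a minimal Python sketch of that loop; every name (`Memory`, `llm_policy.sample`, `llm_policy.refine`, the environment interface) is a hypothetical stand-in, not the paper's actual implementation.

```python
# A minimal sketch of the memory-driven self-improvement loop, assuming
# hypothetical interfaces for the LLM policy and the environment.

class Memory:
    """Compact store of past interactions with associated Q-values."""

    def __init__(self):
        self.entries = []  # list of (state, action, q_value)

    def add(self, state, action, q_value):
        self.entries.append((state, action, q_value))

    def estimate_q(self, state, action, default=0.0):
        # Toy lookup: most recent Q recorded for this (state, action) pair;
        # a real system would use similarity-based retrieval instead.
        for s, a, q in reversed(self.entries):
            if (s, a) == (state, action):
                return q
        return default


def self_improvement_loop(llm_policy, env, memory, iterations=10, gamma=0.99):
    for _ in range(iterations):
        # 1. Roll out the current LLM prior in the environment.
        state, trajectory, done = env.reset(), [], False
        while not done:
            action = llm_policy.sample(state)            # LLM as action prior
            state_next, reward, done = env.step(action)
            trajectory.append((state, action, reward))
            state = state_next

        # 2. Turn rewards into discounted returns (Q-value targets) and
        #    store them, enriching the memory with decision-relevant data.
        ret = 0.0
        for s, a, r in reversed(trajectory):
            ret = r + gamma * ret
            memory.add(s, a, ret)

        # 3. Refine the LLM prior with memory-informed value estimates;
        #    better priors yield higher-reward rollouts, closing the loop.
        llm_policy.refine(memory)
```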
Related papers
- MemSifter: Offloading LLM Memory Retrieval via Outcome-Driven Proxy Reasoning [78.46301394559903]
Large Language Models (LLMs) are increasingly used for long-duration tasks. Current methods face a trade-off between cost and accuracy. MemSifter is a novel framework that offloads the memory retrieval process to a small-scale proxy model.
arXiv Detail & Related papers (2026-03-03T02:57:38Z)
- Agentic Memory: Learning Unified Long-Term and Short-Term Memory Management for Large Language Model Agents [57.38404718635204]
Large language model (LLM) agents face fundamental limitations in long-horizon reasoning due to finite context windows. Existing methods typically handle long-term memory (LTM) and short-term memory (STM) as separate components. We propose Agentic Memory (AgeMem), a unified framework that integrates LTM and STM management directly into the agent's policy.
arXiv Detail & Related papers (2026-01-05T08:24:16Z)
- Evo-Memory: Benchmarking LLM Agent Test-time Learning with Self-Evolving Memory [89.65731902036669]
Evo-Memory is a streaming benchmark and framework for evaluating self-evolving memory in large language model (LLM) agents. We evaluate over ten representative memory modules across 10 diverse multi-turn goal-oriented and single-turn reasoning and QA datasets.
arXiv Detail & Related papers (2025-11-25T21:08:07Z)
- Memory-R1: Enhancing Large Language Model Agents to Manage and Utilize Memories via Reinforcement Learning [59.16831804985279]
Large Language Models (LLMs) have demonstrated impressive capabilities across a wide range of NLP tasks, but they remain fundamentally stateless. Recent efforts to address this limitation often augment LLMs with an external memory bank, yet most existing pipelines are static and heuristic-driven. We present Memory-R1, a reinforcement learning framework that equips LLMs with the ability to actively manage and utilize external memory.
arXiv Detail & Related papers (2025-08-27T12:26:55Z)
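The Memory-R1 entry frames memory management as actions optimized with reinforcement learning. Below is a toy sketch of that idea, assuming a tabular policy over ADD/UPDATE/DELETE/NOOP operations and a binary outcome reward; the paper's actual operation set, state representation, and reward are not specified here.

```python
# Toy REINFORCE-style policy over memory operations; the operation set,
# state abstraction, and reward are assumptions, not Memory-R1's design.
import math
import random
from collections import defaultdict

MEMORY_OPS = ["ADD", "UPDATE", "DELETE", "NOOP"]

class OpPolicy:
    def __init__(self, lr=0.1):
        # Logits per abstract state, e.g. "new fact contradicts stored entry".
        self.logits = defaultdict(lambda: {op: 0.0 for op in MEMORY_OPS})
        self.lr = lr

    def probs(self, key):
        z = self.logits[key]
        m = max(z.values())
        exps = {op: math.exp(v - m) for op, v in z.items()}
        total = sum(exps.values())
        return {op: e / total for op, e in exps.items()}

    def sample(self, key):
        p = self.probs(key)
        return random.choices(list(p), weights=list(p.values()))[0]

    def reinforce(self, key, op, reward):
        # Policy-gradient step: raise the chosen op's logit when the
        # downstream answer built on the edited memory was correct.
        p = self.probs(key)
        for o in MEMORY_OPS:
            self.logits[key][o] += self.lr * reward * ((o == op) - p[o])

policy = OpPolicy()
key = "contradicts_existing_entry"          # toy state abstraction
op = policy.sample(key)                     # choose a memory edit
policy.reinforce(key, op, reward=1.0)       # reward from answer correctness
```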
- Coarse-to-Fine Grounded Memory for LLM Agent Planning [29.3251327624294]
We propose a novel framework that grounds coarse-to-fine memories with Large Language Models. Our framework grounds environmental information into coarse-grained focus points to guide experience collection in training tasks. At inference, it retrieves task-relevant experiences and tips to support planning.
arXiv Detail & Related papers (2025-08-21T06:50:23Z)
- Learn to Memorize: Optimizing LLM-based Agents with Adaptive Memory Framework [33.739298910759544]
We propose to optimize LLM-based agents with an adaptive and data-driven memory framework by modeling memory cycles. Specifically, we design an MoE gate function to facilitate memory retrieval, propose a learnable aggregation process to improve memory utilization, and develop task-specific reflection to adapt memory storage.
arXiv Detail & Related papers (2025-08-15T12:22:52Z)
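The MoE gate function mentioned in the entry above can be pictured as a learned mixture over retrieval "experts". A small illustrative sketch follows; the gate matrix, query features, and the recency/similarity strategies are all assumptions for illustration, not the paper's components.

```python
# Illustrative MoE-style gate over retrieval "experts".
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def gated_retrieval(query_feats, memory, strategies, gate_weights):
    """Mix per-entry scores from several retrieval strategies via a gate."""
    gate = softmax(gate_weights @ query_feats)        # weights over experts
    scores = np.zeros(len(memory))
    for w, strategy in zip(gate, strategies):
        scores += w * np.array([strategy(query_feats, m) for m in memory])
    return int(scores.argmax())                       # index of best entry

# Example: a recency expert and a semantic-similarity expert.
memory = [{"age": 3, "vec": np.array([1.0, 0.0])},
          {"age": 0, "vec": np.array([0.0, 1.0])}]
strategies = [lambda q, m: -float(m["age"]),          # prefer recent entries
              lambda q, m: float(m["vec"] @ q[:2])]   # prefer semantic match
gate_W = np.zeros((2, 4))                             # learned in practice
best = gated_retrieval(np.array([0.0, 1.0, 0.5, 0.5]), memory, strategies, gate_W)
print(best)
```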
- LLM-Lasso: A Robust Framework for Domain-Informed Feature Selection and Regularization [59.75242204923353]
We introduce LLM-Lasso, a framework that leverages large language models (LLMs) to guide feature selection in Lasso regression. LLMs generate penalty factors for each feature, which are converted into weights for the Lasso penalty using a simple, tunable model. Features identified as more relevant by the LLM receive lower penalties, increasing their likelihood of being retained in the final model.
arXiv Detail & Related papers (2025-02-15T02:55:22Z)
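The penalty-factor mechanism in the LLM-Lasso entry maps onto a standard Lasso via column rescaling: minimizing ||y − Xβ||² + α Σⱼ wⱼ|βⱼ| is equivalent to a plain Lasso on X/w followed by un-scaling the coefficients. A sketch, with hard-coded penalty factors standing in for the LLM-generated ones:

```python
# Per-feature Lasso penalties via column rescaling; the penalty factors
# here are placeholders for the LLM-derived ones the paper describes.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

# Hypothetical LLM-derived penalty factors: lower = "more relevant".
penalty = np.array([0.2, 0.2, 2.0, 2.0])

# Plain Lasso on the rescaled design, then un-scale the coefficients.
X_scaled = X / penalty
model = Lasso(alpha=0.1).fit(X_scaled, y)
coef = model.coef_ / penalty
print(coef)  # relevant features survive; heavily penalized ones shrink to 0
```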
- KBM: Delineating Knowledge Boundary for Adaptive Retrieval in Large Language Models [69.99274367773997]
Large Language Models (LLMs) often struggle with dynamically changing knowledge and handling unknown static information. Retrieval-Augmented Generation (RAG) is employed to tackle these challenges and has a significant impact on improving LLM performance. We propose a Knowledge Boundary Model (KBM) to express the known/unknown of a given question, and to determine whether a RAG needs to be triggered.
arXiv Detail & Related papers (2024-11-09T15:12:28Z)
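A toy version of the adaptive-retrieval gate the KBM entry describes: score whether a question falls inside the model's knowledge boundary and trigger RAG only when it does not. The token-overlap scorer below is a stand-in, not the paper's learned boundary model.

```python
# Toy adaptive-retrieval gate; the scorer is a stand-in for a learned KBM.

def knowledge_boundary_score(question: str, known_topics: set) -> float:
    # Stand-in: fraction of question tokens covered by known topics.
    tokens = set(question.lower().split())
    return len(tokens & known_topics) / max(len(tokens), 1)

def answer(question, llm, retriever, known_topics, threshold=0.5):
    if knowledge_boundary_score(question, known_topics) >= threshold:
        return llm(question)              # answer from parametric knowledge
    docs = retriever(question)            # question looks unknown: trigger RAG
    return llm(f"Context: {docs}\nQuestion: {question}")
```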
- Efficient Reinforcement Learning with Large Language Model Priors [18.72288751305885]
Large language models (LLMs) have recently emerged as powerful general-purpose tools.
We propose treating LLMs as prior action distributions and integrating them into RL frameworks.
We show that incorporating LLM-based action priors significantly reduces exploration and optimization complexity.
arXiv Detail & Related papers (2024-10-10T13:54:11Z)
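One common way to realize the idea in the entry above is a KL-regularized policy that multiplies the LLM's action prior by exponentiated Q-values, π(a|s) ∝ prior(a|s)·exp(Q(s,a)/τ); whether this matches the paper's exact formulation is an assumption. A minimal sketch:

```python
# Combine an LLM action prior with learned Q-values (KL-regularized softmax).
import numpy as np

def posterior_policy(prior_probs, q_values, tau=1.0):
    """prior_probs: LLM prior over actions; q_values: learned estimates."""
    logits = np.log(np.asarray(prior_probs) + 1e-12) + np.asarray(q_values) / tau
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

# Example: the prior concentrates exploration on plausible actions.
prior = [0.70, 0.25, 0.04, 0.01]   # hypothetical LLM action prior
q     = [0.10, 0.50, 0.00, 0.00]   # early, noisy Q-estimates
print(posterior_policy(prior, q))  # mass stays on prior-plausible actions
```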
- MemLLM: Finetuning LLMs to Use An Explicit Read-Write Memory [49.96019697955383]
We introduce MemLLM, a novel method of enhancing large language models (LLMs) by integrating a structured and explicit read-and-write memory module. Our experiments indicate that MemLLM enhances the LLM's performance and interpretability, in language modeling in general and knowledge-intensive tasks in particular.
arXiv Detail & Related papers (2024-04-17T18:13:16Z)
- LLM-based Medical Assistant Personalization with Short- and Long-Term Memory Coordination [20.269899169364397]
Large Language Models (LLMs) have exhibited remarkable proficiency in comprehending and generating natural language.
We propose a novel computational bionic memory mechanism, equipped with a parameter-efficient fine-tuning (PEFT) schema, to personalize medical assistants.
arXiv Detail & Related papers (2023-09-21T00:34:33Z)
- RET-LLM: Towards a General Read-Write Memory for Large Language Models [53.288356721954514]
RET-LLM is a novel framework that equips large language models with a general write-read memory unit.
Inspired by Davidsonian semantics theory, we extract and save knowledge in the form of triplets.
Our framework exhibits robust performance in handling temporal-based question answering tasks.
arXiv Detail & Related papers (2023-05-23T17:53:38Z)
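A minimal sketch of the triplet-based read-write memory the RET-LLM entry describes: knowledge is written as (subject, relation, object) triplets and read back by field matching. The API below is illustrative, not the paper's.

```python
# Illustrative triplet read-write memory; the API is an assumption.

class TripletMemory:
    def __init__(self):
        self.triplets = []  # list of (subject, relation, object)

    def write(self, subject: str, relation: str, obj: str) -> None:
        self.triplets.append((subject, relation, obj))

    def read(self, subject=None, relation=None, obj=None):
        # Return all triplets matching the non-None fields.
        return [t for t in self.triplets
                if (subject is None or t[0] == subject)
                and (relation is None or t[1] == relation)
                and (obj is None or t[2] == obj)]

mem = TripletMemory()
mem.write("Alice", "works_at", "Acme")   # write path, e.g. while reading text
print(mem.read(subject="Alice"))         # read path during generation
```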
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.