From Experience to Strategy: Empowering LLM Agents with Trainable Graph Memory
- URL: http://arxiv.org/abs/2511.07800v1
- Date: Wed, 12 Nov 2025 01:19:13 GMT
- Title: From Experience to Strategy: Empowering LLM Agents with Trainable Graph Memory
- Authors: Siyu Xia, Zekun Xu, Jiajun Chai, Wentian Fan, Yan Song, Xiaohan Wang, Guojun Yin, Wei Lin, Haifeng Zhang, Jun Wang,
- Abstract summary: Large Language Models (LLMs) based agents have demonstrated remarkable potential in autonomous task-solving.<n>In this paper, we introduce a novel agent-centric, trainable, multi-layered graph memory framework.<n>We show how context memory enhances the ability of LLMs to utilize information.
- Score: 48.22750809620306
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models (LLMs) based agents have demonstrated remarkable potential in autonomous task-solving across complex, open-ended environments. A promising approach for improving the reasoning capabilities of LLM agents is to better utilize prior experiences in guiding current decisions. However, LLMs acquire experience either through implicit memory via training, which suffers from catastrophic forgetting and limited interpretability, or explicit memory via prompting, which lacks adaptability. In this paper, we introduce a novel agent-centric, trainable, multi-layered graph memory framework and evaluate how context memory enhances the ability of LLMs to utilize parametric information. The graph abstracts raw agent trajectories into structured decision paths in a state machine and further distills them into high-level, human-interpretable strategic meta-cognition. In order to make memory adaptable, we propose a reinforcement-based weight optimization procedure that estimates the empirical utility of each meta-cognition based on reward feedback from downstream tasks. These optimized strategies are then dynamically integrated into the LLM agent's training loop through meta-cognitive prompting. Empirically, the learnable graph memory delivers robust generalization, improves LLM agents' strategic reasoning performance, and provides consistent benefits during Reinforcement Learning (RL) training.
Related papers
- MAGE: Meta-Reinforcement Learning for Language Agents toward Strategic Exploration and Exploitation [11.222602737031101]
We propose MAGE, a meta-RL framework that empowers LLM agents for strategic exploration and exploitation.<n>MAGE utilizes a multi-episode training regime where interaction histories and reflections are integrated into the context window.<n>Experiment results show that MAGE outperforms existing baselines in both exploration and exploitation tasks.
arXiv Detail & Related papers (2026-03-04T03:14:37Z) - Memory-Based Advantage Shaping for LLM-Guided Reinforcement Learning [18.215893951726166]
In environments with sparse or delayed rewards, reinforcement learning incurs high sample complexity.<n>This limitation has motivated the use of large language models (LLMs) for subgoal discovery and trajectory guidance.<n>We address these challenges by constructing a memory graph that encodes subgoals and trajectories from both LLM guidance and the agent's own successful rollouts.
arXiv Detail & Related papers (2026-02-20T01:44:35Z) - MIRA: Memory-Integrated Reinforcement Learning Agent with Limited LLM Guidance [18.215893951726166]
Large language models (LLMs) can provide subgoal decompositions, plausible trajectories, and abstract priors that facilitate early learning.<n>We propose MIRA (Memory-Integrated Reinforcement Learning Agent), which incorporates a structured, evolving memory graph to guide early training.
arXiv Detail & Related papers (2026-02-20T01:43:30Z) - rSIM: Incentivizing Reasoning Capabilities of LLMs via Reinforced Strategy Injection [49.74493901036598]
Large language models (LLMs) are post-trained through reinforcement learning (RL) to evolve into Reasoning Language Models (RLMs)<n>This paper proposes a novel reinforced strategy injection mechanism (rSIM) that enables any LLM to become an RLM by employing a small planner.<n> Experimental results show that rSIM enables Qwen2.5-0.5B to become an RLM and significantly outperform Qwen2.5-14B.
arXiv Detail & Related papers (2025-12-09T06:55:39Z) - Learn to Memorize: Optimizing LLM-based Agents with Adaptive Memory Framework [33.739298910759544]
We propose to optimize LLM-based agents with an adaptive and data-driven memory framework by modeling memory cycles.<n>Specifically, we design an MoE gate function to facilitate memory retrieval, propose a learnable aggregation process to improve memory utilization, and develop task-specific reflection to adapt memory storage.
arXiv Detail & Related papers (2025-08-15T12:22:52Z) - Feedback-Induced Performance Decline in LLM-Based Decision-Making [6.5990946334144756]
Large Language Models (LLMs) can extract context from natural language problem descriptions.<n>This paper studies the behaviour of these models within a Markov Decision Process (MDPs)
arXiv Detail & Related papers (2025-07-20T10:38:56Z) - ReMA: Learning to Meta-think for LLMs with Multi-Agent Reinforcement Learning [53.817538122688944]
We introduce Reinforced Meta-thinking Agents (ReMA) to elicit meta-thinking behaviors from Reasoning of Large Language Models (LLMs)<n>ReMA decouples the reasoning process into two hierarchical agents: a high-level meta-thinking agent responsible for generating strategic oversight and plans, and a low-level reasoning agent for detailed executions.<n> Empirical results from single-turn experiments demonstrate that ReMA outperforms single-agent RL baselines on complex reasoning tasks.
arXiv Detail & Related papers (2025-03-12T16:05:31Z) - From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which utilizes step-wise reward to optimize the agent's reinforcement learning process.<n>We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
arXiv Detail & Related papers (2024-11-06T10:35:11Z) - AriGraph: Learning Knowledge Graph World Models with Episodic Memory for LLM Agents [18.369668601864575]
AriGraph is a memory graph that integrates semantic and episodic memories while exploring the environment.<n>We demonstrate that our Ariadne LLM agent effectively handles complex tasks within interactive text game environments difficult even for human players.
arXiv Detail & Related papers (2024-07-05T09:06:47Z) - One Token Can Help! Learning Scalable and Pluggable Virtual Tokens for Retrieval-Augmented Large Language Models [67.49462724595445]
Retrieval-augmented generation (RAG) is a promising way to improve large language models (LLMs)<n>We propose a novel method that involves learning scalable and pluggable virtual tokens for RAG.
arXiv Detail & Related papers (2024-05-30T03:44:54Z) - How Can LLM Guide RL? A Value-Based Approach [68.55316627400683]
Reinforcement learning (RL) has become the de facto standard practice for sequential decision-making problems by improving future acting policies with feedback.
Recent developments in large language models (LLMs) have showcased impressive capabilities in language understanding and generation, yet they fall short in exploration and self-improvement capabilities.
We develop an algorithm named LINVIT that incorporates LLM guidance as a regularization factor in value-based RL, leading to significant reductions in the amount of data needed for learning.
arXiv Detail & Related papers (2024-02-25T20:07:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.