ParamMem: Augmenting Language Agents with Parametric Reflective Memory
- URL: http://arxiv.org/abs/2602.23320v2
- Date: Fri, 27 Feb 2026 08:21:31 GMT
- Title: ParamMem: Augmenting Language Agents with Parametric Reflective Memory
- Authors: Tianjun Yao, Yongqiang Chen, Yujia Zheng, Pan Li, Zhiqiang Shen, Kun Zhang
- Abstract summary: Self-reflection enables language agents to iteratively refine solutions, yet often produces repetitive outputs that limit reasoning performance. We introduce ParamMem, a parametric memory module that encodes cross-sample reflection patterns into model parameters. We propose ParamAgent, a reflection-based agent framework that integrates parametric memory with episodic and cross-sample memory.
- Score: 50.28529749962535
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Self-reflection enables language agents to iteratively refine solutions, yet often produces repetitive outputs that limit reasoning performance. Recent studies have attempted to address this limitation through various approaches, among which increasing reflective diversity has shown promise. Our empirical analysis reveals a strong positive correlation between reflective diversity and task success, further motivating the need for diverse reflection signals. We introduce ParamMem, a parametric memory module that encodes cross-sample reflection patterns into model parameters, enabling diverse reflection generation through temperature-controlled sampling. Building on this module, we propose ParamAgent, a reflection-based agent framework that integrates parametric memory with episodic and cross-sample memory. Extensive experiments on code generation, mathematical reasoning, and multi-hop question answering demonstrate consistent improvements over state-of-the-art baselines. Further analysis reveals that ParamMem is sample-efficient, enables weak-to-strong transfer across model scales, and supports self-improvement without reliance on a stronger external model, highlighting the potential of ParamMem as an effective component for enhancing language agents.
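The abstract attributes ParamMem's diverse reflections to temperature-controlled sampling. The sketch below is a generic illustration of that idea, not the authors' implementation: it scales a set of scores by a temperature before applying softmax and shows that a higher temperature spreads probability mass over more candidates. The function names and the `REFLECTION_LOGITS` values are hypothetical and chosen only for illustration.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Return softmax(logits / temperature) as a list of probabilities."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; higher means a more spread-out distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def sample_indices(logits, temperature, n, seed=0):
    """Draw n candidate indices from the temperature-scaled distribution."""
    rng = random.Random(seed)
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=n)


# Hypothetical scores for five candidate reflections (illustrative values only).
REFLECTION_LOGITS = [3.0, 1.5, 1.0, 0.5, 0.0]

if __name__ == "__main__":
    for t in (0.3, 1.0, 2.0):
        probs = softmax_with_temperature(REFLECTION_LOGITS, t)
        draws = sample_indices(REFLECTION_LOGITS, t, n=20)
        print(f"T={t}: entropy={entropy(probs):.3f}, distinct={len(set(draws))}")
```

In this toy setting, low temperatures concentrate almost all probability on the top-scoring candidate, which mirrors the repetitive-output problem the abstract describes, while higher temperatures trade some of that concentration for variety.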
Related papers
- Learning from Supervision with Semantic and Episodic Memory: A Reflective Approach to Agent Adaptation [11.819481846962447]
We investigate how agents built on pretrained large language models can learn target classification functions from labeled examples without parameter updates. Our framework uses episodic memory to store instance-level critiques and distill these into reusable, task-level guidance. Our findings highlight the promise of memory-driven, reflective learning for building more adaptive and interpretable LLM agents.
arXiv Detail & Related papers (2025-10-22T17:58:03Z)
- Sample-Efficient Online Learning in LM Agents via Hindsight Trajectory Rewriting [92.57796055887995]
We introduce ECHO, a prompting framework that adapts hindsight experience replay from reinforcement learning for language model agents. ECHO generates optimized trajectories for alternative goals that could have been achieved during failed attempts. We evaluate ECHO on stateful versions of XMiniGrid, a text-based navigation and planning benchmark, and PeopleJoinQA, a collaborative information-gathering enterprise simulation.
arXiv Detail & Related papers (2025-10-11T18:11:09Z)
- SAMULE: Self-Learning Agents Enhanced by Multi-level Reflection [14.40651157974557]
SAMULE is a new framework for self-learning agents powered by a retrospective language model that is trained based on Multi-Level Reflection Synthesis. It first synthesizes high-quality reflections across three complementary levels: Single-Trajectory Learning (micro-level) for detailed error correction; Intra-Task Learning (meso-level) to build error taxonomies across multiple trials of the same task; and Inter-Task Learning (macro-level) to extract transferable insights based on same-typed errors from diverse task failures.
arXiv Detail & Related papers (2025-09-24T21:02:15Z)
- Meta-Policy Reflexion: Reusable Reflective Memory and Rule Admissibility for Resource-Efficient LLM Agent [6.300669721057781]
Meta-Policy Reflexion (MPR) is a framework that consolidates LLM-generated reflections into a structured, predicate-like Meta-Policy Memory (MPM). MPR externalizes reusable corrective knowledge without model weight updates, enforces domain constraints to reduce unsafe or invalid actions, and retains the adaptability of language-based reflection. Empirical results reported in the supplied material indicate consistent gains in execution accuracy and robustness when compared to Reflexion baselines; rule admissibility further improves stability.
arXiv Detail & Related papers (2025-09-04T08:18:39Z)
- ReAgent-V: A Reward-Driven Multi-Agent Framework for Video Understanding [71.654781631463]
ReAgent-V is a novel agentic video understanding framework. It integrates efficient frame selection with real-time reward generation during inference. Extensive experiments on 12 datasets demonstrate significant gains in generalization and reasoning.
arXiv Detail & Related papers (2025-06-02T04:23:21Z)
- ReMA: Learning to Meta-think for LLMs with Multi-Agent Reinforcement Learning [53.817538122688944]
We introduce Reinforced Meta-thinking Agents (ReMA) to elicit meta-thinking behaviors from Reasoning of Large Language Models (LLMs). ReMA decouples the reasoning process into two hierarchical agents: a high-level meta-thinking agent responsible for generating strategic oversight and plans, and a low-level reasoning agent for detailed executions. Empirical results from single-turn experiments demonstrate that ReMA outperforms single-agent RL baselines on complex reasoning tasks.
arXiv Detail & Related papers (2025-03-12T16:05:31Z)
- Instruct-of-Reflection: Enhancing Large Language Models Iterative Reflection Capabilities via Dynamic-Meta Instruction [11.838351314880736]
Instruct-of-Reflection (IoRT) is a novel and general reflection framework that leverages dynamic-meta instruction to enhance the iterative reflection capability of Large Language Models (LLMs). Our experiments demonstrate that IoRT achieves an average improvement of 10.1% over established baselines in mathematical and commonsense reasoning tasks.
arXiv Detail & Related papers (2025-03-02T14:02:03Z)
- Meta-Reflection: A Feedback-Free Reflection Learning Framework [57.14485943991588]
We propose Meta-Reflection, a feedback-free reflection mechanism that requires only a single inference pass without external feedback. Motivated by the human ability to remember and retrieve reflections from past experiences, Meta-Reflection integrates reflective insights into a codebook. To thoroughly investigate and evaluate the practicality of Meta-Reflection in real-world scenarios, we introduce an industrial e-commerce benchmark named E-commerce Customer Intent Detection.
arXiv Detail & Related papers (2024-12-18T12:20:04Z)
- Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection [74.51523859064802]
We introduce a new framework called Self-Reflective Retrieval-Augmented Generation (Self-RAG).
Self-RAG enhances an LM's quality and factuality through retrieval and self-reflection.
It significantly outperforms state-of-the-art LLMs and retrieval-augmented models on a diverse set of tasks.
arXiv Detail & Related papers (2023-10-17T18:18:32Z)
- Reflexion: Language Agents with Verbal Reinforcement Learning [44.85337947858337]
Reflexion is a novel framework to reinforce language agents not by updating weights, but through linguistic feedback.
It is flexible enough to incorporate various types (scalar values or free-form language) and sources (external or internally simulated) of feedback signals.
For example, Reflexion achieves a 91% pass@1 accuracy on the HumanEval coding benchmark, surpassing the previous state-of-the-art GPT-4 that achieves 80%.
arXiv Detail & Related papers (2023-03-20T18:08:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.