Fugu-MT paper translation (summary): Mem-α: Learning Memory Construction via Reinforcement Learning

Paper summary: Mem-α: Learning Memory Construction via Reinforcement Learning

arxiv url: http://arxiv.org/abs/2509.25911v1
Date: Tue, 30 Sep 2025 08:02:34 GMT
Status: Translation complete
Last updated in system: 2025-10-01 14:45:00.060289
Title: Mem-α: Learning Memory Construction via Reinforcement Learning
Title (reference translation): Mem-α: Learning to Construct Memory through Reinforcement Learning
Authors: Yu Wang, Ryuichi Takanobu, Zhiqi Liang, Yuzhen Mao, Yuanzhe Hu, Julian McAuley, Xiaojian Wu
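Abstract summary: Large language model (LLM) agents are constrained by limited context windows. Current memory-augmented agents depend on pre-defined instructions and tools for memory updates. Mem-alpha is a reinforcement learning framework that trains agents to manage complex memory systems effectively.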
Reference score (independently computed attention score): 20.916677456417464
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large language model (LLM) agents are constrained by limited context windows, necessitating external memory systems for long-term information understanding. Current memory-augmented agents typically depend on pre-defined instructions and tools for memory updates. However, language models may lack the ability to determine which information to store, how to structure it, and when to update it, especially as memory systems become more complex. This results in suboptimal memory construction and information loss. To this end, we propose Mem-alpha, a reinforcement learning framework that trains agents to effectively manage complex memory systems through interaction and feedback. We also construct a specialized training dataset spanning diverse multi-turn interaction patterns paired with comprehensive evaluation questions designed to teach effective memory management. During training, agents process sequential information chunks, learn to extract and store relevant content, then update the memory system. The reward signal derives from downstream question-answering accuracy over the full interaction history, directly optimizing for memory construction. To illustrate the effectiveness of our training framework, we design a memory architecture comprising core, episodic, and semantic components, equipped with multiple tools for memory operations. Empirical evaluation demonstrates that Mem-alpha achieves significant improvements over existing memory-augmented agent baselines. Despite being trained exclusively on instances with a maximum length of 30k tokens, our agents exhibit remarkable generalization to sequences exceeding 400k tokens, over 13x the training length, highlighting the robustness of Mem-alpha.
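Abstract (reference translation): Large language model (LLM) agents are limited by their context windows and therefore need external memory systems to understand information over the long term. Current memory-augmented agents usually rely on pre-defined instructions and tools to update memory. However, a language model may be unable to judge which information to store, how to structure it, and when to update it, particularly as memory systems grow more complex. The result is suboptimal memory construction and information loss. To address this, the authors propose Mem-alpha, a reinforcement learning framework that trains agents to manage complex memory systems effectively through interaction and feedback. They also build a dedicated training dataset that covers diverse multi-turn interaction patterns and pairs them with comprehensive evaluation questions designed to teach effective memory management. During training, the agent processes information chunks in sequence, learns to extract and store the relevant content, and then updates the memory system. The reward signal is derived from downstream question-answering accuracy over the full interaction history, so training optimizes memory construction directly. To demonstrate the framework, the authors design a memory architecture made up of core, episodic, and semantic components, equipped with multiple tools for memory operations. Empirical evaluation shows that Mem-alpha improves significantly on existing memory-augmented agent baselines. Although the agents are trained only on instances of at most 30k tokens, they generalize to sequences of more than 400k tokens, over 13x the training length, which highlights the robustness of Mem-alpha.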
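The training signal described in the abstract (process chunks in order, update memory, then score the final memory by downstream question answering) can be illustrated with a small sketch. This is not the authors' implementation: the `Memory` field layout, the `rollout_reward` function, and the toy `store_everything`, `store_nothing`, and `lookup_answerer` helpers are all hypothetical names introduced here, exact-match scoring is an assumed stand-in for the paper's accuracy measure, and the policy-update step of reinforcement learning is omitted because the abstract does not describe it.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Memory:
    """Toy three-part memory (core, episodic, semantic); the layout is an assumption."""
    core: str = ""
    episodic: List[str] = field(default_factory=list)
    semantic: List[str] = field(default_factory=list)

    def dump(self) -> str:
        # Flatten all components into one text blob for the reader to search.
        return "\n".join([self.core, *self.episodic, *self.semantic])


# Hypothetical interfaces: a policy writes to memory, an answerer reads from it.
Policy = Callable[[Memory, str], None]
Answerer = Callable[[Memory, str], str]


def rollout_reward(
    policy: Policy,
    answerer: Answerer,
    chunks: List[str],
    qa_pairs: List[Tuple[str, str]],
) -> float:
    """Feed chunks sequentially, then return QA accuracy over the final memory."""
    memory = Memory()
    for chunk in chunks:
        policy(memory, chunk)  # the policy decides what to store or update
    if not qa_pairs:
        return 0.0
    correct = sum(
        answerer(memory, question).strip().lower() == answer.strip().lower()
        for question, answer in qa_pairs
    )
    return correct / len(qa_pairs)


def store_everything(memory: Memory, chunk: str) -> None:
    """Toy policy: append every chunk to episodic memory."""
    memory.episodic.append(chunk)


def store_nothing(memory: Memory, chunk: str) -> None:
    """Toy policy: discard every chunk."""


def lookup_answerer(memory: Memory, question: str) -> str:
    """Toy reader: last word of the first memory line that mentions the question keyword."""
    for line in memory.dump().splitlines():
        if question.lower() in line.lower():
            return line.split()[-1]
    return ""
```

In this sketch a policy that stores nothing earns zero reward while one that keeps the relevant facts earns full reward, which is the sense in which a question-answering reward optimizes memory construction directly rather than any individual memory operation.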

Related papers