Fugu-MT Paper Translation (Summary): MemTrace: Tracing and Attributing Errors in Large Language Model Memory Systems

Paper summary: MemTrace: Tracing and Attributing Errors in Large Language Model Memory Systems

arxiv url: http://arxiv.org/abs/2605.28732v1
Date: Wed, 27 May 2026 16:53:53 GMT
Status: Translation complete
Last updated in system: 2026-05-28 17:38:56.229301
Title: MemTrace: Tracing and Attributing Errors in Large Language Model Memory Systems
Title (reference translation): MemTrace: Tracing and Attributing Errors in Large Language Model Memory Systems
Authors: Xinle Deng, Ruobin Zhong, Hujin Peng, Xiaoben Lu, Yanzhe Wu, Guang Li, Buqiang Xu, Yunzhi Yao, Jizhan Fang, Haoliang Cao, Junjie Guo, Yuan Yuan, Ziqing Ma, Yuanqiang Yu, Rui Hu, Baohua Dong, Hangcheng Zhu, Ningyu Zhang
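Abstract summary: We study the new problem of error tracing and attribution in LLM memory systems. We propose a novel framework that transforms memory pipelines into executable memory evolution graphs. We then construct MemTraceBench, a benchmark collected from memory systems such as Long-Context, RAG, Mem0, and EverMemOS.
Reference score (attention score computed independently by the site): 24.22940060094778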
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Memory is essential for enabling large language models to support long-horizon reasoning, yet existing memory systems remain unreliable and difficult to debug. Tracing memory's dynamic evolution is crucial to understand how information is synthesized, propagated, or corrupted over time. In this work, we study the new problem of error tracing and attribution in LLM memory systems. We propose a novel framework that transforms memory pipelines into executable memory evolution graphs, enabling fine-grained tracing of operational information flow. We then construct MemTraceBench, a benchmark collected from representative memory systems such as Long-Context, RAG, Mem0, and EverMemOS, to systematically study memory failure modes. We further introduce an automatic attribution method that iteratively traces operation subgraphs to pinpoint the root cause of any failed case. Our analysis reveals that memory failures are systematic, stemming from operation-level issues like information loss and retrieval misalignment. Crucially, we leverage these fine-grained attribution signals to guide downstream prompt optimization, establishing a closed-loop system that automatically corrects faults and boosts end-task performance by up to 7.62%. Code will be released at https://github.com/zjunlp/MemTrace.
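Abstract (reference translation): Memory is essential for large language models to support long-horizon reasoning, but existing memory systems are unreliable and hard to debug. Tracing the dynamic evolution of memory is important for understanding how information is synthesized, propagated, or corrupted over time. This work studies the new problem of error tracing and attribution in LLM memory systems. It proposes a novel framework that transforms memory pipelines into executable memory evolution graphs. It then constructs MemTraceBench, a benchmark collected from representative memory systems such as Long-Context, RAG, Mem0, and EverMemOS, to study memory failure modes systematically. It further introduces an automatic attribution method that iteratively traces operation subgraphs to identify the root cause of a failed case. The analysis shows that memory failures are systematic and stem from operation-level issues such as information loss and retrieval misalignment. These fine-grained attribution signals are then used to guide downstream prompt optimization, establishing a closed-loop system that automatically corrects faults and improves end-task performance by up to 7.62%. Code will be released at https://github.com/zjunlp/MemTrace.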
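The abstract gives no implementation details, so the following is only an illustrative sketch of what tracing a failure back through a graph of memory operations might look like. The `OpNode` schema, the `trace_root_cause` function, the operation names, and the substring check are all assumptions made for this example; they are not taken from the paper or from the MemTrace code, and a real system would presumably use a model-based judgment rather than substring matching.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class OpNode:
    """One memory operation in a toy evolution graph (hypothetical schema)."""
    op_id: str
    kind: str                      # e.g. "ingest", "summarize", "retrieve", "answer"
    output: str                    # text produced by this operation
    parents: list[str] = field(default_factory=list)  # ids of upstream operations


def trace_root_cause(graph: dict[str, OpNode], failed_id: str, fact: str) -> Optional[str]:
    """Walk upstream from a failed node and return the id of an operation whose
    output lacks `fact` although at least one of its parents still contains it.
    Returns None if no such operation is reachable from `failed_id`.
    Substring matching stands in for whatever check a real system would use."""
    frontier = [failed_id]
    seen: set[str] = set()
    while frontier:
        node_id = frontier.pop()
        if node_id in seen:
            continue
        seen.add(node_id)
        node = graph[node_id]
        if fact in node.output:
            continue  # the fact survives here, so this branch is not at fault
        if any(fact in graph[p].output for p in node.parents):
            return node_id  # the fact was present upstream and lost at this step
        frontier.extend(node.parents)
    return None


# Toy pipeline: the summarize step drops the city name.
graph = {
    "ingest": OpNode("ingest", "ingest", "User said they moved to Paris in 2021."),
    "summarize": OpNode("summarize", "summarize", "User moved in 2021.", ["ingest"]),
    "retrieve": OpNode("retrieve", "retrieve", "User moved in 2021.", ["summarize"]),
    "answer": OpNode("answer", "answer", "I don't know where the user lives.", ["retrieve"]),
}

culprit = trace_root_cause(graph, "answer", "Paris")
print(culprit)
```

In this toy graph the search blames the `summarize` step, which corresponds to the "information loss" failure type named in the abstract. A fact that never entered the pipeline yields `None`, which in this sketch simply means no operation can be blamed for losing it.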

Related papers