Fugu-MT Paper Translation (Abstract): A Parametric Memory Head for Continual Generative Retrieval

Paper summary: A Parametric Memory Head for Continual Generative Retrieval

arxiv url: http://arxiv.org/abs/2604.23388v1
Date: Sat, 25 Apr 2026 17:38:51 GMT
Status: Translation complete
Last updated in system: 2026-04-28 17:12:07.311463
Title: A Parametric Memory Head for Continual Generative Retrieval
Authors: Kidist Amde Mekonnen, Yubao Tang, Maarten de Rijke
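Abstract summary: Generative information retrieval (GenIR) consolidates retrieval into a single neural model that decodes document identifiers (docids) directly from queries. The authors show that sequential adaptation improves retrieval on newly added documents but substantially degrades performance on earlier slices. They propose post-adaptation memory tuning (PAMT), a memory-only stabilization stage that augments an adapted model with a modular parametric memory head.
Reference score (independently computed attention metric): 52.66674234249913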
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Generative information retrieval (GenIR) consolidates retrieval into a single neural model that decodes document identifiers (docids) directly from queries. While this model-as-index paradigm offers architectural simplicity, it is poorly suited to dynamic document collections. Unlike modular systems, where indexes are easily updated, GenIR's knowledge is parametrically encoded in its weights; consequently, standard adaptation methods such as full and parameter-efficient fine-tuning can induce catastrophic forgetting. We show that sequential adaptation improves retrieval on newly added documents but substantially degrades performance on earlier slices, exposing a pronounced stability-plasticity trade-off. To address this, we propose post-adaptation memory tuning (PAMT), a memory-only stabilization stage that augments an adapted model with a modular parametric memory head (PMH). PAMT freezes the backbone and attaches a product-key memory with fixed addressing. During prefix-trie constrained decoding, decoder hidden states sparsely query PMH to produce residual corrections in hidden space; these corrections are mapped to score adjustments via the frozen output embedding matrix, computed only over trie-valid tokens. This guides docid generation while keeping routing and backbone parameters fixed. To limit cross-slice interference, PAMT updates only a fixed budget of memory values selected using decoding-time access statistics, prioritizing entries frequently activated by the current slice and rarely used in prior sessions. Experiments on MS MARCO and Natural Questions under sequential, disjoint corpus increments show that PAMT substantially improves retention on earlier slices with minimal impact on retrieval performance for newly added documents, while modifying only a sparse subset of memory values per session.
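The two mechanisms named in the abstract, budgeted selection of memory values and score adjustment over trie-valid tokens, can be illustrated with a minimal pure-Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the function names `select_slots` and `adjusted_scores`, the scoring rule (current-slice access count divided by one plus the prior-session count), and the toy numbers. The sparse product-key lookup that produces the residual is not modeled; the residual vector is simply passed in.

```python
from typing import Dict, List, Sequence


def select_slots(current_counts: Dict[int, int],
                 prior_counts: Dict[int, int],
                 budget: int) -> List[int]:
    """Pick at most `budget` memory slots to update in this session.

    Assumed scoring rule (hypothetical, not from the paper): slots accessed
    often by the current slice and rarely in prior sessions rank highest.
    """
    def score(slot: int) -> float:
        return current_counts[slot] / (1.0 + prior_counts.get(slot, 0))

    # Sort by descending score; break ties by slot id for determinism.
    ranked = sorted(current_counts, key=lambda s: (-score(s), s))
    return ranked[:budget]


def adjusted_scores(base_logits: Dict[int, float],
                    residual: Sequence[float],
                    output_embeddings: Dict[int, Sequence[float]],
                    valid_tokens: Sequence[int]) -> Dict[int, float]:
    """Add the memory correction to the scores of trie-valid tokens only.

    The hidden-space residual is projected through the (frozen) output
    embedding row of each valid token; other tokens are never scored.
    """
    out: Dict[int, float] = {}
    for tok in valid_tokens:
        delta = sum(r * e for r, e in zip(residual, output_embeddings[tok]))
        out[tok] = base_logits[tok] + delta
    return out


# Toy access statistics: slot id -> number of decoding-time activations.
current = {0: 9, 1: 8, 2: 1, 3: 5}
prior = {0: 20, 1: 0, 3: 1}
chosen = select_slots(current, prior, budget=2)

# Toy decoding step: only tokens 7 and 8 are allowed by the prefix trie.
embeddings = {7: [1.0, 0.0], 8: [0.0, 1.0], 9: [2.0, 2.0]}
base = {7: 0.0, 8: 0.25, 9: 3.0}
scores = adjusted_scores(base, [0.5, -1.0], embeddings, valid_tokens=[7, 8])
```

In this toy run, slot 0 is the most active in the current slice but is heavily used by prior sessions, so the assumed rule passes over it in favor of slots 1 and 3. Token 9 never receives a score because it is not trie-valid.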
