Fugu-MT Paper Translation (Abstract): EASE-TTT: Evidence-Aligned Selective Test-Time Training for Long-Context Question Answering

Paper overview: EASE-TTT: Evidence-Aligned Selective Test-Time Training for Long-Context Question Answering

arxiv url: http://arxiv.org/abs/2606.06906v1
Date: Fri, 05 Jun 2026 04:49:37 GMT
Status: Translation complete
System update date: 2026-06-08 14:33:29.571379
Title: EASE-TTT: Evidence-Aligned Selective Test-Time Training for Long-Context Question Answering
Authors: Xiaopeng Yuan, Zebin Wang, Suwen Wang, Zongxin Yang, Haohan Wang, Yushun Dong
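Abstract summary: The paper proposes Evidence-Aligned Selective Test-Time Training (EASE-TTT) for long-context question answering. EASE-TTT converts selected evidence chunks into a soft attention supervision target over their token positions. It achieves the strongest macro-average performance among full-context inference, retrieval-only baselines, and qTTT.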
Reference score (attention metric computed independently by Fugu-MT): 61.89411578705886
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Long-context question answering (QA) remains challenging for smaller language models even when answer-bearing evidence is already present in the input. Existing within-context retrieval methods localize and expose candidate evidence chunks for the question, but they stop at input-level evidence exposure rather than adapting the query-side attention parameters that control how the model allocates attention over full-context positions. In contrast, lightweight test-time adaptation methods, such as query-only test-time training (qTTT), leave evidence localization unresolved because their generic span-level self-supervised objectives do not identify which context positions support the current answer. In this paper, we propose Evidence-Aligned SElective Test-Time Training (EASE-TTT), a within-context retrieval-augmented test-time training framework that converts selected evidence chunks into a soft attention supervision target over their token positions. Instead of replacing the full context with retrieved chunks, EASE-TTT uses the resulting attention target to guide query-side adaptation, with the adapted model generating the final answer from the original full context. Experiments on six LongBench QA tasks and three small decoder-only language models show that EASE-TTT achieves the strongest macro-average performance among full-context inference, retrieval-only baselines, and qTTT, supporting evidence-aligned test-time adaptation in long-context QA.
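The abstract describes turning selected evidence chunks into a soft attention target over their token positions and using that target to guide query-side adaptation. The sketch below illustrates one plausible reading of that idea in plain Python. It is not the authors' implementation: the function names, the `(start, end, score)` span format, the `smoothing` parameter, and the choice of a cross-entropy alignment loss are all assumptions made for illustration, since the abstract does not specify the exact objective or which attention heads are adapted.

```python
import math


def evidence_attention_target(num_tokens, evidence_spans, smoothing=0.0):
    """Build a soft target distribution over context token positions.

    evidence_spans: list of (start, end, score) tuples, where [start, end)
    is a half-open token range and score is a retrieval weight.
    This span format is hypothetical, not taken from the paper.
    """
    if num_tokens <= 0:
        raise ValueError("num_tokens must be positive")
    mass = [0.0] * num_tokens
    for start, end, score in evidence_spans:
        # Spread each chunk's score over the tokens it covers.
        for pos in range(max(0, start), min(num_tokens, end)):
            mass[pos] += score
    total = sum(mass)
    if total == 0.0:
        # No usable evidence: fall back to a uniform target.
        return [1.0 / num_tokens] * num_tokens
    target = [m / total for m in mass]
    # Optionally mix with a uniform distribution (assumed, for stability).
    return [(1.0 - smoothing) * t + smoothing / num_tokens for t in target]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    norm = sum(exps)
    return [e / norm for e in exps]


def attention_alignment_loss(attn_logits, target):
    """Cross-entropy between the target and softmax(attn_logits).

    A lower value means the attention puts more mass on evidence tokens.
    """
    probs = softmax(attn_logits)
    return -sum(
        t * math.log(max(p, 1e-12))
        for t, p in zip(target, probs)
        if t > 0.0
    )


# Usage: an 8-token context with two evidence chunks of different weight.
target = evidence_attention_target(8, [(2, 4, 1.0), (6, 7, 0.5)])
uniform_loss = attention_alignment_loss([0.0] * 8, target)
focused_loss = attention_alignment_loss(
    [0.0, 0.0, 3.0, 3.0, 0.0, 0.0, 2.0, 0.0], target
)
```

In a real test-time training loop, a loss of this kind would be differentiated with respect to the query-side attention parameters only, and the adapted model would then answer from the original full context, as the abstract states. The gradient step itself is omitted here because it depends on the model and framework.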
