Fugu-MT Paper Translation (Summary): Beyond Semantics: How Temporal Biases Shape Retrieval in Transformer and State-Space Models

Paper summary: Beyond Semantics: How Temporal Biases Shape Retrieval in Transformer and State-Space Models

arxiv url: http://arxiv.org/abs/2510.22752v1
Date: Sun, 26 Oct 2025 17:01:41 GMT
Status: Translation complete
System update date: 2025-10-28 15:28:15.35525
Title: Beyond Semantics: How Temporal Biases Shape Retrieval in Transformer and State-Space Models
Authors: Anooshka Bajaj, Deven Mahesh Mistry, Sahaj Singh Maini, Yash Aggarwal, Zoran Tiganj
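Abstract summary: In-context learning is governed by both temporal and semantic relationships. This work probes the ability of various pretrained Large Language Models (LLMs) to differentiate and retrieve temporally separated events. The findings deepen the understanding of temporal biases in in-context learning and illustrate how these biases can enable temporal separation and episodic retrieval.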
Reference score (independently computed attention metric): 4.69761138328817
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: In-context learning is governed by both temporal and semantic relationships, shaping how Large Language Models (LLMs) retrieve contextual information. Analogous to human episodic memory, where the retrieval of specific events is enabled by separating events that happened at different times, this work probes the ability of various pretrained LLMs, including transformer and state-space models, to differentiate and retrieve temporally separated events. Specifically, we prompted models with sequences containing multiple presentations of the same token, which reappears at the sequence end. By fixing the positions of these repeated tokens and permuting all others, we removed semantic confounds and isolated temporal effects on next-token prediction. Across diverse sequences, models consistently placed the highest probabilities on tokens following a repeated token, but with a notable bias for those nearest the beginning or end of the input. An ablation experiment linked this phenomenon in transformers to induction heads. Extending the analysis to unique semantic contexts with partial overlap further demonstrated that memories embedded in the middle of a prompt are retrieved less reliably. Despite architectural differences, state-space and transformer models showed comparable temporal biases. Our findings deepen the understanding of temporal biases in in-context learning and offer an illustration of how these biases can enable temporal separation and episodic retrieval.
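The prompt construction described in the abstract (repeated tokens at fixed positions, all other tokens permuted, and the repeated token shown once more at the end) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names `build_permuted_prompts` and `followers`, the placeholder tokens, and the chosen positions are all hypothetical, and the step of scoring next-token probabilities with a model is not shown.

```python
import random


def build_permuted_prompts(filler_tokens, repeated_token, repeat_positions,
                           n_permutations, seed=0):
    """Build token sequences that differ only in the order of the fillers.

    Hypothetical helper: `repeated_token` is placed at every index in
    `repeat_positions`, the fillers are reshuffled for each sequence, and
    the repeated token is appended once more as the final cue.
    Assumes `repeated_token` does not occur among `filler_tokens`.
    """
    rng = random.Random(seed)
    body_len = len(filler_tokens) + len(repeat_positions)
    fixed = set(repeat_positions)
    if len(fixed) != len(repeat_positions):
        raise ValueError("repeat_positions must be distinct")
    if any(p < 0 or p >= body_len for p in fixed):
        raise ValueError("repeat_positions must lie inside the sequence body")

    prompts = []
    for _ in range(n_permutations):
        shuffled = list(filler_tokens)
        rng.shuffle(shuffled)  # permute everything except the fixed slots
        fillers = iter(shuffled)
        body = [repeated_token if i in fixed else next(fillers)
                for i in range(body_len)]
        prompts.append(body + [repeated_token])  # cue at the sequence end
    return prompts


def followers(prompt, repeated_token):
    """Tokens that directly follow each earlier occurrence of the cue.

    These are the candidates whose predicted probability would be compared
    across serial positions; the final cue itself has no follower.
    """
    return [prompt[i + 1] for i in range(len(prompt) - 1)
            if prompt[i] == repeated_token]


if __name__ == "__main__":
    demo = build_permuted_prompts(list("abcdef"), "X", [1, 4, 6], 3)
    for p in demo:
        print(" ".join(p), "| followers:", followers(p, "X"))
```

Averaging a model's probability for each follower over many such permutations is one way to separate the effect of serial position from the identity of any particular token, which is the role the abstract assigns to the permutation step.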
