Fugu-MT Paper Translation (Abstract): Behind RoPE: How Does Causal Mask Encode Positional Information?

Paper overview: Behind RoPE: How Does Causal Mask Encode Positional Information?

arxiv url: http://arxiv.org/abs/2509.21042v1
Date: Thu, 25 Sep 2025 11:48:24 GMT
Status: Translation complete
System update date: 2025-09-26 20:58:12.876629
Title: Behind RoPE: How Does Causal Mask Encode Positional Information?
Authors: Junu Kim, Xiao Liu, Zhenghao Lin, Lei Ji, Yeyun Gong, Edward Choi
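Abstract summary: We show that the causal mask can induce position-dependent patterns in attention scores. The interaction between the causal mask and RoPE distorts RoPE's relative attention score patterns into non-relative ones.
Reference score (independently computed interest level): 40.16173166190937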
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: While explicit positional encodings such as RoPE are a primary source of positional information in Transformer decoders, the causal mask also provides positional information. In this work, we prove that the causal mask can induce position-dependent patterns in attention scores, even without parameters or causal dependency in the input. Our theoretical analysis indicates that the induced attention pattern tends to favor nearby query-key pairs, mirroring the behavior of common positional encodings. Empirical analysis confirms that trained models exhibit the same behavior, with learned parameters further amplifying these patterns. Notably, we found that the interaction of causal mask and RoPE distorts RoPE's relative attention score patterns into non-relative ones. We consistently observed this effect in modern large language models, suggesting the importance of considering the causal mask as a source of positional information alongside explicit positional encodings.
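The claim that the causal mask carries positional information on its own can be illustrated with a toy example. This is a minimal sketch and not the paper's construction or proof: the helper `causal_softmax` and the all-zero score matrix are assumptions chosen for illustration. Every query-key score is identical, so there is no content and no positional encoding, yet the masked softmax still yields different weights at different positions.

```python
import math

def causal_softmax(scores):
    """Row-wise softmax where query i may only attend to keys 0..i.

    Hypothetical helper for illustration; not taken from the paper.
    """
    weights = []
    for i, row in enumerate(scores):
        visible = row[: i + 1]              # the causal mask hides keys j > i
        m = max(visible)                    # subtract the max for numerical stability
        exps = [math.exp(s - m) for s in visible]
        total = sum(exps)
        # Masked (future) positions receive weight exactly 0.
        weights.append([e / total for e in exps] + [0.0] * (len(row) - i - 1))
    return weights

n = 4
# Content-free input: every score is the same, so nothing but the mask
# can distinguish one position from another.
flat_scores = [[0.0] * n for _ in range(n)]
attn = causal_softmax(flat_scores)

for i, row in enumerate(attn):
    print(i, [round(w, 3) for w in row])
```

Query i spreads its weight uniformly over the i + 1 visible keys, so each weight is 1 / (i + 1) and shrinks with the query's absolute position. This shows only that the mask breaks position symmetry; the paper's stronger results (a preference for nearby query-key pairs, and the distortion of RoPE's relative pattern) are not reproduced here.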

Related papers