Fugu-MT Paper Translation (Abstract): Jordan-RoPE: Non-Semisimple Relative Positional Encoding via Complex Jordan Blocks

Paper abstract: Jordan-RoPE: Non-Semisimple Relative Positional Encoding via Complex Jordan Blocks

arxiv url: http://arxiv.org/abs/2605.04217v1
Date: Tue, 05 May 2026 18:59:31 GMT
Status: Translation complete
System update date: 2026-05-07 18:41:07.496745
Title: Jordan-RoPE: Non-Semisimple Relative Positional Encoding via Complex Jordan Blocks
Title (reference translation): Jordan-RoPE: Non-semisimple relative positional encoding via complex Jordan blocks
Authors: Yaobo Zhang
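Abstract summary: Relative positional encodings determine which functions of query-key lag can enter the primitive attention logit. We study a non-semisimple case in which a complex rotary eigenvalue and a nilpotent response live in the same defective Jordan block. The construction realizes a distance-modulated phase basis $d e^{iωd}$, rather than merely adding a separate distance channel to RoPE.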
Reference score (independently computed attention measure): 0.36260136172126667
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Relative positional encodings determine which functions of query-key lag can enter the primitive attention logit. RoPE supplies a rotary phase, while ALiBi supplies an additive distance bias. Motivated by group-theoretic views of linear translation-invariant positional encodings, we study a non-semisimple case in which a complex rotary eigenvalue and a nilpotent response live in the same defective Jordan block. The resulting relative operator generates oscillatory-polynomial features such as $e^{-γd}\cos(ωd)$, $e^{-γd}\sin(ωd)$, $d e^{-γd}\cos(ωd)$, and $d e^{-γd}\sin(ωd)$, for causal lag $d=i-j\geq 0$. Thus the construction realizes a distance-modulated phase basis $d e^{iωd}$, rather than merely adding a separate distance channel to RoPE. We formulate Exact Jordan-RoPE as a non-semisimple one-parameter representation, give its real block form, and specify the contragredient query action required by non-orthogonal positional maps. We also distinguish this exact representation from stabilized variants whose bounded shear improves numerical behavior but breaks the exact group law. Kernel-level diagnostics and a Jordan-friendly synthetic language-model task show that the coupled Jordan basis is useful when the target contains distance-modulated phase interactions. On a small WikiText-103 byte language model, a scaled-exact variant improves over RoPE and direct-sum baselines within the Jordan family, while RoPE+ALiBi remains strongest overall. The evidence is structural rather than a broad performance claim.
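Abstract (reference translation): Relative positional encodings determine which functions of the query-key lag can enter the primitive attention logit. RoPE supplies a rotary phase, and ALiBi supplies an additive distance bias. Motivated by group-theoretic views of linear translation-invariant positional encodings, we study the non-semisimple case in which a complex rotary eigenvalue and a nilpotent response live in the same defective Jordan block. For causal lag $d=i-j\geq 0$, the resulting relative operator generates oscillatory-polynomial features such as $e^{-γd}\cos(ωd)$, $e^{-γd}\sin(ωd)$, $d e^{-γd}\cos(ωd)$, and $d e^{-γd}\sin(ωd)$. The construction therefore realizes a distance-modulated phase basis $d e^{iωd}$, rather than merely adding a separate distance channel to RoPE. We formulate Exact Jordan-RoPE as a non-semisimple one-parameter representation, give its real block form, and specify the contragredient query action required by non-orthogonal positional maps. We also distinguish this exact representation from stabilized variants, whose bounded shear improves numerical behavior but breaks the exact group law. Kernel-level diagnostics and a Jordan-friendly synthetic language-model task show that the coupled Jordan basis is useful when the target contains distance-modulated phase interactions. On a small WikiText-103 byte language model, a scaled-exact variant improves over RoPE and the direct-sum baselines within the Jordan family, while RoPE+ALiBi remains strongest overall. The evidence is structural rather than a broad performance claim.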
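The abstract's claim about oscillatory-polynomial features can be checked numerically. The following is a minimal NumPy sketch, not the paper's implementation: it assumes a single complex Jordan block with eigenvalue $λ = -γ + iω$, written as a real 4x4 generator, and assumes the convention that keys are mapped by $R(-j)$ and queries by the transpose $R(i)^T$ (one way to realize a contragredient query action). The function names `jordan_generator`, `jordan_rope`, and `expm_taylor`, and the parameter values, are hypothetical choices for illustration.

```python
import numpy as np

def rot(theta):
    """2x2 rotation matrix by angle theta."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def jordan_generator(gamma, omega):
    """Real 4x4 form of the complex Jordan block [[lam, 1], [0, lam]],
    lam = -gamma + i*omega (assumed parameterization)."""
    L = np.array([[-gamma, -omega], [omega, -gamma]])
    A = np.zeros((4, 4))
    A[:2, :2] = L          # diagonal blocks: the complex eigenvalue
    A[2:, 2:] = L
    A[:2, 2:] = np.eye(2)  # off-diagonal block: the nilpotent coupling
    return A

def jordan_rope(d, gamma, omega):
    """Closed form of exp(d * A): [[E, d*E], [0, E]] with E = e^{-gamma d} rot(omega d)."""
    E = np.exp(-gamma * d) * rot(omega * d)
    R = np.zeros((4, 4))
    R[:2, :2] = E
    R[2:, 2:] = E
    R[:2, 2:] = d * E
    return R

def expm_taylor(M, terms=80):
    """Truncated Taylor series for the matrix exponential (adequate for small norms)."""
    out = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for n in range(1, terms):
        term = term @ M / n
        out = out + term
    return out

gamma, omega = 0.1, 0.7   # illustrative values only
R3 = jordan_rope(3.0, gamma, omega)

# The closed form agrees with the matrix exponential of the generator.
closed_form_ok = np.allclose(R3, expm_taylor(3.0 * jordan_generator(gamma, omega)))

# Exact group law: R(a) R(b) = R(a + b).
group_law_ok = np.allclose(
    jordan_rope(2.0, gamma, omega) @ jordan_rope(5.0, gamma, omega),
    jordan_rope(7.0, gamma, omega),
)

# Relative logit: transformed query and key depend on positions only through d = i - j.
rng = np.random.default_rng(0)
q, k = rng.normal(size=4), rng.normal(size=4)
i, j = 9, 4
q_tilde = jordan_rope(i, gamma, omega).T @ q   # query action (transpose, not inverse)
k_tilde = jordan_rope(-j, gamma, omega) @ k    # key action
logit = q_tilde @ k_tilde
relative_ok = np.isclose(logit, q @ jordan_rope(i - j, gamma, omega) @ k)

print(closed_form_ok, group_law_ok, relative_ok)
```

In this sketch the first column of the top-left block holds $e^{-γd}\cos(ωd)$ and $e^{-γd}\sin(ωd)$, and the first column of the top-right block holds the same terms multiplied by $d$, which are the four features listed in the abstract. Because $R(d)$ is not orthogonal, applying the same map to queries and keys would not yield a function of $i-j$ alone, which is why the query uses a different action here. The key factor $R(-j)$ grows like $e^{γj}$, so this exact form is only suitable for short positions in floating point.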


Related papers