Fugu-MT Paper Translation (Abstract): Softmax Linear Attention: Reclaiming Global Competition

Paper overview: Softmax Linear Attention: Reclaiming Global Competition

arxiv url: http://arxiv.org/abs/2602.01744v1
Date: Mon, 02 Feb 2026 07:25:03 GMT
Status: Translation complete
Last updated in system: 2026-02-03 19:28:33.976152
Title: Softmax Linear Attention: Reclaiming Global Competition
Title (reference translation): Softmax Linear Attention: Reviving Global Competition
Authors: Mingwei Xu, Xuan Lin, Xinnan Guo, Wanqing Xu, Wanyun Cui
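Abstract summary: We propose Softmax Linear Attention (SLA), a framework that restores competitive selection without sacrificing efficiency. Experiments show that SLA consistently enhances state-of-the-art linear baselines across language modeling and long-context benchmarks.
Reference score (independently computed attention level): 28.81301173774774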
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: While linear attention reduces the quadratic complexity of standard Transformers to linear time, it often lags behind in expressivity due to the removal of softmax normalization. This omission eliminates global competition, a critical mechanism that enables models to sharply focus on relevant information amidst long-context noise. In this work, we propose Softmax Linear Attention (SLA), a framework designed to restore this competitive selection without sacrificing efficiency. By lifting the softmax operation from the token level to the head level, SLA leverages attention heads as coarse semantic slots, applying a competitive gating mechanism to dynamically select the most relevant subspaces. This reintroduces the "winner-take-all" dynamics essential for precise retrieval and robust long-context understanding. Distinct from prior methods that focus on refining local kernel functions, SLA adopts a broader perspective by exploiting the higher-level multi-head aggregation structure. Extensive experiments demonstrate that SLA consistently enhances state-of-the-art linear baselines (RetNet, GLA, GDN) across language modeling and long-context benchmarks, particularly in challenging retrieval scenarios where it significantly boosts robustness against noise, validating its capability to restore precise focus while maintaining linear complexity.
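Abstract (reference translation): Linear attention reduces the quadratic complexity of standard Transformers to linear time, but it often lags in expressivity because softmax normalization is removed. This omission eliminates global competition, a key mechanism that lets a model focus sharply on relevant information amid long-context noise. In this work, we propose Softmax Linear Attention (SLA), a framework that restores this competitive selection without sacrificing efficiency. By lifting the softmax operation from the token level to the head level, SLA uses attention heads as coarse semantic slots and applies a competitive gating mechanism to dynamically select the most relevant subspaces. This reintroduces the "winner-take-all" dynamics essential for precise retrieval and robust long-context understanding. Unlike prior methods that focus on refining local kernel functions, SLA takes a broader view by exploiting the higher-level multi-head aggregation structure. Extensive experiments show that SLA consistently enhances state-of-the-art linear baselines (RetNet, GLA, GDN) across language modeling and long-context benchmarks.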
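Illustrative sketch: The abstract describes the mechanism only at a high level, so the following Python sketch is an illustration under stated assumptions, not the paper's implementation. It assumes a causal linear attention head with the common elu(x) + 1 feature map, and a per-token softmax over hypothetical head scores (`gate_logits`) that rescales each head's output before concatenation. The function names, the feature map, the source of the gate scores, and the concatenation step are all assumptions made for illustration.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def phi(x):
    # elu(x) + 1, a positive feature map often used in linear attention
    # (assumed here; the abstract does not name a kernel).
    return [xi + 1.0 if xi > 0 else math.exp(xi) for xi in x]

def causal_linear_attention(q, k, v):
    # One head. q, k: T vectors of size d_k; v: T vectors of size d_v.
    # Keeps running sums, so the cost is linear in sequence length T.
    d_k, d_v = len(k[0]), len(v[0])
    S = [[0.0] * d_v for _ in range(d_k)]  # sum of phi(k_j) v_j^T
    z = [0.0] * d_k                        # sum of phi(k_j)
    out = []
    for q_t, k_t, v_t in zip(q, k, v):
        fq, fk = phi(q_t), phi(k_t)
        for a in range(d_k):
            z[a] += fk[a]
            for b in range(d_v):
                S[a][b] += fk[a] * v_t[b]
        denom = sum(fq[a] * z[a] for a in range(d_k))
        out.append([sum(fq[a] * S[a][b] for a in range(d_k)) / denom
                    for b in range(d_v)])
    return out

def head_gated_attention(heads, gate_logits):
    # heads: list of H (q, k, v) triples, one per attention head.
    # gate_logits[t][h]: hypothetical relevance score of head h at token t.
    per_head = [causal_linear_attention(q, k, v) for q, k, v in heads]
    mixed = []
    for t, logits in enumerate(gate_logits):
        g = softmax(logits)  # heads compete: the weights sum to 1
        row = []
        for h, head_out in enumerate(per_head):
            row.extend(g[h] * x for x in head_out[t])
        mixed.append(row)
    return mixed
```

In this sketch, a large gap between gate scores drives the softmax toward a single head, which is one way to read the "winner-take-all" behavior the abstract mentions. The gate adds only O(H) work per token, so the overall cost stays linear in sequence length.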

Related papers