Fugu-MT Paper Translation (Abstract): Positional LSH: Binary Block Matrix Approximation for Attention with Linear Biases

Paper abstract: Positional LSH: Binary Block Matrix Approximation for Attention with Linear Biases

arxiv url: http://arxiv.org/abs/2605.09472v1
Date: Sun, 10 May 2026 10:58:20 GMT
Status: Translation complete
Last updated in system: 2026-05-12 23:28:50.266893
Title: Positional LSH: Binary Block Matrix Approximation for Attention with Linear Biases
Title (reference translation): Positional LSH: Binary Block Matrix Approximation for Attention with Linear Biases
Authors: Daniel Wolfson, Tal Wagner
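Abstract summary: We study attention with positional bias through the lens of locality-sensitive hashing (LSH). We show that the ALiBi bias matrix is the expectation of contiguous block-diagonal binary masks induced by a ``positional LSH'' scheme.
Reference score (site-computed attention metric): 10.27725229355938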
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Positional encoding in transformers is commonly implemented through positional embeddings, attention masks, or bias terms, but formal connections between these mechanisms remain limited. We study attention with positional bias through the lens of locality-sensitive hashing (LSH), focusing on Attention with Linear Biases (ALiBi). We show that the ALiBi bias matrix is the expectation of contiguous block-diagonal binary masks induced by a ``positional LSH'' scheme. The empirical mean of masks sampled from this scheme yields spectral norm and max-norm approximation guarantees with bounded block sizes with high probability. This structural theorem implies a uniform approximation theorem for ALiBi-biased attention: with high probability over the sampled masks, the approximate attention output is accurate simultaneously for all query-key-value inputs and can be computed in near-linear time in the context length, reducing long-context ALiBi to a collection of randomized short-context regular (positionally unbiased) attention operations. Conceptually, this connects positional bias, masks, and positional embeddings in a single formal framework and suggests an approach to efficient ALiBi-biased attention. Experiments on large language models validate our theoretical findings.
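Abstract (reference translation): Positional encoding in transformers is generally implemented through positional embeddings, attention masks, or bias terms, but formal connections between these mechanisms are limited. We study attention with positional bias through the lens of locality-sensitive hashing (LSH), focusing on Attention with Linear Biases (ALiBi). We show that the ALiBi bias matrix is the expectation of contiguous block-diagonal binary masks induced by a ``positional LSH'' scheme. The empirical mean of masks sampled from this scheme gives, with high probability, spectral norm and max-norm approximation guarantees with bounded block sizes. This structural theorem implies a uniform approximation theorem for ALiBi-biased attention: with high probability over the sampled masks, the approximate attention output is accurate simultaneously for all query-key-value inputs and can be computed in near-linear time in the context length. Conceptually, this connects positional bias, masks, and positional embeddings in a single formal framework and suggests an approach to efficient ALiBi-biased attention. Experiments on large language models validate our theoretical findings.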
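The central claim of the abstract, that the ALiBi bias matrix is the expectation of random contiguous block-diagonal binary masks, can be checked numerically with a small sketch. The construction below is an illustrative assumption, not the paper's scheme: it treats the bias in its multiplicative (exponentiated) and symmetric form, exp(-slope * |i - j|), and cuts the sequence between each pair of adjacent positions independently with probability 1 - exp(-slope). Two positions i and j then land in the same block with probability exp(-slope * |i - j|), so the mean of the sampled masks should approach that matrix. The function names, the `slope` parameter, and the sample count are all hypothetical, and this simple scheme does not enforce the bounded block sizes mentioned in the abstract.

```python
import numpy as np


def alibi_kernel(n, slope):
    """Exponentiated symmetric ALiBi bias: K[i, j] = exp(-slope * |i - j|)."""
    idx = np.arange(n)
    return np.exp(-slope * np.abs(idx[:, None] - idx[None, :]))


def sample_block_mask(n, slope, rng):
    """Sample one contiguous block-diagonal binary mask over n positions.

    Assumption (not taken from the paper): a block boundary is placed
    between positions k and k + 1 independently with probability
    1 - exp(-slope).
    """
    cut = rng.random(n - 1) < 1.0 - np.exp(-slope)
    # The block index plays the role of the "hash" of each position.
    block_id = np.concatenate(([0], np.cumsum(cut)))
    return (block_id[:, None] == block_id[None, :]).astype(float)


def mean_mask(n, slope, num_samples, seed=0):
    """Empirical mean of num_samples independently sampled masks."""
    rng = np.random.default_rng(seed)
    total = np.zeros((n, n))
    for _ in range(num_samples):
        total += sample_block_mask(n, slope, rng)
    return total / num_samples


# Compare the empirical mean against the exponentiated bias matrix.
n, slope = 16, 0.25
err = np.abs(mean_mask(n, slope, 4000) - alibi_kernel(n, slope)).max()
print(f"max entrywise error: {err:.4f}")
```

Each sampled mask restricts attention to disjoint contiguous blocks, which is what would allow long-context attention to be split into independent short-context computations; the dense matrices here are only for checking the expectation, not for efficiency.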

Related papers