Fugu-MT Paper Translation (Abstract): Mean-Pooled Cosine Similarity is Not Length-Invariant: Theory and Cross-Domain Evidence for a Length-Invariant Alternative

Paper abstract: Mean-Pooled Cosine Similarity is Not Length-Invariant: Theory and Cross-Domain Evidence for a Length-Invariant Alternative

arxiv url: http://arxiv.org/abs/2605.07345v1
Date: Fri, 08 May 2026 06:48:34 GMT
Status: Translation complete
Last updated in system: 2026-05-11 19:43:38.872731
Title: Mean-Pooled Cosine Similarity is Not Length-Invariant: Theory and Cross-Domain Evidence for a Length-Invariant Alternative
Title (reference translation): Mean-pooled cosine similarity is not length-invariant: theory and cross-domain evidence for a length-invariant alternative
Authors: Sibayan Mitra, Dhruv Kumar
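Abstract summary: Mean-pooled cosine similarity is the default metric for comparing neural representations across languages, modalities, and tasks. Under the anisotropy that characterizes modern transformer representations, mean-pooled cosine grows monotonically in sequence length. We argue that length-invariant metrics such as Centered Kernel Alignment should be the default for cross-representation comparisons.
Reference score (independently computed attention score): 1.5718921092089344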
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Mean-pooled cosine similarity is the default metric for comparing neural representations across languages, modalities, and tasks. We establish that this metric is not length-invariant: under the anisotropy that characterizes modern transformer representations, mean-pooled cosine grows monotonically in sequence length, independent of representational content. Empirically, on HumanEvalPack across four code LLMs, the length ratio alone explains $R^2 = 0.52$--$0.75$ of cross-language "Python proximity," while AST depth and shared-token fraction add less than 3% of explained variance beyond length. Substituting Centered Kernel Alignment (CKA) reduces explained variance by 83% and reverses the sign of the length coefficient ($\beta_{\mathrm{len}}: +0.86 \to -0.37$). The same pattern holds in Mistral-7B on parallel WMT pairs ($R^2 = 0.23$ EN-FR, $R^2 = 0.33$ EN-DE for cosine; $R^2 < 0.01$ for CKA). In CLIP ViT-B/32, mean-pooling reduces the length effect relative to EOS-pooling ($R^2: 0.21 \to {<}0.01$), as predicted by the theory's dependence on anisotropy. We argue that length-invariant metrics such as CKA should be the default for cross-representation comparisons, and that recent claims of cross-lingual representational convergence built on mean-pooled cosine warrant re-examination.
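The length effect described in the abstract can be reproduced in a toy setting. The sketch below is not the paper's experiment: it assumes each token embedding is a shared offset vector plus independent unit-variance Gaussian noise, which is one simple stand-in for anisotropy. The function names, `dim`, `offset`, and `trials` are all illustrative choices. Under this assumption the noise in the pooled vector shrinks as the sequence gets longer while the shared offset does not, so two sequences with unrelated content look more similar simply because they are longer.

```python
import numpy as np

def mean_pooled_cosine(x, y):
    """Cosine similarity between the mean-pooled rows of x and y."""
    u, v = x.mean(axis=0), y.mean(axis=0)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def simulate(lengths, dim=64, offset=0.5, trials=200, seed=0):
    """Average mean-pooled cosine between two unrelated toy sequences.

    Assumption (not from the paper): every token is `mu + noise`, where
    `mu` is a shared offset (toy anisotropy) and noise is N(0, I).
    """
    rng = np.random.default_rng(seed)
    mu = np.full(dim, offset)
    results = {}
    for n in lengths:
        sims = []
        for _ in range(trials):
            # Two sequences of length n with independent "content".
            x = mu + rng.standard_normal((n, dim))
            y = mu + rng.standard_normal((n, dim))
            sims.append(mean_pooled_cosine(x, y))
        results[n] = float(np.mean(sims))
    return results

avg = simulate([1, 4, 16, 64, 256])
for n, s in avg.items():
    print(f"length={n:4d}  mean-pooled cosine={s:.3f}")
```

In this toy model the average similarity rises with length even though the two sequences share no content beyond the common offset. Calling `simulate([1, 256], offset=0.0)` removes the shared offset, and the average similarity then stays near zero at both lengths, which is consistent with the abstract's statement that the effect depends on anisotropy.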
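Abstract (reference translation): Mean-pooled cosine similarity is the default metric for comparing neural representations across languages, modalities, and tasks. This metric is not length-invariant: under the anisotropy that characterizes modern transformer representations, mean-pooled cosine grows monotonically in sequence length, independent of representational content. Empirically, on HumanEvalPack across four code LLMs, the length ratio alone explains $R^2 = 0.52$--$0.75$ of cross-language "Python proximity." Substituting Centered Kernel Alignment (CKA) reduces explained variance by 83% and reverses the sign of the length coefficient ($\beta_{\mathrm{len}}: +0.86 \to -0.37$). The same pattern holds in Mistral-7B on parallel WMT pairs ($R^2 = 0.23$ EN-FR, $R^2 = 0.33$ EN-DE for cosine; $R^2 < 0.01$ for CKA). In CLIP ViT-B/32, mean-pooling reduces the length effect relative to EOS-pooling ($R^2: 0.21 \to {<}0.01$), as predicted by the theory's dependence on anisotropy. We argue that length-invariant metrics such as CKA should be the default for cross-representation comparisons, and that recent claims of cross-lingual representational convergence built on mean-pooled cosine warrant re-examination.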
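For readers unfamiliar with CKA, the following is a minimal sketch of the commonly used linear form, computed on two representation matrices whose rows are paired examples. The abstract does not say how the authors apply CKA to token sequences of differing length, so this shows only the generic metric and not their pipeline. The function name and the test data are illustrative.

```python
import numpy as np

def linear_cka(x, y):
    """Linear CKA between x (n, d1) and y (n, d2); rows are paired examples."""
    # Center each feature column so the score ignores constant offsets.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return float(cross / (norm_x * norm_y))

rng = np.random.default_rng(0)
a = rng.standard_normal((100, 16))
q, _ = np.linalg.qr(rng.standard_normal((16, 16)))  # random orthogonal matrix
b = 3.0 * a @ q + 5.0  # rotate, rescale, and shift a

same = linear_cka(a, a)
rotated = linear_cka(a, b)
unrelated = linear_cka(a, rng.standard_normal((100, 16)))
print(same, rotated, unrelated)
```

Because the columns are centered before comparison, a constant offset shared by every row does not change the score, and the score is also unchanged by rotation and uniform rescaling of the features. A shared offset of this kind is what inflates cosine similarity between mean-pooled vectors.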

Related papers