Fugu-MT paper translation (summary): Tabula RASA: Exposing and Breaking the Relational Bottleneck in Transformers

Paper summary: Tabula RASA: Exposing and Breaking the Relational Bottleneck in Transformers

arxiv url: http://arxiv.org/abs/2602.02834v2
Date: Wed, 04 Feb 2026 11:02:18 GMT
Status: Translation complete
Last updated in system: 2026-02-05 15:07:33.798522
Title: Tabula RASA: Exposing and Breaking the Relational Bottleneck in Transformers
Authors: Jonas Petersen, Camilla Mazzoleni, Riccardo Maggioni
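Abstract summary: RASA (Relation-Aware Sparse Attention) is a minimal architectural modification that provides structural inductive bias for relational reasoning. The results demonstrate that minimal architectural modifications, grounded in complexity-theoretic analysis, can substantially improve multi-hop reasoning.
Reference score (independently computed attention metric): 0.0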
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Transformers achieve remarkable performance across many domains, yet struggle with tasks requiring multi-hop relational reasoning over structured data. We analyze this limitation through circuit complexity: standard transformers are $\mathsf{TC}^0$-complete and cannot solve graph connectivity in constant depth, implying $\Omega(k)$ layers are necessary for $k$-hop reasoning regardless of model size or training data. We introduce RASA (Relation-Aware Sparse Attention), a minimal architectural modification that provides structural inductive bias for relational reasoning. RASA adds: (1) sparse adjacency masking that restricts attention to graph-connected positions, reducing the attention pattern search space from $O(2^{n^2})$ to $O(2^m)$ for graphs with $m$ edges; and (2) learnable edge-type biases that encode relation-specific attention preferences. While RASA does not circumvent asymptotic depth requirements, the exponential reduction in attention pattern space provides stronger inductive bias for learning graph-structured functions. Empirically, on the MetaQA knowledge graph QA benchmark, RASA achieves 97.7% accuracy on 3-hop questions, outperforming EmbedKGQA (94.8%) by 2.9 percentage points. Notably, RASA's advantage grows with reasoning depth, validating that structural inductive bias is most beneficial for complex multi-hop queries. Our results demonstrate that minimal architectural modifications, grounded in complexity-theoretic analysis, can substantially improve multi-hop reasoning.
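The two components named in the abstract, adjacency masking and edge-type biases, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the abstract does not give the exact formulation, so the function name `rasa_attention_weights`, the scalar per-relation bias, the use of self-loops, and the choice to add the bias to the logits before the softmax are all assumptions made for illustration.

```python
import math


def rasa_attention_weights(scores, adjacency, edge_types, edge_bias):
    """Row-wise softmax restricted to graph neighbours (illustrative sketch).

    scores[i][j]     : raw query-key score (assumed to be already scaled)
    adjacency[i][j]  : True if position j is graph-connected to position i
    edge_types[i][j] : integer relation id of edge (i, j); ignored when masked
    edge_bias[r]     : scalar bias for relation r (a learned parameter in a
                       real model; fixed numbers here, by assumption)
    """
    n = len(scores)
    weights = []
    for i in range(n):
        # Keep only connected positions and add the relation-specific bias.
        logits = {
            j: scores[i][j] + edge_bias[edge_types[i][j]]
            for j in range(n)
            if adjacency[i][j]
        }
        row = [0.0] * n
        if logits:  # a node with no neighbours keeps an all-zero row
            top = max(logits.values())  # subtract the max for stability
            exps = {j: math.exp(v - top) for j, v in logits.items()}
            total = sum(exps.values())
            for j, e in exps.items():
                row[j] = e / total
        weights.append(row)
    return weights


# Hypothetical 3-node path graph 0 - 1 - 2 with self-loops.
adjacency = [
    [True, True, False],
    [True, True, True],
    [False, True, True],
]
# Relation ids: 0 = self-loop, 1 = a generic relation (both made up).
edge_types = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
edge_bias = [0.0, 1.0]            # relation 1 is preferred over self-loops
scores = [[0.0] * 3 for _ in range(3)]

weights = rasa_attention_weights(scores, adjacency, edge_types, edge_bias)
for row in weights:
    print([round(w, 3) for w in row])
```

In this toy graph node 0 places no attention on node 2, because the two are not adjacent, so information from node 2 needs a second layer to reach node 0. That is in line with the abstract's remark that masking does not remove the depth requirement for multi-hop reasoning.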


Related papers