Fugu-MT Paper Translation (Abstract): When Do Transformers Learn Heuristics for Graph Connectivity?

Paper overview: When Do Transformers Learn Heuristics for Graph Connectivity?

arxiv url: http://arxiv.org/abs/2510.19753v1
Date: Wed, 22 Oct 2025 16:43:32 GMT
Status: Translation complete
Last updated in system: 2025-10-25 03:08:16.145896
Title: When Do Transformers Learn Heuristics for Graph Connectivity?
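Title (reference translation): When Do Transformers Learn Heuristics for Graph Connectivity?
Authors: Qilin Ye, Deqing Fu, Robin Jia, Vatsal Sharan
Abstract summary: We prove that an $L$-layer model has the capacity to solve graphs with diameters up to $3^L$. We analyze the training dynamics and show that the learned strategy hinges on whether most training instances fall within this model capacity.
Reference score (independently computed attention measure): 33.73385470817422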
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Transformers often fail to learn generalizable algorithms, instead relying on brittle heuristics. Using graph connectivity as a testbed, we explain this phenomenon both theoretically and empirically. We consider a simplified Transformer architecture, the disentangled Transformer, and prove that an $L$-layer model has capacity to solve for graphs with diameters up to exactly $3^L$, implementing an algorithm equivalent to computing powers of the adjacency matrix. We analyze the training dynamics and show that the learned strategy hinges on whether most training instances are within this model capacity. Within-capacity graphs (diameter $\leq 3^L$) drive the learning of a correct algorithmic solution while beyond-capacity graphs drive the learning of a simple heuristic based on node degrees. Finally, we empirically demonstrate that restricting training data within a model's capacity leads to both standard and disentangled transformers learning the exact algorithm rather than the degree-based heuristic.
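Abstract (reference translation): Transformers often fail to learn generalizable algorithms and instead rely on brittle heuristics. Using graph connectivity as a testbed, we explain this phenomenon both theoretically and empirically. We consider a simplified Transformer architecture, the disentangled Transformer, and prove that an $L$-layer model has the capacity to solve graphs with diameters up to exactly $3^L$, implementing an algorithm equivalent to computing powers of the adjacency matrix. We analyze the training dynamics and show that the learned strategy hinges on whether most training instances lie within this model capacity. Within-capacity graphs (diameter $\leq 3^L$) drive the learning of a correct algorithmic solution, while beyond-capacity graphs drive the learning of a simple heuristic based on node degrees. Finally, we show empirically that restricting training data to within a model's capacity leads both standard and disentangled Transformers to learn the exact algorithm rather than the degree-based heuristic.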
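The $3^L$ capacity bound can be made concrete with a small sketch. This is not the paper's Transformer construction; it is only an illustration, under the assumption that each "layer" cubes a Boolean reachability matrix, of why $L$ rounds of adjacency-matrix powering cover paths of length up to $3^L$. The function names `bool_matmul`, `reachable_within`, and `path_graph` are hypothetical helpers introduced here.

```python
def bool_matmul(a, b):
    """Boolean matrix product: entry (i, j) is True if some k links i to j."""
    n = len(a)
    return [[any(a[i][k] and b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def reachable_within(adj, num_layers):
    """Return R where R[i][j] is True iff j is reachable from i
    in at most 3**num_layers steps (assumed one cubing per layer)."""
    n = len(adj)
    # Start from A + I, which encodes paths of length 0 or 1.
    reach = [[bool(adj[i][j]) or i == j for j in range(n)] for i in range(n)]
    for _ in range(num_layers):
        # Cubing triples the maximum path length the matrix accounts for.
        squared = bool_matmul(reach, reach)
        reach = bool_matmul(squared, reach)
    return reach


def path_graph(n):
    """Adjacency matrix of an undirected path 0 - 1 - ... - (n-1)."""
    return [[abs(i - j) == 1 for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    adj = path_graph(10)  # endpoints 0 and 9 are 9 = 3**2 steps apart
    print(reachable_within(adj, 1)[0][9])  # one layer reaches only 3 steps
    print(reachable_within(adj, 2)[0][9])  # two layers reach 9 steps
```

On a 10-node path, whose two endpoints are 9 steps apart, one round of cubing does not connect the endpoints while two rounds do, which mirrors the within-capacity versus beyond-capacity distinction in the abstract.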

Related papers