Fugu-MT Paper Translation (Summary): SGFormer: Simplifying and Empowering Transformers for Large-Graph Representations

Paper summary: SGFormer: Simplifying and Empowering Transformers for Large-Graph Representations

arxiv url: http://arxiv.org/abs/2306.10759v5
Date: Fri, 16 Aug 2024 08:24:25 GMT
Status: translation complete
Last updated in system: 2024-08-19 21:05:52.208607
Title: SGFormer: Simplifying and Empowering Transformers for Large-Graph Representations
Title (reference translation): SGFormer: Simplifying and Strengthening Transformers for Large-Graph Representations
Authors: Qitian Wu, Wentao Zhao, Chenxiao Yang, Hengrui Zhang, Fan Nie, Haitian Jiang, Yatao Bian, Junchi Yan
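Abstract summary: The paper shows that one-layer attention achieves surprisingly strong performance on node property prediction benchmarks. The proposed scheme is called SGFormer (Simplified Graph Transformer). The authors believe it opens a new technical path of independent interest for building Transformers on large graphs.
Reference score (attention level computed independently by this site): 75.71298846760303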
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Learning representations on large-sized graphs is a long-standing challenge due to the inter-dependence nature involved in massive data points. Transformers, as an emerging class of foundation encoders for graph-structured data, have shown promising performance on small graphs due to its global attention capable of capturing all-pair influence beyond neighboring nodes. Even so, existing approaches tend to inherit the spirit of Transformers in language and vision tasks, and embrace complicated models by stacking deep multi-head attentions. In this paper, we critically demonstrate that even using a one-layer attention can bring up surprisingly competitive performance across node property prediction benchmarks where node numbers range from thousand-level to billion-level. This encourages us to rethink the design philosophy for Transformers on large graphs, where the global attention is a computation overhead hindering the scalability. We frame the proposed scheme as Simplified Graph Transformers (SGFormer), which is empowered by a simple attention model that can efficiently propagate information among arbitrary nodes in one layer. SGFormer requires none of positional encodings, feature/graph pre-processing or augmented loss. Empirically, SGFormer successfully scales to the web-scale graph ogbn-papers100M and yields up to 141x inference acceleration over SOTA Transformers on medium-sized graphs. Beyond current results, we believe the proposed methodology alone enlightens a new technical path of independent interest for building Transformers on large graphs.
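Abstract (reference translation): Learning representations on large graphs has been a long-standing challenge because of the interdependence among massive numbers of data points. Transformers, an emerging class of foundation encoders for graph-structured data, have shown promising performance on small graphs thanks to global attention that can capture all-pair influence beyond neighboring nodes. Even so, existing approaches tend to inherit the spirit of Transformers in language and vision tasks and adopt complicated models by stacking deep multi-head attention. This paper critically shows that even one-layer attention yields surprisingly competitive performance on node property prediction benchmarks with node counts ranging from the thousand level to the billion level. This prompts a rethinking of the design philosophy for Transformers on large graphs, where global attention is a computational overhead that hinders scalability. The proposed scheme, SGFormer (Simplified Graph Transformer), is realized with a simple attention model that efficiently propagates information among arbitrary nodes in one layer. SGFormer requires no positional encodings, feature/graph pre-processing, or augmented loss. Empirically, SGFormer scales to the web-scale graph ogbn-papers100M and achieves up to 141x inference acceleration over SOTA Transformers on medium-sized graphs. Beyond the current results, the authors believe the proposed methodology opens a new technical path of independent interest for building Transformers on large graphs.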
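
Note: The abstract says that all-pair attention can be propagated efficiently in one layer but does not give the formula. The sketch below is an illustration only, assuming a generic softmax-free attention with Frobenius-normalized queries and keys; it is not taken from the paper, and the function names are hypothetical. It shows why such an attention need not cost O(N^2): computing K^T V first gives the same output as building the full N x N weight matrix.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a: Matrix) -> Matrix:
    return [list(col) for col in zip(*a)]


def frobenius_normalize(a: Matrix) -> Matrix:
    """Divide every entry by the Frobenius norm of the matrix."""
    norm = sum(x * x for row in a for x in row) ** 0.5
    return [[x / norm for x in row] for row in a]


def attention_quadratic(q: Matrix, k: Matrix, v: Matrix) -> Matrix:
    """Reference version: form the N x N weights I + (1/N) Q K^T explicitly,
    then normalize each row of the weighted sum by its row total."""
    n = len(q)
    qn, kn = frobenius_normalize(q), frobenius_normalize(k)
    scores = matmul(qn, transpose(kn))  # N x N, the costly part
    weights = [
        [(1.0 if i == j else 0.0) + scores[i][j] / n for j in range(n)]
        for i in range(n)
    ]
    out = matmul(weights, v)
    return [[x / sum(w_row) for x in o_row] for w_row, o_row in zip(weights, out)]


def attention_linear(q: Matrix, k: Matrix, v: Matrix) -> Matrix:
    """Same output, but K^T V (d x d) is computed first, so no N x N matrix
    is ever built and the cost grows linearly with the number of nodes N."""
    n = len(q)
    qn, kn = frobenius_normalize(q), frobenius_normalize(k)
    kv = matmul(transpose(kn), v)            # d x d summary of all nodes
    k_sum = [sum(col) for col in zip(*kn)]   # K^T 1, length d
    out = []
    for q_row, v_row in zip(qn, v):
        numer = [
            v_i + sum(a * b for a, b in zip(q_row, kv_col)) / n
            for v_i, kv_col in zip(v_row, zip(*kv))
        ]
        denom = 1.0 + sum(a * b for a, b in zip(q_row, k_sum)) / n
        out.append([x / denom for x in numer])
    return out
```

Both functions take node-by-feature matrices as nested lists. The linear version touches each node a constant number of times, which is the kind of reordering that lets a single global attention layer scale to large graphs.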

Related papers