Fugu-MT Paper Translation (Abstract): S2Aligner: Pair-Efficient and Transferable Pre-Training for Sparse Text-Attributed Graphs

Paper abstract: S2Aligner: Pair-Efficient and Transferable Pre-Training for Sparse Text-Attributed Graphs

arxiv url: http://arxiv.org/abs/2605.18579v3
Date: Wed, 20 May 2026 03:15:37 GMT
Status: Translation complete
System update date: 2026-05-21 14:55:44.320225
Title: S2Aligner: Pair-Efficient and Transferable Pre-Training for Sparse Text-Attributed Graphs
Title (reference translation): S2Aligner: Pair-Efficient and Transferable Pre-Training for Sparse Text-Attributed Graphs
Authors: Yuhan Wang, Haopeng Zhang, Yibo Ding, Jiaqi Yu, Xinyu Zhao, Yuhang Liu, Ziwei Zhang, Xiao Wang, Ruijie Wang
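Abstract summary: Pre-training on text-attributed graphs (TAGs) is central to building transferable graph foundation models. This paper presents S2Aligner, a sparsity-aware and structure-enhanced LLM-as-Aligner framework for graph-text pre-training on sparse TAGs.
Reference score (independently computed attention score): 23.83658846856642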
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Pre-training on text-attributed graphs (TAGs) is central to building transferable graph foundation models, where LLM-as-Aligner methods align graph and text representations through the semantic knowledge of large language models. However, these methods usually assume that node texts provide sufficient and reliable supervision, an assumption often violated in real-world sparse TAGs. When textual anchors are missing, noisy, or uneven across domains, graph structures must be aligned with weak semantic evidence, leading to unreliable structure-semantics correspondence and sparsity-induced transfer bias. This paper presents S2Aligner, a sparsity-aware and structure-enhanced LLM-as-Aligner framework for graph-text pre-training on sparse TAGs. The key idea is to decouple semantic alignment from structural modeling, allowing topology-aware signals to enhance alignment without contaminating the shared semantic space. Specifically, S2Aligner decomposes graph-text representations into semantic and structural components, uses structure-oriented reconstruction with consistency control to inject reliable topology cues into text representations, and suppresses inconsistent structural signals under textual sparsity. Moreover, S2Aligner introduces sparsity-aware cross-domain risk balancing, which calibrates domain risks through a global-domain density ratio and downweights unreliable sparse samples via graph reliability estimation. Theoretical analysis shows that this objective reduces cross-domain generalization gaps by controlling domain risk discrepancy. Extensive experiments across diverse graph domains, sparsity levels, and downstream tasks demonstrate that S2Aligner consistently outperforms existing baselines.
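Abstract (reference translation): Pre-training on text-attributed graphs (TAGs) is central to building transferable graph foundation models, in which LLM-as-Aligner methods align graph and text representations through the semantic knowledge of large language models. These methods, however, usually assume that node texts provide sufficient and reliable supervision, and real-world sparse TAGs often violate that assumption. When textual anchors are missing, noisy, or uneven across domains, graph structures have to be aligned against weak semantic evidence, which produces unreliable structure-semantics correspondence and sparsity-induced transfer bias. This paper presents S2Aligner, a sparsity-aware and structure-enhanced LLM-as-Aligner framework for graph-text pre-training on sparse TAGs. The key idea is to decouple semantic alignment from structural modeling, so that topology-aware signals can strengthen alignment without contaminating the shared semantic space. Specifically, S2Aligner decomposes graph-text representations into semantic and structural components, uses structure-oriented reconstruction with consistency control to inject reliable topology cues into text representations, and suppresses inconsistent structural signals under textual sparsity. In addition, S2Aligner introduces sparsity-aware cross-domain risk balancing, which calibrates domain risks through a global-domain density ratio and downweights unreliable sparse samples via graph reliability estimation. Theoretical analysis shows that this objective reduces cross-domain generalization gaps by controlling domain risk discrepancy. Extensive experiments across diverse graph domains, sparsity levels, and downstream tasks show that S2Aligner consistently outperforms existing baselines.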
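
The sparsity-aware cross-domain risk balancing described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not define the global-domain density ratio or the graph reliability estimate, so the code below assumes a uniform target share across domains for the ratio and takes per-sample reliability scores in [0, 1] as given inputs. The names `balanced_weights` and `weighted_risk` are hypothetical.

```python
from collections import Counter


def balanced_weights(domains, reliabilities):
    """Return normalized per-sample weights.

    ASSUMPTION (not from the paper): the density ratio is the ratio of a
    uniform target share (1 / number of domains) to the domain's empirical
    share of samples, and reliabilities are precomputed scores in [0, 1].
    """
    n = len(domains)
    counts = Counter(domains)
    num_domains = len(counts)
    weights = []
    for domain, reliability in zip(domains, reliabilities):
        empirical_share = counts[domain] / n
        density_ratio = (1.0 / num_domains) / empirical_share
        # Unreliable (e.g. text-sparse) samples receive a smaller weight.
        weights.append(density_ratio * reliability)
    total = sum(weights)
    return [w / total for w in weights]


def weighted_risk(losses, weights):
    """Weighted average of per-sample losses under the given weights."""
    return sum(loss * w for loss, w in zip(losses, weights))


# Usage: three samples from domain "a", one from domain "b".
domains = ["a", "a", "a", "b"]
weights = balanced_weights(domains, [1.0, 1.0, 1.0, 1.0])
risk = weighted_risk([0.2, 0.4, 0.6, 1.0], weights)
```

With equal reliabilities, each domain contributes the same total weight regardless of how many samples it has; lowering a sample's reliability score shrinks its share of the overall risk.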

Related papers