A Pure Transformer Pretraining Framework on Text-attributed Graphs
- URL: http://arxiv.org/abs/2406.13873v1
- Date: Wed, 19 Jun 2024 22:30:08 GMT
- Title: A Pure Transformer Pretraining Framework on Text-attributed Graphs
- Authors: Yu Song, Haitao Mao, Jiachen Xiao, Jingzhe Liu, Zhikai Chen, Wei Jin, Carl Yang, Jiliang Tang, Hui Liu,
- Abstract summary: We introduce a feature-centric pretraining perspective by treating graph structure as a prior.
Our framework, Graph Sequence Pretraining with Transformer (GSPT), samples node contexts through random walks.
GSPT can be easily adapted to both node classification and link prediction, demonstrating promising empirical success on various datasets.
- Score: 50.833130854272774
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pretraining plays a pivotal role in acquiring generalized knowledge from large-scale data, achieving remarkable successes as evidenced by large models in CV and NLP. However, progress in the graph domain remains limited due to fundamental challenges such as feature heterogeneity and structural heterogeneity. Recently, increasing efforts have been made to enhance node feature quality with Large Language Models (LLMs) on text-attributed graphs (TAGs), demonstrating superiority to traditional bag-of-words or word2vec techniques. These high-quality node features reduce the previously critical role of graph structure, resulting in a modest performance gap between Graph Neural Networks (GNNs) and structure-agnostic Multi-Layer Perceptrons (MLPs). Motivated by this, we introduce a feature-centric pretraining perspective by treating graph structure as a prior and leveraging the rich, unified feature space to learn refined interaction patterns that generalizes across graphs. Our framework, Graph Sequence Pretraining with Transformer (GSPT), samples node contexts through random walks and employs masked feature reconstruction to capture pairwise proximity in the LLM-unified feature space using a standard Transformer. By utilizing unified text representations rather than varying structures, our framework achieves significantly better transferability among graphs within the same domain. GSPT can be easily adapted to both node classification and link prediction, demonstrating promising empirical success on various datasets.
Related papers
- Learning to Model Graph Structural Information on MLPs via Graph Structure Self-Contrasting [50.181824673039436]
We propose a Graph Structure Self-Contrasting (GSSC) framework that learns graph structural information without message passing.
The proposed framework is based purely on Multi-Layer Perceptrons (MLPs), where the structural information is only implicitly incorporated as prior knowledge.
It first applies structural sparsification to remove potentially uninformative or noisy edges in the neighborhood, and then performs structural self-contrasting in the sparsified neighborhood to learn robust node representations.
arXiv Detail & Related papers (2024-09-09T12:56:02Z) - Efficient Topology-aware Data Augmentation for High-Degree Graph Neural Networks [2.7523980737007414]
We propose TADA, an efficient and effective front-mounted data augmentation framework for graph neural networks (GNNs) on high-degree graphs (HDGs)
Under the hood, TADA includes two key modules: (i) feature expansion with structure embeddings, and (ii) topology- and attribute-aware graph sparsification.
TADA considerably improves the predictive performance of mainstream GNN models on 8 real homophilic/heterophilic HDGs in terms of node classification.
arXiv Detail & Related papers (2024-06-08T14:14:19Z) - Unleashing the Potential of Text-attributed Graphs: Automatic Relation Decomposition via Large Language Models [31.443478448031886]
RoSE (Relation-oriented Semantic Edge-decomposition) is a novel framework that decomposes the graph structure by analyzing raw text attributes.
Our framework significantly enhances node classification performance across various datasets, with improvements of up to 16% on the Wisconsin dataset.
arXiv Detail & Related papers (2024-05-28T20:54:47Z) - DGNN: Decoupled Graph Neural Networks with Structural Consistency
between Attribute and Graph Embedding Representations [62.04558318166396]
Graph neural networks (GNNs) demonstrate a robust capability for representation learning on graphs with complex structures.
A novel GNNs framework, dubbed Decoupled Graph Neural Networks (DGNN), is introduced to obtain a more comprehensive embedding representation of nodes.
Experimental results conducted on several graph benchmark datasets verify DGNN's superiority in node classification task.
arXiv Detail & Related papers (2024-01-28T06:43:13Z) - HetGPT: Harnessing the Power of Prompt Tuning in Pre-Trained
Heterogeneous Graph Neural Networks [24.435068514392487]
HetGPT is a post-training prompting framework for graph neural networks.
It improves the performance of state-of-the-art HGNNs on semi-supervised node classification.
arXiv Detail & Related papers (2023-10-23T19:35:57Z) - TouchUp-G: Improving Feature Representation through Graph-Centric
Finetuning [37.318961625795204]
Graph Neural Networks (GNNs) have become the state-of-the-art approach for many high-impact, real-world graph applications.
For feature-rich graphs, a prevalent practice involves utilizing a PM directly to generate features.
This practice is suboptimal because the node features extracted from PM are graph-agnostic and prevent GNNs from fully utilizing the potential correlations between the graph structure and node features.
arXiv Detail & Related papers (2023-09-25T05:44:40Z) - SimTeG: A Frustratingly Simple Approach Improves Textual Graph Learning [131.04781590452308]
We present SimTeG, a frustratingly Simple approach for Textual Graph learning.
We first perform supervised parameter-efficient fine-tuning (PEFT) on a pre-trained LM on the downstream task.
We then generate node embeddings using the last hidden states of finetuned LM.
arXiv Detail & Related papers (2023-08-03T07:00:04Z) - Deformable Graph Transformer [31.254872949603982]
We propose Deformable Graph Transformer (DGT) that performs sparse attention with dynamically sampled key and value pairs.
Experiments demonstrate that our novel graph Transformer consistently outperforms existing Transformer-based models.
arXiv Detail & Related papers (2022-06-29T00:23:25Z) - ACE-HGNN: Adaptive Curvature Exploration Hyperbolic Graph Neural Network [72.16255675586089]
We propose an Adaptive Curvature Exploration Hyperbolic Graph NeuralNetwork named ACE-HGNN to adaptively learn the optimal curvature according to the input graph and downstream tasks.
Experiments on multiple real-world graph datasets demonstrate a significant and consistent performance improvement in model quality with competitive performance and good generalization ability.
arXiv Detail & Related papers (2021-10-15T07:18:57Z) - GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training [62.73470368851127]
Graph representation learning has emerged as a powerful technique for addressing real-world problems.
We design Graph Contrastive Coding -- a self-supervised graph neural network pre-training framework.
We conduct experiments on three graph learning tasks and ten graph datasets.
arXiv Detail & Related papers (2020-06-17T16:18:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.