Invariant Graph Transformer for Out-of-Distribution Generalization
- URL: http://arxiv.org/abs/2508.00304v1
- Date: Fri, 01 Aug 2025 04:11:53 GMT
- Title: Invariant Graph Transformer for Out-of-Distribution Generalization
- Authors: Tianyin Liao, Ziwei Zhang, Yufei Sun, Chunyu Hu, Jianxin Li
- Abstract summary: We introduce Graph Out-Of-Distribution generalized Transformer (GOODFormer). It aims to learn generalized graph representations by capturing invariant relationships between predictive graph structures and labels. We design an evolving subgraph positional and structural encoder to effectively and efficiently capture the encoding information of dynamically changing subgraphs.
- Score: 21.60139614144787
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Graph Transformers (GTs) have demonstrated great effectiveness across various graph analytical tasks. However, existing GTs focus on training and testing graph data originating from the same distribution and fail to generalize under distribution shifts. Graph invariant learning, which aims to capture graph structural patterns whose relationship with labels remains stable under distribution shifts, is a potentially promising solution, but how to design attention mechanisms and positional and structural encodings (PSEs) based on graph invariant learning principles remains challenging. To solve these challenges, we introduce the Graph Out-Of-Distribution generalized Transformer (GOODFormer), which aims to learn generalized graph representations by capturing invariant relationships between predictive graph structures and labels through jointly optimizing three modules. Specifically, we first develop a GT-based entropy-guided invariant subgraph disentangler to separate invariant and variant subgraphs while preserving the sharpness of the attention function. Next, we design an evolving subgraph positional and structural encoder to effectively and efficiently capture the encoding information of dynamically changing subgraphs during training. Finally, we propose an invariant learning module utilizing subgraph node representations and encodings to derive generalizable graph representations that can generalize to unseen graphs. We also provide theoretical justifications for our method. Extensive experiments on benchmark datasets demonstrate the superiority of our method over state-of-the-art baselines under distribution shifts.
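The entropy-guided disentangler described in the abstract lends itself to a compact illustration. Below is a minimal, hypothetical sketch of an attention-style edge scorer that softly splits a graph's edges into invariant and variant sets, with a binary-entropy term that keeps the split sharp; the class name, scoring function, and all details are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class InvariantSubgraphDisentangler(nn.Module):
    """Toy attention-style edge scorer that softly splits a graph's edges
    into 'invariant' and 'variant' sets. Hypothetical illustration only."""

    def __init__(self, dim):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)

    def forward(self, x, edge_index):
        # x: [num_nodes, dim] node features; edge_index: [2, num_edges]
        src, dst = edge_index
        # Attention-style compatibility score per edge.
        scores = (self.q(x)[src] * self.k(x)[dst]).sum(-1) / x.size(-1) ** 0.5
        p_inv = torch.sigmoid(scores)  # probability each edge is 'invariant'
        # Binary entropy of the soft mask; penalizing it keeps the split
        # sharp, pushing each edge toward clearly invariant or variant.
        eps = 1e-9
        entropy = -(p_inv * (p_inv + eps).log()
                    + (1 - p_inv) * (1 - p_inv + eps).log()).mean()
        return p_inv, entropy
```

In a full pipeline of this kind, edges with high invariance scores would form the invariant subgraph fed to the downstream encoder, and the entropy term would be added to the training loss.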
Related papers
- Generative Risk Minimization for Out-of-Distribution Generalization on Graphs [71.48583448654522]
We propose an innovative framework, named Generative Risk Minimization (GRM), designed to generate an invariant subgraph for each input graph to be classified, rather than extracting one. We conduct extensive experiments across a variety of real-world graph datasets for both node-level and graph-level OOD generalization.
arXiv Detail & Related papers (2025-02-11T21:24:13Z)
- Enhancing Distribution and Label Consistency for Graph Out-of-Distribution Generalization [45.84955654481374]
We introduce an innovative approach that aims to enhance two types of consistency for graph OOD generalization. With the augmented graphs, we enrich the training data without compromising the integrity of label-graph relationships. We conduct extensive experiments on real-world datasets to demonstrate the superiority of our framework over other state-of-the-art baselines.
arXiv Detail & Related papers (2025-01-07T19:19:22Z)
- Subgraph Aggregation for Out-of-Distribution Generalization on Graphs [29.884717215947745]
SubGraph Aggregation (SuGAr) is designed to learn a diverse set of subgraphs that are crucial for OOD generalization on graphs. Experiments on both synthetic and real-world datasets demonstrate that SuGAr outperforms state-of-the-art methods, achieving up to a 24% improvement in OOD generalization on graphs.
arXiv Detail & Related papers (2024-10-29T16:54:37Z)
- What Improves the Generalization of Graph Transformers? A Theoretical Dive into the Self-attention and Positional Encoding [67.59552859593985]
Graph Transformers, which incorporate self-attention and positional encoding, have emerged as a powerful architecture for various graph learning tasks.
This paper introduces the first theoretical investigation of a shallow Graph Transformer for semi-supervised classification.
arXiv Detail & Related papers (2024-06-04T05:30:16Z)
- Supercharging Graph Transformers with Advective Diffusion [28.40109111316014]
This paper proposes the Advective Diffusion Transformer (AdvDIFFormer), a physics-inspired graph Transformer model designed to address this challenge. We show that AdvDIFFormer has a provable capability for controlling generalization error under topological shifts. Empirically, the model demonstrates superiority in various predictive tasks across information networks, molecular screening, and protein interactions.
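As a loose, illustrative reading of "advective diffusion" (local diffusion that smooths features over neighborhoods, plus a directional transport term), here is a toy explicit-Euler update on node features; the function, its arguments, and the discretization are assumptions for intuition, not AdvDIFFormer's actual formulation.

```python
import torch

def advective_diffusion_step(x, adj_norm, velocity, dt=0.1, diffusivity=1.0):
    """One explicit-Euler step of a toy graph advection-diffusion update.

    x:        [n, d] node features
    adj_norm: [n, n] symmetrically normalized adjacency D^-1/2 A D^-1/2
    velocity: [n, n] row-normalized directed transport weights
    Loose continuous analogy: dx/dt = -diffusivity * L x + (V - I) x,
    i.e., Laplacian smoothing plus transport along directed edges.
    """
    diffusion = x - adj_norm @ x     # L x, with L = I - A_norm
    advection = velocity @ x - x     # move features along directed edges
    return x + dt * (-diffusivity * diffusion + advection)
```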
arXiv Detail & Related papers (2023-10-10T08:40:47Z)
- Discrete Graph Auto-Encoder [52.50288418639075]
We introduce a new framework named Discrete Graph Auto-Encoder (DGAE).
We first use a permutation-equivariant auto-encoder to convert graphs into sets of discrete latent node representations.
In the second step, we sort the sets of discrete latent representations and learn their distribution with a specifically designed auto-regressive model.
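To make the first step of this two-step pipeline concrete, here is a toy VQ-style quantizer mapping continuous node embeddings from a permutation-equivariant encoder to discrete codebook symbols; the class name and details are hypothetical, not DGAE's actual code.

```python
import torch
import torch.nn as nn

class DiscreteLatentQuantizer(nn.Module):
    """Toy VQ-style codebook: maps continuous node embeddings to the
    nearest codebook entry, yielding a set of discrete latents.
    Illustrative only; names and details are assumptions."""

    def __init__(self, num_codes=256, dim=64):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, dim)

    def forward(self, z):
        # z: [num_nodes, dim] continuous latents from a
        # permutation-equivariant encoder (e.g., a GNN).
        dists = torch.cdist(z, self.codebook.weight)  # [num_nodes, num_codes]
        codes = dists.argmin(dim=-1)                  # one discrete symbol per node
        z_q = self.codebook(codes)
        # Straight-through estimator so gradients reach the encoder.
        z_q = z + (z_q - z).detach()
        return codes, z_q
```

The second step would then sort the resulting sets of codes into a canonical order and fit the autoregressive model over them.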
arXiv Detail & Related papers (2023-06-13T12:40:39Z)
- Spectral Augmentations for Graph Contrastive Learning [50.149996923976836]
Contrastive learning has emerged as a premier method for learning representations with or without supervision.
Recent studies have shown its utility in graph representation learning for pre-training.
We propose a set of well-motivated graph transformation operations to provide a bank of candidates when constructing augmentations for a graph contrastive objective.
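For intuition, a "bank" of candidate augmentations for a contrastive objective can be as simple as the sketch below; these are generic structural and feature perturbations for illustration only, not the spectrally motivated operations the paper actually proposes.

```python
import random
import torch

def drop_edges(edge_index, p=0.1):
    """Randomly remove a fraction p of the edges."""
    keep = torch.rand(edge_index.size(1)) >= p
    return edge_index[:, keep]

def mask_features(x, p=0.1):
    """Zero out a random fraction p of the feature dimensions."""
    keep = (torch.rand(x.size(1)) >= p).float()
    return x * keep

# The 'bank': candidate transformations a contrastive objective can
# sample from to build two correlated views of the same graph.
AUGMENTATION_BANK = [drop_edges, mask_features]

def sample_view(x, edge_index):
    """Draw one augmentation from the bank and apply it."""
    aug = random.choice(AUGMENTATION_BANK)
    if aug is drop_edges:
        return x, drop_edges(edge_index)
    return mask_features(x), edge_index
```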
arXiv Detail & Related papers (2023-02-06T16:26:29Z)
- Invariance Principle Meets Out-of-Distribution Generalization on Graphs [66.04137805277632]
The complex nature of graphs thwarts the adoption of the invariance principle for OOD generalization.
Domain or environment partitions, which are often required by OOD methods, can be expensive to obtain for graphs.
We propose a novel framework to explicitly model this process using a contrastive strategy.
arXiv Detail & Related papers (2022-02-11T04:38:39Z)
- Spectral Graph Convolutional Networks With Lifting-based Adaptive Graph Wavelets [81.63035727821145]
Spectral graph convolutional networks (SGCNs) have been attracting increasing attention in graph representation learning.
We propose a novel class of spectral graph convolutional networks that implement graph convolutions with adaptive graph wavelets.
arXiv Detail & Related papers (2021-08-03T17:57:53Z)
- GraphiT: Encoding Graph Structure in Transformers [37.33808493548781]
We show that viewing graphs as sets of node features together with structural and positional information can outperform representations learned with classical graph neural networks (GNNs).
Our model, GraphiT, encodes such information by (i) leveraging relative positional encoding strategies in self-attention scores based on positive definite kernels on graphs, and (ii) enumerating and encoding local sub-structures such as paths of short length.
arXiv Detail & Related papers (2021-06-10T11:36:22Z)
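GraphiT's first ingredient, kernel-based relative positional encoding, admits a compact sketch: modulate self-attention weights by a positive definite kernel on the graph (for instance the heat/diffusion kernel) and renormalize. The code below is a simplified illustration under that reading, not the paper's exact formulation; function names and defaults are assumptions.

```python
import torch

def diffusion_kernel(adj, beta=1.0):
    """Heat/diffusion kernel K = exp(-beta * L), a positive definite
    kernel on the graph, where L is the normalized Laplacian."""
    deg = adj.sum(-1).clamp(min=1)
    d_inv_sqrt = deg.pow(-0.5)
    lap = torch.eye(adj.size(0)) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    return torch.linalg.matrix_exp(-beta * lap)

def kernel_biased_attention(q, k, v, kernel):
    """Self-attention whose weights are modulated by a graph kernel,
    in the spirit of relative positional encodings on graphs."""
    scores = (q @ k.transpose(-2, -1)) / q.size(-1) ** 0.5
    # Multiply attention weights by kernel values between node pairs,
    # then renormalize; nodes close in the graph attend more to each other.
    weights = torch.softmax(scores, dim=-1) * kernel
    weights = weights / weights.sum(-1, keepdim=True).clamp(min=1e-9)
    return weights @ v
```

The second ingredient, enumerating and encoding short local substructures such as paths, would supply additional features to the nodes before attention is applied.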