Advective Diffusion Transformers for Topological Generalization in Graph
Learning
- URL: http://arxiv.org/abs/2310.06417v1
- Date: Tue, 10 Oct 2023 08:40:47 GMT
- Title: Advective Diffusion Transformers for Topological Generalization in Graph
Learning
- Authors: Qitian Wu, Chenxiao Yang, Kaipeng Zeng, Fan Nie, Michael Bronstein,
Junchi Yan
- Abstract summary: We show how graph diffusion equations extrapolate and generalize in the presence of varying graph topologies.
We propose a novel graph encoder backbone, Advective Diffusion Transformer (ADiT), inspired by advective graph diffusion equations.
- Score: 69.2894350228753
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Graph diffusion equations are intimately related to graph neural networks
(GNNs) and have recently attracted attention as a principled framework for
analyzing GNN dynamics, formalizing their expressive power, and justifying
architectural choices. One key open question in graph learning is the
generalization capabilities of GNNs. A major limitation of current approaches
hinges on the assumption that the graph topologies in the training and test
sets come from the same distribution. In this paper, we make steps towards
understanding the generalization of GNNs by exploring how graph diffusion
equations extrapolate and generalize in the presence of varying graph
topologies. We first show deficiencies in the generalization capability of
existing models built upon local diffusion on graphs, stemming from the
exponential sensitivity to topology variation. Our subsequent analysis reveals
the promise of non-local diffusion, which advocates for feature propagation
over fully-connected latent graphs, under the assumption of a specific
data-generating condition. In addition to these findings, we propose a novel
graph encoder backbone, Advective Diffusion Transformer (ADiT), inspired by
advective graph diffusion equations that have a closed-form solution backed up
with theoretical guarantees of desired generalization under topological
distribution shifts. The new model, functioning as a versatile graph
Transformer, demonstrates superior performance across a wide range of graph
learning tasks.
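To make the abstract's point about local diffusion concrete, here is a minimal sketch of the generic graph heat equation dx/dt = -Lx (with L = D - A), integrated by explicit Euler steps. It is not the ADiT model and is not taken from the paper: the four-node graphs, the step size `tau`, the step count, and the helper names `laplacian` and `diffuse` are all assumptions chosen for illustration. The sketch runs the same initial features through two topologies that differ by a single edge and measures how far the outputs drift apart.

```python
# Minimal sketch of local graph diffusion, dx/dt = -L x, with L = D - A.
# Illustrative only: this is NOT ADiT; graphs, tau and steps are assumptions.

def laplacian(n, edges):
    """Return the combinatorial Laplacian L = D - A as a list of rows."""
    L = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        L[i][j] -= 1.0
        L[j][i] -= 1.0
        L[i][i] += 1.0
        L[j][j] += 1.0
    return L


def diffuse(x0, L, tau=0.1, steps=50):
    """Integrate dx/dt = -L x with explicit Euler steps of size tau."""
    x = list(x0)
    n = len(x)
    for _ in range(steps):
        Lx = [sum(L[i][j] * x[j] for j in range(n)) for i in range(n)]
        x = [x[i] - tau * Lx[i] for i in range(n)]
    return x


x0 = [1.0, 0.0, 0.0, 0.0]                  # one scalar feature per node
path_edges = [(0, 1), (1, 2), (2, 3)]      # hypothetical "training" topology
shifted_edges = path_edges + [(0, 3)]      # same nodes, one extra edge

x_path = diffuse(x0, laplacian(4, path_edges))
x_shifted = diffuse(x0, laplacian(4, shifted_edges))

# Largest per-node change in the output caused by the single added edge.
gap = max(abs(a - b) for a, b in zip(x_path, x_shifted))
print(x_path)
print(x_shifted)
print(gap)
```

Because the propagation operator is built directly from the observed edges, changing one edge changes the diffused features at every node. This toy run only shows that dependence; it does not reproduce the paper's sensitivity analysis or its non-local and advective terms.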
Related papers
- What Improves the Generalization of Graph Transformers? A Theoretical Dive into the Self-attention and Positional Encoding [67.59552859593985]
Graph Transformers, which incorporate self-attention and positional encoding, have emerged as a powerful architecture for various graph learning tasks.
This paper introduces the first theoretical investigation of a shallow Graph Transformer for semi-supervised classification.
arXiv Detail & Related papers (2024-06-04T05:30:16Z)
- Graph Neural Aggregation-diffusion with Metastability [4.040326569845733]
Continuous graph neural models based on differential equations have expanded the architecture of graph neural networks (GNNs).
We propose GRADE inspired by graph aggregation-diffusion equations, which includes the delicate balance between nonlinear diffusion and aggregation induced by interaction potentials.
We show that GRADE achieves competitive performance across various benchmarks and alleviates the over-smoothing issue in GNNs.
arXiv Detail & Related papers (2024-03-29T15:05:57Z)
- Revealing Decurve Flows for Generalized Graph Propagation [108.80758541147418]
This study addresses the limitations of the traditional analysis of message-passing, central to graph learning, by defining generalized propagation with directed and weighted graphs.
We include a preliminary exploration of learned propagation patterns in datasets, a first in the field.
arXiv Detail & Related papers (2024-02-13T14:13:17Z)
- MentorGNN: Deriving Curriculum for Pre-Training GNNs [61.97574489259085]
We propose an end-to-end model named MentorGNN that aims to supervise the pre-training process of GNNs across graphs.
We shed new light on the problem of domain adaption on relational data (i.e., graphs) by deriving a natural and interpretable upper bound on the generalization error of the pre-trained GNNs.
arXiv Detail & Related papers (2022-08-21T15:12:08Z)
- Learning Graph Structure from Convolutional Mixtures [119.45320143101381]
We propose a graph convolutional relationship between the observed and latent graphs, and formulate the graph learning task as a network inverse (deconvolution) problem.
In lieu of eigendecomposition-based spectral methods, we unroll and truncate proximal gradient iterations to arrive at a parameterized neural network architecture that we call a Graph Deconvolution Network (GDN).
GDNs can learn a distribution of graphs in a supervised fashion, perform link prediction or edge-weight regression tasks by adapting the loss function, and they are inherently inductive.
arXiv Detail & Related papers (2022-05-19T14:08:15Z)
- Neural Sheaf Diffusion: A Topological Perspective on Heterophily and
Oversmoothing in GNNs [16.88394293874848]
We use cellular sheaf theory to show that the underlying geometry of the graph is deeply linked with the performance of GNNs.
By considering a hierarchy of increasingly general sheaves, we study how the ability of the sheaf diffusion process to achieve linear separation of the classes in the infinite time limit expands.
We prove that when the sheaf is non-trivial, discretised parametric diffusion processes have greater control than GNNs over their behaviour.
arXiv Detail & Related papers (2022-02-09T17:25:02Z)
- Generalization of graph network inferences in higher-order graphical
models [5.33024001730262]
Probabilistic graphical models provide a powerful tool to describe complex statistical structure.
Inferences such as marginalization are intractable for general graphs.
We define the Recurrent Factor Graph Neural Network (RF-GNN) to achieve fast approximate inference on graphical models that involve many-variable interactions.
arXiv Detail & Related papers (2021-07-12T20:51:27Z)
- GRAND: Graph Neural Diffusion [15.00135729657076]
We present Graph Neural Diffusion (GRAND) that approaches deep learning on graphs as a continuous diffusion process.
In our model, the layer structure and topology correspond to the discretisation choices of temporal and spatial operators.
Key to the success of our models is stability with respect to perturbations in the data, which is addressed for both implicit and explicit discretisation schemes.
arXiv Detail & Related papers (2021-06-21T09:10:57Z)
- Hyperbolic Variational Graph Neural Network for Modeling Dynamic Graphs [77.33781731432163]
For the first time, we learn dynamic graph representations in hyperbolic space, with the aim of inferring node representations.
We present a novel Hyperbolic Variational Graph Network, referred to as HVGNN.
In particular, to model the dynamics, we introduce a Temporal GNN (TGNN) based on a theoretically grounded time encoding approach.
arXiv Detail & Related papers (2021-04-06T01:44:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.