Related papers: Advective Diffusion Transformers for Topological Generalization in Graph Learning

Advective Diffusion Transformers for Topological Generalization in Graph Learning

URL: http://arxiv.org/abs/2310.06417v1
Date: Tue, 10 Oct 2023 08:40:47 GMT
Title: Advective Diffusion Transformers for Topological Generalization in Graph Learning
Authors: Qitian Wu, Chenxiao Yang, Kaipeng Zeng, Fan Nie, Michael Bronstein, Junchi Yan
Abstract summary: We show how graph diffusion equations extrapolate and generalize in the presence of varying graph topologies. We propose a novel graph encoder backbone, Advective Diffusion Transformer (ADiT), inspired by advective graph diffusion equations.
Score: 69.2894350228753
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Graph diffusion equations are intimately related to graph neural networks (GNNs) and have recently attracted attention as a principled framework for analyzing GNN dynamics, formalizing their expressive power, and justifying architectural choices. One key open questions in graph learning is the generalization capabilities of GNNs. A major limitation of current approaches hinges on the assumption that the graph topologies in the training and test sets come from the same distribution. In this paper, we make steps towards understanding the generalization of GNNs by exploring how graph diffusion equations extrapolate and generalize in the presence of varying graph topologies. We first show deficiencies in the generalization capability of existing models built upon local diffusion on graphs, stemming from the exponential sensitivity to topology variation. Our subsequent analysis reveals the promise of non-local diffusion, which advocates for feature propagation over fully-connected latent graphs, under the assumption of a specific data-generating condition. In addition to these findings, we propose a novel graph encoder backbone, Advective Diffusion Transformer (ADiT), inspired by advective graph diffusion equations that have a closed-form solution backed up with theoretical guarantees of desired generalization under topological distribution shifts. The new model, functioning as a versatile graph Transformer, demonstrates superior performance across a wide range of graph learning tasks.

Related papers

Do Graph Diffusion Models Accurately Capture and Generate Substructure Distributions? [28.19526635775658]
Diffusion models do not possess universal expressivity to accurately model the distribution scores of complex graph data. Our work addresses this limitation by focusing on the frequency of specific substructures as a key characteristic of target graph distributions. We establish a theoretical connection between the expressivity of Graph Neural Networks (GNNs) and the overall performance of graph diffusion models.
arXiv Detail & Related papers (2025-02-04T17:04:16Z)
Transformers from Diffusion: A Unified Framework for Neural Message Passing [79.9193447649011]
Message passing neural networks (MPNNs) have become a de facto class of model solutions.<n>We propose an energy-constrained diffusion model, which integrates the inductive bias of diffusion with layer-wise constraints of energy.<n>Building on these insights, we devise a new class of message passing models, dubbed Transformers (DIFFormer), whose global attention layers are derived from the principled energy-constrained diffusion framework.
arXiv Detail & Related papers (2024-09-13T17:54:41Z)
Learning Divergence Fields for Shift-Robust Graph Representations [73.11818515795761]
In this work, we propose a geometric diffusion model with learnable divergence fields for the challenging problem with interdependent data. We derive a new learning objective through causal inference, which can guide the model to learn generalizable patterns of interdependence that are insensitive across domains.
arXiv Detail & Related papers (2024-06-07T14:29:21Z)
What Improves the Generalization of Graph Transformers? A Theoretical Dive into the Self-attention and Positional Encoding [67.59552859593985]
Graph Transformers, which incorporate self-attention and positional encoding, have emerged as a powerful architecture for various graph learning tasks. This paper introduces first theoretical investigation of a shallow Graph Transformer for semi-supervised classification.
arXiv Detail & Related papers (2024-06-04T05:30:16Z)
Graph Neural Aggregation-diffusion with Metastability [4.040326569845733]
Continuous graph neural models based on differential equations have expanded the architecture of graph neural networks (GNNs) We propose GRADE inspired by graph aggregation-diffusion equations, which includes the delicate balance between nonlinear diffusion and aggregation induced by interaction potentials. We prove that GRADE achieves competitive performance across various benchmarks and alleviates the over-smoothing issue in GNNs.
arXiv Detail & Related papers (2024-03-29T15:05:57Z)
Revealing Decurve Flows for Generalized Graph Propagation [108.80758541147418]
This study addresses the limitations of the traditional analysis of message-passing, central to graph learning, by defining em textbfgeneralized propagation with directed and weighted graphs. We include a preliminary exploration of learned propagation patterns in datasets, a first in the field.
arXiv Detail & Related papers (2024-02-13T14:13:17Z)
PGODE: Towards High-quality System Dynamics Modeling [40.76121531452706]
This paper studies the problem of modeling multi-agent dynamical systems, where agents could interact mutually to influence their behaviors. Recent research predominantly uses geometric graphs to depict these mutual interactions, which are then captured by graph neural networks (GNNs) We propose a new approach named Prototypical Graph ODE to address the problem.
arXiv Detail & Related papers (2023-11-11T12:04:47Z)
Gramian Angular Fields for leveraging pretrained computer vision models with anomalous diffusion trajectories [0.9012198585960443]
We present a new data-driven method for working with diffusive trajectories. This method utilizes Gramian Angular Fields (GAF) to encode one-dimensional trajectories as images. We leverage two well-established pre-trained computer-vision models, ResNet and MobileNet, to characterize the underlying diffusive regime.
arXiv Detail & Related papers (2023-09-02T17:22:45Z)
SEGNO: Generalizing Equivariant Graph Neural Networks with Physical Inductive Biases [66.61789780666727]
We show how the second-order continuity can be incorporated into GNNs while maintaining the equivariant property. We also offer theoretical insights into SEGNO, highlighting that it can learn a unique trajectory between adjacent states. Our model yields a significant improvement over the state-of-the-art baselines.
arXiv Detail & Related papers (2023-08-25T07:15:58Z)
DIFFormer: Scalable (Graph) Transformers Induced by Energy Constrained Diffusion [66.21290235237808]
We introduce an energy constrained diffusion model which encodes a batch of instances from a dataset into evolutionary states. We provide rigorous theory that implies closed-form optimal estimates for the pairwise diffusion strength among arbitrary instance pairs. Experiments highlight the wide applicability of our model as a general-purpose encoder backbone with superior performance in various tasks.
arXiv Detail & Related papers (2023-01-23T15:18:54Z)
MentorGNN: Deriving Curriculum for Pre-Training GNNs [61.97574489259085]
We propose an end-to-end model named MentorGNN that aims to supervise the pre-training process of GNNs across graphs. We shed new light on the problem of domain adaption on relational data (i.e., graphs) by deriving a natural and interpretable upper bound on the generalization error of the pre-trained GNNs.
arXiv Detail & Related papers (2022-08-21T15:12:08Z)
Learning Graph Structure from Convolutional Mixtures [119.45320143101381]
We propose a graph convolutional relationship between the observed and latent graphs, and formulate the graph learning task as a network inverse (deconvolution) problem. In lieu of eigendecomposition-based spectral methods, we unroll and truncate proximal gradient iterations to arrive at a parameterized neural network architecture that we call a Graph Deconvolution Network (GDN) GDNs can learn a distribution of graphs in a supervised fashion, perform link prediction or edge-weight regression tasks by adapting the loss function, and they are inherently inductive.
arXiv Detail & Related papers (2022-05-19T14:08:15Z)
Neural Sheaf Diffusion: A Topological Perspective on Heterophily and Oversmoothing in GNNs [16.88394293874848]
We use cellular sheaf theory to show that the underlying geometry of the graph is deeply linked with the performance of GNNs. By considering a hierarchy of increasingly general sheaves, we study how the ability of the sheaf diffusion process to achieve linear separation of the classes in the infinite time limit expands. We prove that when the sheaf is non-trivial, discretised parametric diffusion processes have greater control than GNNs over their behaviour.
arXiv Detail & Related papers (2022-02-09T17:25:02Z)
Graph Kernel Neural Networks [53.91024360329517]
We propose to use graph kernels, i.e. kernel functions that compute an inner product on graphs, to extend the standard convolution operator to the graph domain. This allows us to define an entirely structural model that does not require computing the embedding of the input graph. Our architecture allows to plug-in any type of graph kernels and has the added benefit of providing some interpretability.
arXiv Detail & Related papers (2021-12-14T14:48:08Z)
Generalization of graph network inferences in higher-order graphical models [5.33024001730262]
Probabilistic graphical models provide a powerful tool to describe complex statistical structure. inferences such as marginalization are intractable for general graphs. We define the Recurrent Factor Graph Neural Network (RF-GNN) to achieve fast approximate inference on graphical models that involve many-variable interactions.
arXiv Detail & Related papers (2021-07-12T20:51:27Z)
GRAND: Graph Neural Diffusion [15.00135729657076]
We present Graph Neural Diffusion (GRAND) that approaches deep learning on graphs as a continuous diffusion process. In our model, the layer structure and topology correspond to the discretisation choices of temporal and spatial operators. Key to the success of our models are stability with respect to perturbations in the data and this is addressed for both implicit and explicit discretisation schemes.
arXiv Detail & Related papers (2021-06-21T09:10:57Z)
Hyperbolic Variational Graph Neural Network for Modeling Dynamic Graphs [77.33781731432163]
We learn dynamic graph representation in hyperbolic space, for the first time, which aims to infer node representations. We present a novel Hyperbolic Variational Graph Network, referred to as HVGNN. In particular, to model the dynamics, we introduce a Temporal GNN (TGNN) based on a theoretically grounded time encoding approach.
arXiv Detail & Related papers (2021-04-06T01:44:15Z)

This list is automatically generated from the titles and abstracts of the papers in this site.