A Generalization of ViT/MLP-Mixer to Graphs
- URL: http://arxiv.org/abs/2212.13350v2
- Date: Wed, 31 May 2023 03:19:44 GMT
- Title: A Generalization of ViT/MLP-Mixer to Graphs
- Authors: Xiaoxin He, Bryan Hooi, Thomas Laurent, Adam Perold, Yann LeCun,
Xavier Bresson
- Abstract summary: We introduce a new class of GNNs called Graph ViT/MLP-Mixer.
They capture long-range dependency and mitigate the issue of over-squashing.
They offer better speed and memory efficiency, with complexity linear in the number of nodes and edges.
- Score: 32.86160915431453
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graph Neural Networks (GNNs) have shown great potential in the field of graph
representation learning. Standard GNNs define a local message-passing mechanism
which propagates information over the whole graph domain by stacking multiple
layers. This paradigm suffers from two major limitations, over-squashing and
poor long-range dependencies, which can be addressed with global attention, but
only at the cost of quadratic computational complexity. In this
work, we propose an alternative approach to overcome these structural
limitations by leveraging the ViT/MLP-Mixer architectures introduced in
computer vision. We introduce a new class of GNNs, called Graph ViT/MLP-Mixer,
that holds three key properties. First, they capture long-range dependency and
mitigate the issue of over-squashing as demonstrated on Long Range Graph
Benchmark and TreeNeighbourMatch datasets. Second, they offer better speed and
memory efficiency, with complexity linear in the number of nodes and edges,
surpassing the related Graph Transformer and expressive GNN models. Third, they
show high expressivity in terms of graph isomorphism as they can distinguish at
least 3-WL non-isomorphic graphs. We test our architecture on 4 simulated
datasets and 7 real-world benchmarks, and show highly competitive results on
all of them. The source code is available for reproducibility at:
https://github.com/XiaoxinHe/Graph-ViT-MLPMixer.
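To make the architecture concrete, below is a minimal, hedged PyTorch sketch of the Graph MLP-Mixer idea: the graph is split into node patches (the paper extracts patches with a graph-partitioning algorithm and encodes them with a GNN patch encoder; the given patch index lists and the single mean-aggregation step below are simplified stand-ins), each patch becomes one token, and standard MLP-Mixer layers then mix information across patch tokens and feature channels before a graph-level readout. Class names, dimensions, and the toy graph are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch of the Graph MLP-Mixer idea (illustrative, not the official code):
# 1) split the graph into node patches (assumed given; the paper uses graph partitioning),
# 2) encode each patch into one token with a simple mean-aggregation step
#    (a stand-in for the paper's GNN patch encoder),
# 3) mix information across patch tokens and channels with MLP-Mixer layers,
# 4) pool the tokens into a graph-level prediction.
import torch
import torch.nn as nn


class MixerLayer(nn.Module):
    """Standard MLP-Mixer block: token mixing across patches, then channel mixing."""
    def __init__(self, num_patches: int, dim: int, expansion: int = 2):
        super().__init__()
        self.token_norm = nn.LayerNorm(dim)
        self.token_mlp = nn.Sequential(
            nn.Linear(num_patches, expansion * num_patches), nn.GELU(),
            nn.Linear(expansion * num_patches, num_patches))
        self.chan_norm = nn.LayerNorm(dim)
        self.chan_mlp = nn.Sequential(
            nn.Linear(dim, expansion * dim), nn.GELU(),
            nn.Linear(expansion * dim, dim))

    def forward(self, x):  # x: (num_patches, dim)
        x = x + self.token_mlp(self.token_norm(x).transpose(0, 1)).transpose(0, 1)
        return x + self.chan_mlp(self.chan_norm(x))


class GraphMLPMixerSketch(nn.Module):
    def __init__(self, in_dim, dim, num_patches, num_layers, num_classes):
        super().__init__()
        self.embed = nn.Linear(in_dim, dim)
        self.mixers = nn.ModuleList(
            [MixerLayer(num_patches, dim) for _ in range(num_layers)])
        self.head = nn.Linear(dim, num_classes)

    def encode_patch(self, x, adj):
        """One mean-aggregation message-passing step, then mean pooling to one token."""
        h = self.embed(x)
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1.0)
        return (h + (adj @ h) / deg).mean(dim=0)  # (dim,)

    def forward(self, x, adj, patches):
        # patches: list of LongTensors holding the node indices of each patch
        tokens = torch.stack(
            [self.encode_patch(x[idx], adj[idx][:, idx]) for idx in patches])
        for layer in self.mixers:
            tokens = layer(tokens)               # (num_patches, dim)
        return self.head(tokens.mean(dim=0))     # graph-level logits


# Toy usage: a 6-node graph with random features, split into two 3-node patches.
x = torch.randn(6, 8)
adj = ((torch.rand(6, 6) > 0.5).float() + torch.eye(6)).clamp(max=1.0)
adj = ((adj + adj.t()) > 0).float()              # symmetrize
patches = [torch.arange(0, 3), torch.arange(3, 6)]
model = GraphMLPMixerSketch(in_dim=8, dim=16, num_patches=2, num_layers=2, num_classes=3)
print(model(x, adj, patches).shape)              # torch.Size([3])
```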
Related papers
- Learning Long Range Dependencies on Graphs via Random Walks [6.7864586321550595]
Message-passing graph neural networks (GNNs) excel at capturing local relationships but struggle with long-range dependencies in graphs.
Graph transformers (GTs) enable global information exchange but often oversimplify the graph structure by representing graphs as sets of fixed-length vectors.
This work introduces a novel architecture that overcomes the shortcomings of both approaches by combining the long-range information of random walks with local message passing.
arXiv Detail & Related papers (2024-06-05T15:36:57Z)
- Learning on Large Graphs using Intersecting Communities [13.053266613831447]
MPNNs iteratively update each node's representation in an input graph by aggregating messages from the node's neighbors.
MPNNs can quickly become prohibitively expensive for large graphs unless they are very sparse.
We propose approximating the input graph as an intersecting community graph (ICG) -- a combination of intersecting cliques.
arXiv Detail & Related papers (2024-05-31T09:26:26Z)
- Spectral Greedy Coresets for Graph Neural Networks [61.24300262316091]
The ubiquity of large-scale graphs in node-classification tasks hinders the real-world applications of Graph Neural Networks (GNNs).
This paper studies graph coresets for GNNs and avoids the interdependence issue by selecting ego-graphs based on their spectral embeddings.
Our spectral greedy graph coreset (SGGC) scales to graphs with millions of nodes, obviates the need for model pre-training, and applies to low-homophily graphs.
arXiv Detail & Related papers (2024-05-27T17:52:12Z)
- Graph Sparsification via Mixture of Graphs [67.40204130771967]
We introduce Mixture-of-Graphs (MoG) to dynamically select tailored pruning solutions for each node.
MoG incorporates multiple sparsifier experts, each characterized by unique sparsity levels and pruning criteria, and selects the appropriate experts for each node.
Experiments on four large-scale OGB datasets and two superpixel datasets, equipped with five GNNs, demonstrate that MoG identifies subgraphs at higher sparsity levels.
arXiv Detail & Related papers (2024-05-23T07:40:21Z)
- SpikeGraphormer: A High-Performance Graph Transformer with Spiking Graph Attention [1.4126245676224705]
Graph Transformers have emerged as a promising solution to alleviate the inherent limitations of Graph Neural Networks (GNNs).
We propose a novel insight into integrating SNNs with Graph Transformers and design a Spiking Graph Attention (SGA) module.
SpikeGraphormer consistently outperforms existing state-of-the-art approaches across various datasets.
arXiv Detail & Related papers (2024-03-21T03:11:53Z)
- Graph Transformers for Large Graphs [57.19338459218758]
This work advances representation learning on single large-scale graphs with a focus on identifying model characteristics and critical design constraints.
A key innovation of this work lies in the creation of a fast neighborhood sampling technique coupled with a local attention mechanism.
We report a 3x speedup and 16.8% performance gain on ogbn-products and snap-patents, while we also scale LargeGT on ogbn-100M with a 5.9% performance improvement.
arXiv Detail & Related papers (2023-12-18T11:19:23Z)
- Fused Gromov-Wasserstein Graph Mixup for Graph-level Classifications [44.15102300886821]
We propose a novel graph mixup algorithm called FGWMixup, which seeks a midpoint of source graphs in the Fused Gromov-Wasserstein metric space.
Experiments conducted on five datasets demonstrate that FGWMixup effectively improves the generalizability and robustness of GNNs.
arXiv Detail & Related papers (2023-06-28T07:00:12Z)
- NodeFormer: A Scalable Graph Structure Learning Transformer for Node Classification [70.51126383984555]
We introduce a novel all-pair message passing scheme for efficiently propagating node signals between arbitrary nodes.
The efficient computation is enabled by a kernelized Gumbel-Softmax operator (a toy sketch of Gumbel-Softmax edge sampling follows this list).
Experiments demonstrate the promising efficacy of the method in various tasks including node classification on graphs.
arXiv Detail & Related papers (2023-06-14T09:21:15Z)
- Graph Mixture of Experts: Learning on Large-Scale Graphs with Explicit Diversity Modeling [60.0185734837814]
Graph neural networks (GNNs) have found extensive applications in learning from graph data.
To bolster the generalization capacity of GNNs, it has become customary to augment training graph structures with techniques like graph augmentations.
This study introduces the concept of Mixture-of-Experts (MoE) to GNNs, with the aim of augmenting their capacity to adapt to a diverse range of training graph structures.
arXiv Detail & Related papers (2023-04-06T01:09:36Z)
- Node Feature Extraction by Self-Supervised Multi-scale Neighborhood Prediction [123.20238648121445]
We propose a new self-supervised learning framework, Graph Information Aided Node feature exTraction (GIANT)
GIANT makes use of the eXtreme Multi-label Classification (XMC) formalism, which is crucial for fine-tuning the language model based on graph information.
We demonstrate the superior performance of GIANT over the standard GNN pipeline on Open Graph Benchmark datasets.
arXiv Detail & Related papers (2021-10-29T19:55:12Z)
- Scalable Graph Neural Networks for Heterogeneous Graphs [12.44278942365518]
Graph neural networks (GNNs) are a popular class of parametric model for learning over graph-structured data.
Recent work has argued that GNNs primarily use the graph for feature smoothing and has shown competitive results on benchmark tasks.
In this work, we ask whether these results can be extended to heterogeneous graphs, which encode multiple types of relationship between different entities.
arXiv Detail & Related papers (2020-11-19T06:03:35Z)
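The NodeFormer entry above credits its efficiency to a kernelized Gumbel-Softmax operator. The toy function below only illustrates the Gumbel-Softmax part: it samples soft "edges" over all node pairs and aggregates messages through them, using the full N x N score matrix (quadratic cost); the kernelized approximation that brings this down to linear complexity is not reproduced here, and the function name and temperature are assumptions.

```python
# Toy illustration of Gumbel-Softmax "edge" sampling for all-pair message passing.
# This dense version has O(N^2) cost; NodeFormer's kernelized operator (not shown)
# is what reduces this to linear complexity. Names/temperature are assumptions.
import torch
import torch.nn.functional as F

def gumbel_softmax_message_passing(h: torch.Tensor, tau: float = 0.5) -> torch.Tensor:
    """h: (N, d) node features -> (N, d) features aggregated over sampled soft edges."""
    scores = h @ h.t() / h.shape[-1] ** 0.5                # all-pair similarity
    gumbel = -torch.log(-torch.log(torch.rand_like(scores) + 1e-9) + 1e-9)
    weights = F.softmax((scores + gumbel) / tau, dim=-1)   # soft sampled edges
    return weights @ h                                     # message aggregation

h = torch.randn(10, 16)
print(gumbel_softmax_message_passing(h).shape)             # torch.Size([10, 16])
```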
This list is automatically generated from the titles and abstracts of the papers on this site.