Bridging Graph Position Encodings for Transformers with Weighted
Graph-Walking Automata
- URL: http://arxiv.org/abs/2212.06898v4
- Date: Sun, 9 Apr 2023 17:43:18 GMT
- Title: Bridging Graph Position Encodings for Transformers with Weighted
Graph-Walking Automata
- Authors: Patrick Soga, David Chiang
- Abstract summary: We introduce a new graph PE, Graph Automaton PE (GAPE), based on weighted graph-walking automata.
We compare the performance of GAPE with other PE schemes on both machine translation and graph-structured tasks.
- Score: 20.539191533339427
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A current goal in the graph neural network literature is to enable
transformers to operate on graph-structured data, given their success on
language and vision tasks. Since the transformer's original sinusoidal
positional encodings (PEs) are not applicable to graphs, recent work has
focused on developing graph PEs, rooted in spectral graph theory or various
spatial features of a graph. In this work, we introduce a new graph PE, Graph
Automaton PE (GAPE), based on weighted graph-walking automata (a novel
extension of graph-walking automata). We compare the performance of GAPE with
other PE schemes on both machine translation and graph-structured tasks, and we
show that it generalizes several other PEs. An additional contribution of this
study is a theoretical and controlled experimental comparison of many recent
PEs in graph transformers, independent of the use of edge features.
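One way to picture the automaton idea: each node carries a vector of automaton state weights, and those weights are propagated along edges through a learned state-transition matrix until a fixed point is reached, which then serves as the node's PE. The NumPy sketch below illustrates that intuition only; the function, the symmetric normalization, and the damped fixed-point iteration are illustrative assumptions, not the paper's actual definition of GAPE.

```python
import numpy as np

def gape_sketch(A, k=8, iters=50, damping=0.5, seed=0):
    """Illustrative automaton-style PE (NOT the paper's exact construction).

    A : (n, n) adjacency matrix; k : number of automaton states (= PE dim).
    Returns an (n, k) matrix: node v's row is its vector of state weights.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    # Learned in a real model; random here purely for illustration.
    T = rng.normal(size=(k, k))
    T /= np.linalg.norm(T, 2)                # unit spectral norm
    init = rng.normal(size=(n, k))           # initial state weights per node
    deg = np.clip(A.sum(axis=1), 1, None)
    d = 1.0 / np.sqrt(deg)
    W = d[:, None] * A * d[None, :]          # symmetric normalization, ||W|| <= 1
    P = init.copy()
    for _ in range(iters):
        # One walk step: propagate along edges, then mix automaton states.
        P = init + damping * (W @ P @ T)
    return P                                 # fixed point of a linear system
```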
Related papers
- Towards Mechanistic Interpretability of Graph Transformers via Attention Graphs [16.249474010042736]
We introduce Attention Graphs, a new tool for mechanistic interpretability of Graph Neural Networks (GNNs) and Graph Transformers.
Attention Graphs aggregate attention matrices across Transformer layers and heads to describe how information flows among input nodes.
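As a rough illustration of the aggregation step described above, one could collapse the per-layer, per-head attention tensors into a single node-to-node matrix. The reduction used here (mean or max) is an assumption; the paper's actual aggregation may differ (e.g., rollout-style composition across layers).

```python
import torch

def attention_graph(attn, reduce="mean"):
    """Collapse per-layer, per-head attention into one information-flow matrix.

    attn : (num_layers, num_heads, n, n) tensor of attention weights.
    """
    flat = attn.reshape(-1, *attn.shape[-2:])    # (layers * heads, n, n)
    if reduce == "mean":
        return flat.mean(dim=0)
    return flat.max(dim=0).values                # strongest edge anywhere
```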
arXiv Detail & Related papers (2025-02-17T22:35:16Z)
- Graph Transformers: A Survey [15.68583521879617]
Graph transformers are a recent advancement in machine learning, offering a new class of neural network models for graph-structured data.
This survey provides an in-depth review of recent progress and challenges in graph transformer research.
arXiv Detail & Related papers (2024-07-13T05:15:24Z)
- Through the Dual-Prism: A Spectral Perspective on Graph Data Augmentation for Graph Classification [67.35058947477631]
We introduce Dual-Prism (DP) augmentation methods, including DP-Noise and DP-Mask, which retain essential graph properties while diversifying augmented graphs.
Extensive experiments validate the efficiency of our approach, providing a new and promising direction for graph data augmentation.
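A minimal sketch of the spectral idea, assuming "essential properties" means the low-frequency end of the Laplacian spectrum: keep the lowest eigenvalues intact and add noise to the rest. The function name and the keep/noise parameters are hypothetical, not the paper's exact DP-Noise operator.

```python
import numpy as np

def dp_noise_sketch(A, keep=0.2, sigma=0.1, seed=0):
    """Perturb the high-frequency Laplacian spectrum, keep the low end intact.

    A : (n, n) symmetric adjacency matrix. Returns a dense, real-valued
    augmented adjacency; thresholding back to a 0/1 graph is left out.
    """
    rng = np.random.default_rng(seed)
    L = np.diag(A.sum(axis=1)) - A           # combinatorial Laplacian L = D - A
    w, V = np.linalg.eigh(L)                 # eigenvalues in ascending order
    cut = int(len(w) * keep)                 # low-frequency part to preserve
    w_aug = w.copy()
    w_aug[cut:] += rng.normal(scale=sigma, size=len(w) - cut)
    L_aug = V @ np.diag(w_aug) @ V.T
    return np.diag(np.diag(L_aug)) - L_aug   # invert L = D - A
```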
arXiv Detail & Related papers (2024-01-18T12:58:53Z)
- Transformers as Graph-to-Graph Models [13.630495199720423]
We argue that Transformers are essentially graph-to-graph models, with sequences just being a special case.
Our Graph-to-Graph Transformer architecture makes this ability explicit, by inputting graph edges into the attention weight computations and predicting graph edges with attention-like functions.
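A simplified sketch of feeding edges into the attention weight computation: here a scalar edge embedding biases the attention logits. The actual Graph-to-Graph Transformer also predicts edges with attention-like functions and injects edge embeddings more deeply; this bias-only form is an assumption for illustration.

```python
import torch

def edge_biased_attention(Q, K, V, edge_bias):
    """Self-attention whose logits are biased by per-pair edge embeddings.

    Q, K, V   : (n, d) node queries / keys / values
    edge_bias : (n, n) scalar embedding of the (possibly absent) edge i -> j
    """
    d = Q.shape[-1]
    scores = Q @ K.T / d ** 0.5 + edge_bias  # edges enter the weight computation
    attn = torch.softmax(scores, dim=-1)
    return attn @ V
```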
arXiv Detail & Related papers (2023-10-27T07:21:37Z)
- Deep Prompt Tuning for Graph Transformers [55.2480439325792]
Fine-tuning is resource-intensive and requires storing multiple copies of large models.
We propose a novel approach called deep graph prompt tuning as an alternative to fine-tuning.
By freezing the pre-trained parameters and only updating the added tokens, our approach reduces the number of free parameters and eliminates the need for multiple model copies.
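A minimal sketch of the freeze-and-prompt recipe, assuming input-level prompt tokens; "deep" prompt tuning typically injects prompts at every layer, so treat this as a simplified stand-in rather than the paper's method.

```python
import torch
import torch.nn as nn

class PromptedGraphTransformer(nn.Module):
    """Frozen pre-trained backbone; only the added prompt tokens train."""

    def __init__(self, backbone, num_prompts=8, dim=256):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():
            p.requires_grad = False          # freeze all pre-trained weights
        self.prompts = nn.Parameter(torch.randn(num_prompts, dim) * 0.02)

    def forward(self, node_feats):
        # Prepend trainable prompt tokens to the node sequence.
        x = torch.cat([self.prompts, node_feats], dim=0)
        return self.backbone(x)
```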
arXiv Detail & Related papers (2023-09-18T20:12:17Z)
- Spectral Augmentations for Graph Contrastive Learning [50.149996923976836]
Contrastive learning has emerged as a premier method for learning representations with or without supervision.
Recent studies have shown its utility in graph representation learning for pre-training.
We propose a set of well-motivated graph transformation operations to provide a bank of candidates when constructing augmentations for a graph contrastive objective.
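For reference, a common simplified form of the contrastive objective such augmentations feed into (NT-Xent/InfoNCE over two augmented views of the same graphs); the paper's exact objective may differ.

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, tau=0.5):
    """NT-Xent loss between embeddings of two augmented views.

    z1, z2 : (n, d) graph/node embeddings; row i of z1 and z2 come from
    two augmentations of the same graph (the positive pair).
    """
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.T / tau                              # (n, n) similarities
    labels = torch.arange(z1.shape[0], device=z1.device)  # positives on diagonal
    return F.cross_entropy(logits, labels)
```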
arXiv Detail & Related papers (2023-02-06T16:26:29Z)
- Transformers over Directed Acyclic Graphs [6.263470141349622]
We study transformers over directed acyclic graphs (DAGs) and propose architecture adaptations tailored to DAGs.
We show that these adaptations are effective, making graph transformers generally outperform graph neural networks tailored to DAGs and improving state-of-the-art graph transformer performance in both quality and efficiency.
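One plausible DAG-tailored adaptation is restricting each node's attention to its ancestors via the transitive closure, sketched below; this masking choice is an assumption, not necessarily the paper's mechanism.

```python
import numpy as np

def dag_ancestor_mask(A):
    """Boolean mask: entry [j, i] is True if node j may attend to node i.

    A : (n, n) adjacency with A[i, j] = 1 for a directed edge i -> j.
    """
    n = A.shape[0]
    reach = (np.eye(n) + A).astype(bool)     # paths of length <= 1
    for _ in range(max(1, int(np.ceil(np.log2(n))))):
        r = reach.astype(np.int64)
        reach = (r @ r) > 0                  # doubling: covers 2x longer paths
    return reach.T                           # row j: ancestors of node j
```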
arXiv Detail & Related papers (2022-10-24T12:04:52Z)
- Transformer for Graphs: An Overview from Architecture Perspective [86.3545861392215]
It is imperative to sort out the existing Transformer models for graphs and to systematically investigate their effectiveness on various graph tasks.
We first disassemble the existing models and identify three typical ways of incorporating graph information into the vanilla Transformer.
Our experiments confirm the benefits of current graph-specific modules on Transformer and reveal their advantages on different kinds of graph tasks.
arXiv Detail & Related papers (2022-02-17T06:02:06Z)
- Do Transformers Really Perform Bad for Graph Representation? [62.68420868623308]
We present Graphormer, which is built upon the standard Transformer architecture.
Our key insight for applying Transformers to graphs is the necessity of effectively encoding a graph's structural information into the model.
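Graphormer's spatial encoding, for example, adds a learned scalar bias indexed by shortest-path distance to the attention logits. A small sketch follows; the embedding table would be a trained module in practice, and the shapes and distance clamp are illustrative.

```python
import torch
import torch.nn as nn

def spatial_bias_sketch(spd, max_dist=8):
    """Learned attention bias indexed by shortest-path distance (SPD).

    spd : (n, n) LongTensor of pairwise shortest-path distances,
    assumed precomputed (e.g., by BFS on the input graph).
    """
    table = nn.Embedding(max_dist + 1, 1)    # one learnable bias per distance
    d = spd.clamp(max=max_dist).long()       # bucket far-apart pairs together
    return table(d).squeeze(-1)              # (n, n) bias added to attn logits
```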
arXiv Detail & Related papers (2021-06-09T17:18:52Z)
- Dirichlet Graph Variational Autoencoder [65.94744123832338]
We present Dirichlet Graph Variational Autoencoder (DGVAE) with graph cluster memberships as latent factors.
Motivated by the low-pass characteristics of balanced graph cut, we propose Heatts, a new GNN variant that encodes the input graph into cluster memberships.
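The low-pass intuition can be sketched with a truncated Taylor expansion of the heat kernel e^{-tL} applied to node features; whether this matches Heatts' exact filter is an assumption.

```python
import numpy as np

def heat_kernel_smooth(A, X, t=1.0, terms=5):
    """Apply a truncated Taylor expansion of exp(-tL) to node features X.

    A low-pass filter: high-frequency (large-eigenvalue) components of X
    are damped, which suits cluster-membership style encodings.
    """
    deg = np.clip(A.sum(axis=1), 1, None)
    d = 1.0 / np.sqrt(deg)
    L = np.eye(A.shape[0]) - d[:, None] * A * d[None, :]  # normalized Laplacian
    out, term = X.copy(), X.copy()
    for k in range(1, terms):
        term = (-t / k) * (L @ term)         # k-th Taylor term of exp(-tL) X
        out = out + term
    return out
```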
arXiv Detail & Related papers (2020-10-09T07:35:26Z)