Graph Masked Autoencoder
- URL: http://arxiv.org/abs/2202.08391v1
- Date: Thu, 17 Feb 2022 01:04:32 GMT
- Title: Graph Masked Autoencoder
- Authors: Hongxu Chen, Sixiao Zhang, Guandong Xu
- Abstract summary: We propose Graph Masked Autoencoders (GMAE), a self-supervised model for learning graph representations.
GMAE takes partially masked graphs as input, and reconstructs the features of the masked nodes.
We show that, compared with training from scratch, the graph transformer pre-trained using GMAE can achieve much better performance after fine-tuning.
- Score: 19.080326439575916
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transformers have achieved state-of-the-art performance in learning graph
representations. However, challenges remain when applying transformers to
real-world scenarios: deep transformers are hard to train from scratch, and
their memory consumption is large. To address these two challenges, we propose
Graph Masked Autoencoders (GMAE), a self-supervised model for learning graph
representations, in which vanilla graph transformers serve as both the encoder
and the decoder. GMAE takes partially masked graphs as input and reconstructs
the features of the masked nodes. We adopt an asymmetric encoder-decoder
design, where the encoder is a deep graph transformer and the decoder is a
shallow graph transformer. The masking mechanism and the asymmetric design
make GMAE a memory-efficient model compared with conventional transformers. We
show that, compared with training from scratch, a graph transformer
pre-trained with GMAE achieves much better performance after fine-tuning. We
also show that, when serving as a conventional self-supervised graph
representation model with an SVM as the downstream graph classifier, GMAE
achieves state-of-the-art performance on 5 of the 7 benchmark datasets.
Related papers
- Learning Graph Quantized Tokenizers for Transformers [28.79505338383552]
Graph Transformers (GTs) have emerged as a leading model in deep learning, outperforming Graph Neural Networks (GNNs) in various graph learning tasks.
We introduce GQT (Graph Quantized Tokenizer), which decouples tokenizer training from Transformer training by leveraging graph self-supervised learning.
By combining the GQT with token modulation, a Transformer encoder achieves state-of-the-art performance on 16 out of 18 benchmarks, including large-scale homophilic and heterophilic datasets.
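The summary does not detail how the tokenizer is built, but a quantized tokenizer generally maps continuous node embeddings to a discrete vocabulary. A generic vector-quantization lookup, sketched below with assumed shapes, conveys the idea:

```python
import torch

def quantize_nodes(node_emb, codebook):
    """Generic vector-quantization lookup: assign each node embedding to
    its nearest codeword, yielding discrete tokens for a Transformer."""
    # node_emb: (num_nodes, dim); codebook: (vocab_size, dim)
    dists = torch.cdist(node_emb, codebook)  # pairwise L2 distances
    token_ids = dists.argmin(dim=1)          # (num_nodes,) discrete tokens
    quantized = codebook[token_ids]          # embeddings the Transformer consumes
    return token_ids, quantized
```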
arXiv Detail & Related papers (2024-10-17T17:38:24Z)
- Graph as Point Set [31.448841287258116]
This paper introduces a novel graph-to-set conversion method that transforms interconnected nodes into a set of independent points.
It enables using set encoders to learn from graphs, thereby significantly expanding the design space of Graph Neural Networks.
To demonstrate the effectiveness of our approach, we introduce Point Set Transformer (PST), a transformer architecture that accepts a point set converted from a graph as input.
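The summary leaves the exact conversion unspecified; one way to realize a graph-to-set mapping is to give each node structural coordinates derived from the adjacency matrix, so an order-invariant set encoder can still recover the connectivity. A hedged sketch:

```python
import torch

def graph_to_points(adj, feat):
    """Turn a graph into an unordered point set: append spectral
    coordinates from the adjacency eigendecomposition to each node's
    features, so pairwise inner products of the coordinates carry the
    connectivity."""
    # adj: (n, n) symmetric adjacency; feat: (n, d) node features
    evals, evecs = torch.linalg.eigh(adj)    # adj = U diag(evals) U^T
    coords = evecs * evals.abs().sqrt()      # (n, n) structural coordinates
    return torch.cat([feat, coords], dim=1)  # one independent point per node
```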
arXiv Detail & Related papers (2024-05-05T02:29:41Z)
- SGFormer: Simplifying and Empowering Transformers for Large-Graph Representations [75.71298846760303]
We show that a single attention layer can achieve surprisingly competitive performance across node property prediction benchmarks.
We frame the proposed scheme as Simplified Graph Transformers (SGFormer), which is empowered by a simple attention model.
We believe the proposed methodology alone enlightens a new technical path of independent interest for building Transformers on large graphs.
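As a sketch of what a single "simple attention" layer over all nodes can look like (this softmax-free linear form is an illustrative stand-in, not necessarily the paper's exact operator):

```python
import torch

def simple_global_attention(x):
    """One softmax-free global attention layer over all nodes: cost is
    O(n * d^2) rather than O(n^2 * d), since keys and values are
    aggregated before being matched against queries."""
    # x: (n, d) node features
    q = x / x.norm(dim=1, keepdim=True).clamp(min=1e-6)
    kv = q.T @ x                 # (d, d) summary of all nodes
    return (q @ kv) / x.size(0)  # (n, d) updated node states
```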
arXiv Detail & Related papers (2023-06-19T08:03:25Z)
- AGFormer: Efficient Graph Representation with Anchor-Graph Transformer [95.1825252182316]
We propose a novel graph Transformer architecture, termed Anchor Graph Transformer (AGFormer).
AGFormer first obtains some representative anchors and then converts node-to-node message passing into an anchor-to-anchor and anchor-to-node message-passing process.
Extensive experiments on several benchmark datasets demonstrate the effectiveness and benefits of the proposed AGFormer, as the sketch below illustrates.
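The two-stage message-passing pipeline can be sketched as follows; the node-to-anchor assignment matrix is assumed to be given, since the summary does not say how anchors are selected:

```python
import torch

def anchor_message_passing(x, assign):
    """Node-to-anchor pooling, anchor-to-anchor attention, then
    anchor-to-node broadcast, as the summary describes."""
    # x: (n, d) node states; assign: (n, m) node-to-anchor weights
    anchors = assign.T @ x  # pool n nodes into m anchors
    attn = torch.softmax(anchors @ anchors.T / x.size(1) ** 0.5, dim=-1)
    anchors = attn @ anchors  # anchor-to-anchor message passing
    return assign @ anchors   # broadcast anchors back to nodes
```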
arXiv Detail & Related papers (2023-05-12T14:35:42Z)
- Are More Layers Beneficial to Graph Transformers? [97.05661983225603]
Current graph transformers hit a bottleneck: increasing depth fails to improve performance.
Deep graph transformers are limited by the vanishing capacity of global attention.
We propose a novel graph transformer model named DeepGraph that explicitly employs substructure tokens in the encoded representation.
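A minimal sketch of injecting substructure tokens into a transformer's input sequence, assuming substructures are given as node-index sets and pooled by averaging (both assumptions, as the summary omits these details):

```python
import torch

def add_substructure_tokens(x, substructures):
    """Append one pooled token per substructure to the node-token
    sequence, making substructures explicit in the representation."""
    # x: (n, d) node tokens; substructures: list of node-index tensors
    extra = torch.stack([x[idx].mean(dim=0) for idx in substructures])
    return torch.cat([x, extra], dim=0)  # (n + num_substructures, d)
```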
arXiv Detail & Related papers (2023-03-01T15:22:40Z)
- What Dense Graph Do You Need for Self-Attention? [73.82686008622596]
We present Hypercube Transformer, a sparse Transformer that models token interactions in a hypercube and shows comparable or even better results with vanilla Transformer.
Experiments on tasks with various sequence lengths validate the effectiveness of our graph function.
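The hypercube interaction pattern itself is easy to sketch: token i attends to token j exactly when their indices differ in one bit. A mask built this way (self-attention kept by assumption) looks like:

```python
import torch

def hypercube_mask(num_bits):
    """Boolean attention mask in which token i may attend to token j
    iff their indices differ in exactly one bit (hypercube neighbours)."""
    n = 1 << num_bits                               # 2**num_bits tokens
    idx = torch.arange(n)
    xor = idx[:, None] ^ idx[None, :]               # pairwise XOR of indices
    one_bit = ((xor & (xor - 1)) == 0) & (xor > 0)  # exactly one differing bit
    return one_bit | torch.eye(n, dtype=torch.bool) # keep self-attention
```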
arXiv Detail & Related papers (2022-05-27T14:36:55Z)
- Gophormer: Ego-Graph Transformer for Node Classification [27.491500255498845]
In this paper, we propose a novel Gophormer model, which applies transformers on ego-graphs instead of full graphs.
Specifically, a Node2Seq module is proposed to sample ego-graphs as the input of transformers, which alleviates the challenge of scalability.
In order to handle the uncertainty introduced by the ego-graph sampling, we propose a consistency regularization and a multi-sample inference strategy.
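A minimal sketch of Node2Seq-style sampling, assuming one-hop uniform sampling over an adjacency list (the summary does not specify the sampling scheme); drawing several samples per node supports the multi-sample inference strategy:

```python
import random

def node2seq(center, adj_list, num_neighbors=8, num_samples=4):
    """Sample several ego-graphs around a centre node and serialize
    each one as a token sequence for a transformer."""
    seqs = []
    for _ in range(num_samples):
        neigh = adj_list[center]  # one-hop neighbours of the centre node
        sampled = random.sample(neigh, min(num_neighbors, len(neigh)))
        seqs.append([center] + sampled)  # sequence of node ids
    return seqs
```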
arXiv Detail & Related papers (2021-10-25T16:43:32Z)
- Do Transformers Really Perform Bad for Graph Representation? [62.68420868623308]
We present Graphormer, which is built upon the standard Transformer architecture.
Our key insight for utilizing Transformers on graphs is the necessity of effectively encoding a graph's structural information into the model.
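Structural information can be injected through, for example, a degree-based centrality embedding added to node features and a shortest-path-distance bias added to attention logits, the kind of signals Graphormer uses; the sketch below follows that pattern with illustrative dimensions:

```python
import torch
import torch.nn as nn

class StructuralEncoding(nn.Module):
    """Degree-based centrality embedding added to node features, plus a
    shortest-path-distance bias added to attention logits."""

    def __init__(self, dim=128, max_degree=64, max_dist=16):
        super().__init__()
        self.degree_emb = nn.Embedding(max_degree, dim)
        self.dist_bias = nn.Embedding(max_dist, 1)

    def forward(self, x, degree, spd):
        # x: (n, dim) features; degree: (n,) ints; spd: (n, n) hop counts
        x = x + self.degree_emb(degree.clamp(max=self.degree_emb.num_embeddings - 1))
        bias = self.dist_bias(spd.clamp(max=self.dist_bias.num_embeddings - 1))
        return x, bias.squeeze(-1)  # bias (n, n) is added to attention logits
```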
arXiv Detail & Related papers (2021-06-09T17:18:52Z)
- Transformer-Based Deep Image Matching for Generalizable Person Re-identification [114.56752624945142]
We investigate the possibility of applying Transformers for image matching and metric learning given pairs of images.
We find that the Vision Transformer (ViT) and the vanilla Transformer with decoders are not adequate for image matching due to their lack of image-to-image attention.
We propose a new simplified decoder, which drops the full attention implementation with the softmax weighting, keeping only the query-key similarity.
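A sketch of scoring an image pair with raw query-key similarity and no softmax weighting, as the summary describes; pooling the similarity map into a single score via per-query maxima is an assumption:

```python
import torch

def match_score(query_tokens, gallery_tokens):
    """Raw query-key similarity with the softmax weighting dropped:
    each query token keeps its best-matching gallery token, and the
    maxima are averaged into one matching score."""
    # query_tokens: (n_q, d); gallery_tokens: (n_g, d)
    sim = query_tokens @ gallery_tokens.T  # cross-image similarities
    return sim.max(dim=1).values.mean()    # scalar image-pair score
```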
arXiv Detail & Related papers (2021-05-30T05:38:33Z)
- Graph-Aware Transformer: Is Attention All Graphs Need? [5.240000443825077]
GRaph-Aware Transformer (GRAT) is the first Transformer-based model that can encode and decode whole graphs in an end-to-end fashion.
GRAT has shown very promising results, including state-of-the-art performance on 4 regression tasks in the QM9 benchmark.
arXiv Detail & Related papers (2020-06-09T12:13:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.