Masked Graph Transformer for Large-Scale Recommendation
- URL: http://arxiv.org/abs/2405.04028v1
- Date: Tue, 07 May 2024 06:00:47 GMT
- Title: Masked Graph Transformer for Large-Scale Recommendation
- Authors: Huiyuan Chen, Zhe Xu, Chin-Chia Michael Yeh, Vivian Lai, Yan Zheng, Minghua Xu, Hanghang Tong
- Abstract summary: We propose an efficient Masked Graph Transformer, named MGFormer, capable of capturing all-pair interactions among nodes with a linear complexity.
Experimental results show the superior performance of our MGFormer, even with a single attention layer.
- Score: 56.37903431721977
- Abstract: Graph Transformers have garnered significant attention for learning graph-structured data, thanks to their superb ability to capture long-range dependencies among nodes. However, the quadratic space and time complexity hinders the scalability of Graph Transformers, particularly for large-scale recommendation. Here we propose an efficient Masked Graph Transformer, named MGFormer, capable of capturing all-pair interactions among nodes with a linear complexity. To achieve this, we treat all user/item nodes as independent tokens, enhance them with positional embeddings, and feed them into a kernelized attention module. Additionally, we incorporate learnable relative degree information to appropriately reweigh the attentions. Experimental results show the superior performance of our MGFormer, even with a single attention layer.
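The abstract names three ingredients: tokenized user/item nodes with positional embeddings, a kernelized attention module with linear complexity, and a learnable degree-based reweighting of the attention. The following is a minimal, hypothetical PyTorch sketch of that kind of kernelized attention; the module name, the ELU+1 feature map, and the log-degree scaling of keys are illustrative assumptions, not the paper's actual formulation.

```python
# Hypothetical sketch of linear (kernelized) attention with a degree-based
# reweighting, loosely following the ideas in the abstract above. Names and
# the exact reweighting scheme are illustrative assumptions, not MGFormer's code.
import torch
import torch.nn as nn


class KernelizedAttention(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.q_proj = nn.Linear(dim, dim)
        self.k_proj = nn.Linear(dim, dim)
        self.v_proj = nn.Linear(dim, dim)
        # Learnable scalar controlling how node degree reweighs attention
        # (assumption: the paper's "relative degree" term is more elaborate).
        self.degree_scale = nn.Parameter(torch.zeros(1))

    @staticmethod
    def feature_map(x: torch.Tensor) -> torch.Tensor:
        # Positive feature map (ELU + 1), common in linear-attention literature.
        return torch.nn.functional.elu(x) + 1.0

    def forward(self, tokens: torch.Tensor, degrees: torch.Tensor) -> torch.Tensor:
        # tokens:  (N, dim) user/item node embeddings (positional info already added)
        # degrees: (N,) node degrees from the interaction graph
        q = self.feature_map(self.q_proj(tokens))            # (N, d)
        k = self.feature_map(self.k_proj(tokens))            # (N, d)
        v = self.v_proj(tokens)                               # (N, d)

        # Degree reweighting: scale keys by a learnable function of log-degree.
        k = k * (1.0 + self.degree_scale * torch.log1p(degrees)).unsqueeze(-1)

        # Linear attention: O(N d^2) instead of O(N^2 d) for all-pair interactions.
        kv = k.transpose(0, 1) @ v                            # (d, d)
        z = q @ k.sum(dim=0, keepdim=True).transpose(0, 1)    # (N, 1) normalizer
        return (q @ kv) / (z + 1e-6)                          # (N, d)
```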
Related papers
- SGFormer: Single-Layer Graph Transformers with Approximation-Free Linear Complexity [74.51827323742506]
We evaluate the necessity of adopting multi-layer attentions in Transformers on graphs.
We show that multi-layer propagation can be reduced to one-layer propagation, with the same capability for representation learning.
It suggests a new technical path for building powerful and efficient Transformers on graphs.
arXiv Detail & Related papers (2024-09-13T17:37:34Z)
- SpikeGraphormer: A High-Performance Graph Transformer with Spiking Graph Attention [1.4126245676224705]
Graph Transformers have emerged as a promising solution to alleviate the inherent limitations of Graph Neural Networks (GNNs).
We propose a novel insight into integrating SNNs with Graph Transformers and design a Spiking Graph Attention (SGA) module.
SpikeGraphormer consistently outperforms existing state-of-the-art approaches across various datasets.
arXiv Detail & Related papers (2024-03-21T03:11:53Z)
- SGFormer: Simplifying and Empowering Transformers for Large-Graph Representations [75.71298846760303]
We show that a one-layer attention can deliver surprisingly competitive performance across node property prediction benchmarks.
We frame the proposed scheme as Simplified Graph Transformers (SGFormer), which is empowered by a simple attention model.
We believe the proposed methodology alone enlightens a new technical path of independent interest for building Transformers on large graphs.
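As a rough illustration of the "one-layer, simple attention" idea summarized above, here is a hypothetical sketch that mixes a single global attention step with one step of graph propagation. The dense softmax attention and the fixed mixing weight are simplifications for clarity, not SGFormer's actual (linear-complexity) design.

```python
# Rough sketch of a single-layer simple attention over all nodes, combined with
# one step of graph propagation, in the spirit of the SGFormer summary above.
# The mixing weight and the dense O(N^2) attention are assumptions.
import torch


def simple_global_attention(x: torch.Tensor) -> torch.Tensor:
    # x: (N, d) node features; one all-pair attention layer, no stacking.
    scores = (x @ x.T) / x.shape[1] ** 0.5
    return torch.softmax(scores, dim=-1) @ x


def sgformer_like_layer(x: torch.Tensor, adj_norm: torch.Tensor, alpha: float = 0.5) -> torch.Tensor:
    # adj_norm: (N, N) normalized adjacency; alpha mixes global attention
    # with local graph propagation (hypothetical fixed weight).
    return alpha * simple_global_attention(x) + (1 - alpha) * (adj_norm @ x)
```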
arXiv Detail & Related papers (2023-06-19T08:03:25Z)
- AGFormer: Efficient Graph Representation with Anchor-Graph Transformer [95.1825252182316]
We propose a novel graph Transformer architecture, termed Anchor Graph Transformer (AGFormer).
AGFormer first obtains a set of representative anchors and then converts node-to-node message passing into anchor-to-anchor and anchor-to-node message passing.
Extensive experiments on several benchmark datasets demonstrate the effectiveness and benefits of proposed AGFormer.
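A minimal sketch of the anchor-based message passing described above: nodes are softly assigned to a small set of anchors, anchors attend to each other, and messages are broadcast back to the nodes. Random anchor selection and the soft-assignment scheme are assumptions for illustration, not AGFormer's exact procedure.

```python
# Minimal sketch of anchor-based message passing: node -> anchor aggregation,
# anchor -> anchor attention, then anchor -> node redistribution.
import torch


def anchor_message_passing(x: torch.Tensor, num_anchors: int = 32) -> torch.Tensor:
    # x: (N, d) node features.
    n, d = x.shape
    # Anchor selection: random nodes here; the paper derives representative anchors.
    idx = torch.randperm(n)[:num_anchors]
    anchors = x[idx]                                           # (m, d)

    # Node -> anchor: soft assignment of every node to every anchor.
    assign = torch.softmax(x @ anchors.T / d ** 0.5, dim=-1)   # (N, m)
    anchors = assign.T @ x                                      # (m, d) aggregated

    # Anchor -> anchor: full attention among the m anchors (m << N).
    aa = torch.softmax(anchors @ anchors.T / d ** 0.5, dim=-1)
    anchors = aa @ anchors                                       # (m, d)

    # Anchor -> node: broadcast anchor messages back through the assignment.
    return assign @ anchors                                      # (N, d)
```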
arXiv Detail & Related papers (2023-05-12T14:35:42Z)
- NAGphormer: Neighborhood Aggregation Graph Transformer for Node Classification in Large Graphs [10.149586598073421]
We propose a Neighborhood Aggregation Graph Transformer (NAGphormer) that is scalable to large graphs with millions of nodes.
NAGphormer constructs tokens for each node by a neighborhood aggregation module called Hop2Token.
We conduct extensive experiments on various popular benchmarks, including six small datasets and three large datasets.
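A short sketch of a Hop2Token-style tokenizer as summarized above: each node receives one token per hop of aggregated neighborhood features, and the resulting per-node token sequences can then be fed to a small Transformer. Normalization choices and the downstream model are omitted and should be treated as assumptions.

```python
# Sketch of a Hop2Token-style module: each node gets a short token sequence,
# one token per hop, built from repeatedly propagated features.
import torch


def hop2token(x: torch.Tensor, adj_norm: torch.Tensor, num_hops: int = 3) -> torch.Tensor:
    # x: (N, d) node features; adj_norm: (N, N) normalized adjacency.
    # Returns (N, num_hops + 1, d): for every node, tokens for hops 0..num_hops.
    tokens = [x]
    h = x
    for _ in range(num_hops):
        h = adj_norm @ h          # aggregate one more hop of neighbors
        tokens.append(h)
    # Each node's token sequence can then be processed by a (small) Transformer.
    return torch.stack(tokens, dim=1)
```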
arXiv Detail & Related papers (2022-06-10T07:23:51Z)
- What Dense Graph Do You Need for Self-Attention? [73.82686008622596]
We present Hypercube Transformer, a sparse Transformer that models token interactions in a hypercube and shows comparable or even better results with vanilla Transformer.
Experiments on tasks requiring various sequence lengths validate that our graph design works well.
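One plausible reading of "token interactions in a hypercube" is a sparse attention mask in which token i may attend to token j when their binary indices differ in exactly one bit (plus itself); the sketch below builds such a mask. This is a speculative illustration, not necessarily the paper's exact construction.

```python
# Speculative sketch of a hypercube-structured attention mask: indices that
# differ in exactly one bit (or are equal) are allowed to interact.
import torch


def hypercube_mask(n_tokens: int) -> torch.Tensor:
    idx = torch.arange(n_tokens)
    xor = idx.unsqueeze(0) ^ idx.unsqueeze(1)        # pairwise XOR of token indices
    one_bit = (xor & (xor - 1)) == 0                  # zero (self) or a power of two
    return one_bit                                    # (n, n) boolean attention mask
```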
arXiv Detail & Related papers (2022-05-27T14:36:55Z)
- MGAE: Masked Autoencoders for Self-Supervised Learning on Graphs [55.66953093401889]
We introduce a masked graph autoencoder (MGAE) framework to perform effective learning on graph-structured data.
Taking insights from self-supervised learning, we randomly mask a large proportion of edges and try to reconstruct these missing edges during training.
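A compact sketch of the masked-edge objective described above: a large fraction of edges is held out, node embeddings are computed from the remaining graph, and a dot-product decoder is trained to recover the held-out edges against sampled negatives. The encoder and the negative-sampling step are placeholders, not MGAE's exact design.

```python
# Sketch of an edge-masking objective: hide a large fraction of edges, encode
# the remaining graph, and train a decoder to recover the masked edges.
import torch


def masked_edge_loss(z: torch.Tensor, masked_edges: torch.Tensor, neg_edges: torch.Tensor) -> torch.Tensor:
    # z: (N, d) node embeddings from an encoder run on the *unmasked* edges.
    # masked_edges / neg_edges: (2, E) index pairs of held-out and negative edges.
    def edge_logits(edges: torch.Tensor) -> torch.Tensor:
        return (z[edges[0]] * z[edges[1]]).sum(dim=-1)        # dot-product decoder
    pos = torch.nn.functional.binary_cross_entropy_with_logits(
        edge_logits(masked_edges), torch.ones(masked_edges.shape[1]))
    neg = torch.nn.functional.binary_cross_entropy_with_logits(
        edge_logits(neg_edges), torch.zeros(neg_edges.shape[1]))
    return pos + neg
```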
arXiv Detail & Related papers (2022-01-07T16:48:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.