Masked Graph Transformer for Large-Scale Recommendation
- URL: http://arxiv.org/abs/2405.04028v1
- Date: Tue, 07 May 2024 06:00:47 GMT
- Title: Masked Graph Transformer for Large-Scale Recommendation
- Authors: Huiyuan Chen, Zhe Xu, Chin-Chia Michael Yeh, Vivian Lai, Yan Zheng, Minghua Xu, Hanghang Tong
- Abstract summary: We propose an efficient Masked Graph Transformer, named MGFormer, capable of capturing all-pair interactions among nodes with a linear complexity.
Experimental results show the superior performance of our MGFormer, even with a single attention layer.
- Score: 56.37903431721977
- Abstract: Graph Transformers have garnered significant attention for learning graph-structured data, thanks to their superb ability to capture long-range dependencies among nodes. However, the quadratic space and time complexity hinders the scalability of Graph Transformers, particularly for large-scale recommendation. Here we propose an efficient Masked Graph Transformer, named MGFormer, capable of capturing all-pair interactions among nodes with a linear complexity. To achieve this, we treat all user/item nodes as independent tokens, enhance them with positional embeddings, and feed them into a kernelized attention module. Additionally, we incorporate learnable relative degree information to appropriately reweigh the attentions. Experimental results show the superior performance of our MGFormer, even with a single attention layer.
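The abstract names three ingredients: tokenized user/item nodes with positional embeddings, a kernelized attention module with linear complexity, and a learnable degree-based reweighting of the attention. The following is a minimal, hypothetical PyTorch sketch of that kind of kernelized attention; the module name, the ELU+1 feature map, and the log-degree scaling of keys are illustrative assumptions, not the paper's actual formulation.

```python
# Hypothetical sketch of linear (kernelized) attention with a degree-based
# reweighting, loosely following the ideas in the abstract above. Names and
# the exact reweighting scheme are illustrative assumptions, not MGFormer's code.
import torch
import torch.nn as nn


class KernelizedAttention(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.q_proj = nn.Linear(dim, dim)
        self.k_proj = nn.Linear(dim, dim)
        self.v_proj = nn.Linear(dim, dim)
        # Learnable scalar controlling how node degree reweighs attention
        # (assumption: the paper's "relative degree" term is more elaborate).
        self.degree_scale = nn.Parameter(torch.zeros(1))

    @staticmethod
    def feature_map(x: torch.Tensor) -> torch.Tensor:
        # Positive feature map (ELU + 1), common in linear-attention literature.
        return torch.nn.functional.elu(x) + 1.0

    def forward(self, tokens: torch.Tensor, degrees: torch.Tensor) -> torch.Tensor:
        # tokens:  (N, dim) user/item node embeddings (positional info already added)
        # degrees: (N,) node degrees from the interaction graph
        q = self.feature_map(self.q_proj(tokens))            # (N, d)
        k = self.feature_map(self.k_proj(tokens))            # (N, d)
        v = self.v_proj(tokens)                               # (N, d)

        # Degree reweighting: scale keys by a learnable function of log-degree.
        k = k * (1.0 + self.degree_scale * torch.log1p(degrees)).unsqueeze(-1)

        # Linear attention: O(N d^2) instead of O(N^2 d) for all-pair interactions.
        kv = k.transpose(0, 1) @ v                            # (d, d)
        z = q @ k.sum(dim=0, keepdim=True).transpose(0, 1)    # (N, 1) normalizer
        return (q @ kv) / (z + 1e-6)                          # (N, d)
```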
Related papers
- SGFormer: Single-Layer Graph Transformers with Approximation-Free Linear Complexity [74.51827323742506]
We evaluate the necessity of adopting multi-layer attentions in Transformers on graphs.
We show that multi-layer propagation can be reduced to one-layer propagation, with the same capability for representation learning.
It suggests a new technical path for building powerful and efficient Transformers on graphs.
arXiv Detail & Related papers (2024-09-13T17:37:34Z)
- SpikeGraphormer: A High-Performance Graph Transformer with Spiking Graph Attention [1.4126245676224705]
Graph Transformers have emerged as a promising solution to alleviate the inherent limitations of Graph Neural Networks (GNNs).
We propose a novel insight into integrating SNNs with Graph Transformers and design a Spiking Graph Attention (SGA) module.
SpikeGraphormer consistently outperforms existing state-of-the-art approaches across various datasets.
arXiv Detail & Related papers (2024-03-21T03:11:53Z)
- SGFormer: Simplifying and Empowering Transformers for Large-Graph Representations [75.71298846760303]
We show that a one-layer attention can deliver surprisingly competitive performance across node property prediction benchmarks.
We frame the proposed scheme as Simplified Graph Transformers (SGFormer), which is empowered by a simple attention model.
We believe the proposed methodology alone enlightens a new technical path of independent interest for building Transformers on large graphs.
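As a rough illustration of the "one-layer, simple attention" idea summarized above, here is a hypothetical sketch that mixes a single global attention step with one step of graph propagation. The dense softmax attention and the fixed mixing weight are simplifications for clarity, not SGFormer's actual (linear-complexity) design.

```python
# Rough sketch of a single-layer simple attention over all nodes, combined with
# one step of graph propagation, in the spirit of the SGFormer summary above.
# The mixing weight and the dense O(N^2) attention are assumptions.
import torch


def simple_global_attention(x: torch.Tensor) -> torch.Tensor:
    # x: (N, d) node features; one all-pair attention layer, no stacking.
    scores = (x @ x.T) / x.shape[1] ** 0.5
    return torch.softmax(scores, dim=-1) @ x


def sgformer_like_layer(x: torch.Tensor, adj_norm: torch.Tensor, alpha: float = 0.5) -> torch.Tensor:
    # adj_norm: (N, N) normalized adjacency; alpha mixes global attention
    # with local graph propagation (hypothetical fixed weight).
    return alpha * simple_global_attention(x) + (1 - alpha) * (adj_norm @ x)
```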
arXiv Detail & Related papers (2023-06-19T08:03:25Z)
- AGFormer: Efficient Graph Representation with Anchor-Graph Transformer [95.1825252182316]
We propose a novel graph Transformer architecture, termed Anchor Graph Transformer (AGFormer).
AGFormer first obtains a set of representative anchors and then converts node-to-node message passing into anchor-to-anchor and anchor-to-node message passing.
Extensive experiments on several benchmark datasets demonstrate the effectiveness and benefits of proposed AGFormer.
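A minimal sketch of the anchor-based message passing described above: nodes are softly assigned to a small set of anchors, anchors attend to each other, and messages are broadcast back to the nodes. Random anchor selection and the soft-assignment scheme are assumptions for illustration, not AGFormer's exact procedure.

```python
# Minimal sketch of anchor-based message passing: node -> anchor aggregation,
# anchor -> anchor attention, then anchor -> node redistribution.
import torch


def anchor_message_passing(x: torch.Tensor, num_anchors: int = 32) -> torch.Tensor:
    # x: (N, d) node features.
    n, d = x.shape
    # Anchor selection: random nodes here; the paper derives representative anchors.
    idx = torch.randperm(n)[:num_anchors]
    anchors = x[idx]                                           # (m, d)

    # Node -> anchor: soft assignment of every node to every anchor.
    assign = torch.softmax(x @ anchors.T / d ** 0.5, dim=-1)   # (N, m)
    anchors = assign.T @ x                                      # (m, d) aggregated

    # Anchor -> anchor: full attention among the m anchors (m << N).
    aa = torch.softmax(anchors @ anchors.T / d ** 0.5, dim=-1)
    anchors = aa @ anchors                                       # (m, d)

    # Anchor -> node: broadcast anchor messages back through the assignment.
    return assign @ anchors                                      # (N, d)
```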
arXiv Detail & Related papers (2023-05-12T14:35:42Z)
- NAGphormer: Neighborhood Aggregation Graph Transformer for Node Classification in Large Graphs [10.149586598073421]
We propose a Neighborhood Aggregation Graph Transformer (NAGphormer) that is scalable to large graphs with millions of nodes.
NAGphormer constructs tokens for each node by a neighborhood aggregation module called Hop2Token.
We conduct extensive experiments on various popular benchmarks, including six small datasets and three large datasets.
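A short sketch of a Hop2Token-style tokenizer as summarized above: each node receives one token per hop of aggregated neighborhood features, and the resulting per-node token sequences can then be fed to a small Transformer. Normalization choices and the downstream model are omitted and should be treated as assumptions.

```python
# Sketch of a Hop2Token-style module: each node gets a short token sequence,
# one token per hop, built from repeatedly propagated features.
import torch


def hop2token(x: torch.Tensor, adj_norm: torch.Tensor, num_hops: int = 3) -> torch.Tensor:
    # x: (N, d) node features; adj_norm: (N, N) normalized adjacency.
    # Returns (N, num_hops + 1, d): for every node, tokens for hops 0..num_hops.
    tokens = [x]
    h = x
    for _ in range(num_hops):
        h = adj_norm @ h          # aggregate one more hop of neighbors
        tokens.append(h)
    # Each node's token sequence can then be processed by a (small) Transformer.
    return torch.stack(tokens, dim=1)
```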
arXiv Detail & Related papers (2022-06-10T07:23:51Z)
- What Dense Graph Do You Need for Self-Attention? [73.82686008622596]
We present Hypercube Transformer, a sparse Transformer that models token interactions in a hypercube and shows comparable or even better results with vanilla Transformer.
Experiments on tasks requiring various sequence lengths validate that our graph design works well.
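One plausible reading of "token interactions in a hypercube" is a sparse attention mask in which token i may attend to token j when their binary indices differ in exactly one bit (plus itself); the sketch below builds such a mask. This is a speculative illustration, not necessarily the paper's exact construction.

```python
# Speculative sketch of a hypercube-structured attention mask: indices that
# differ in exactly one bit (or are equal) are allowed to interact.
import torch


def hypercube_mask(n_tokens: int) -> torch.Tensor:
    idx = torch.arange(n_tokens)
    xor = idx.unsqueeze(0) ^ idx.unsqueeze(1)        # pairwise XOR of token indices
    one_bit = (xor & (xor - 1)) == 0                  # zero (self) or a power of two
    return one_bit                                    # (n, n) boolean attention mask
```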
arXiv Detail & Related papers (2022-05-27T14:36:55Z)
- MGAE: Masked Autoencoders for Self-Supervised Learning on Graphs [55.66953093401889]
We introduce a masked graph autoencoder (MGAE) framework to perform effective learning on graph-structured data.
Taking insights from self-supervised learning, we randomly mask a large proportion of edges and try to reconstruct these missing edges during training.
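A compact sketch of the masked-edge objective described above: a large fraction of edges is held out, node embeddings are computed from the remaining graph, and a dot-product decoder is trained to recover the held-out edges against sampled negatives. The encoder and the negative-sampling step are placeholders, not MGAE's exact design.

```python
# Sketch of an edge-masking objective: hide a large fraction of edges, encode
# the remaining graph, and train a decoder to recover the masked edges.
import torch


def masked_edge_loss(z: torch.Tensor, masked_edges: torch.Tensor, neg_edges: torch.Tensor) -> torch.Tensor:
    # z: (N, d) node embeddings from an encoder run on the *unmasked* edges.
    # masked_edges / neg_edges: (2, E) index pairs of held-out and negative edges.
    def edge_logits(edges: torch.Tensor) -> torch.Tensor:
        return (z[edges[0]] * z[edges[1]]).sum(dim=-1)        # dot-product decoder
    pos = torch.nn.functional.binary_cross_entropy_with_logits(
        edge_logits(masked_edges), torch.ones(masked_edges.shape[1]))
    neg = torch.nn.functional.binary_cross_entropy_with_logits(
        edge_logits(neg_edges), torch.zeros(neg_edges.shape[1]))
    return pos + neg
```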
arXiv Detail & Related papers (2022-01-07T16:48:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.