A Graph is Worth $K$ Words: Euclideanizing Graph using Pure Transformer
- URL: http://arxiv.org/abs/2402.02464v3
- Date: Wed, 29 May 2024 05:40:35 GMT
- Title: A Graph is Worth $K$ Words: Euclideanizing Graph using Pure Transformer
- Authors: Zhangyang Gao, Daize Dong, Cheng Tan, Jun Xia, Bozhen Hu, Stan Z. Li,
- Abstract summary: We introduce GraphsGPT, featuring a Graph2Seq encoder that transforms Non-Euclidean graphs into learnable Graph Words.
A GraphGPT decoder reconstructs the original graph from Graph Words to ensure information equivalence.
- Score: 47.25114679486907
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Can we model Non-Euclidean graphs as pure language or even Euclidean vectors while retaining their inherent information? The Non-Euclidean property have posed a long term challenge in graph modeling. Despite recent graph neural networks and graph transformers efforts encoding graphs as Euclidean vectors, recovering the original graph from vectors remains a challenge. In this paper, we introduce GraphsGPT, featuring an Graph2Seq encoder that transforms Non-Euclidean graphs into learnable Graph Words in the Euclidean space, along with a GraphGPT decoder that reconstructs the original graph from Graph Words to ensure information equivalence. We pretrain GraphsGPT on $100$M molecules and yield some interesting findings: (1) The pretrained Graph2Seq excels in graph representation learning, achieving state-of-the-art results on $8/9$ graph classification and regression tasks. (2) The pretrained GraphGPT serves as a strong graph generator, demonstrated by its strong ability to perform both few-shot and conditional graph generation. (3) Graph2Seq+GraphGPT enables effective graph mixup in the Euclidean space, overcoming previously known Non-Euclidean challenges. (4) The edge-centric pretraining framework GraphsGPT demonstrates its efficacy in graph domain tasks, excelling in both representation and generation. Code is available at \href{https://github.com/A4Bio/GraphsGPT}{GitHub}.
Related papers
- G-Retriever: Retrieval-Augmented Generation for Textual Graph Understanding and Question Answering [61.93058781222079]
We develop a flexible question-answering framework targeting real-world textual graphs.
We introduce the first retrieval-augmented generation (RAG) approach for general textual graphs.
G-Retriever performs RAG over a graph by formulating this task as a Prize-Collecting Steiner Tree optimization problem.
arXiv Detail & Related papers (2024-02-12T13:13:04Z) - An Accurate Graph Generative Model with Tunable Features [0.8192907805418583]
We propose a method to improve the accuracy of GraphTune by adding a new mechanism to feed back errors of graph features.
Experiments on a real-world graph dataset showed that the features in the generated graphs are accurately tuned compared with conventional models.
arXiv Detail & Related papers (2023-09-03T12:34:15Z) - G-Mixup: Graph Data Augmentation for Graph Classification [55.63157775049443]
Mixup has shown superiority in improving the generalization and robustness of neural networks by interpolating features and labels between two random samples.
We propose $mathcalG$-Mixup to augment graphs for graph classification by interpolating the generator (i.e., graphon) of different classes of graphs.
Experiments show that $mathcalG$-Mixup substantially improves the generalization and robustness of GNNs.
arXiv Detail & Related papers (2022-02-15T04:09:44Z) - Inference Attacks Against Graph Neural Networks [33.19531086886817]
Graph embedding is a powerful tool to solve the graph analytics problem.
While sharing graph embedding is intriguing, the associated privacy risks are unexplored.
We systematically investigate the information leakage of the graph embedding by mounting three inference attacks.
arXiv Detail & Related papers (2021-10-06T10:08:11Z) - GraphGen-Redux: a Fast and Lightweight Recurrent Model for labeled Graph
Generation [13.956691231452336]
We present a novel graph preprocessing approach for labeled graph generation.
By introducing a novel graph preprocessing approach, we are able to process the labeling information of both nodes and edges jointly.
The corresponding model, which we term GraphGen-Redux, improves upon the generative performances of GraphGen in a wide range of datasets.
arXiv Detail & Related papers (2021-07-18T09:26:10Z) - Dirichlet Graph Variational Autoencoder [65.94744123832338]
We present Dirichlet Graph Variational Autoencoder (DGVAE) with graph cluster memberships as latent factors.
Motivated by the low pass characteristics in balanced graph cut, we propose a new variant of GNN named Heatts to encode the input graph into cluster memberships.
arXiv Detail & Related papers (2020-10-09T07:35:26Z) - Contrastive Self-supervised Learning for Graph Classification [21.207647143672585]
We propose two approaches based on contrastive self-supervised learning (CSSL) to alleviate overfitting.
In the first approach, we use CSSL to pretrain graph encoders on widely-available unlabeled graphs without relying on human-provided labels.
In the second approach, we develop a regularizer based on CSSL, and solve the supervised classification task and the unsupervised CSSL task simultaneously.
arXiv Detail & Related papers (2020-09-13T05:12:55Z) - Multilevel Graph Matching Networks for Deep Graph Similarity Learning [79.3213351477689]
We propose a multi-level graph matching network (MGMN) framework for computing the graph similarity between any pair of graph-structured objects.
To compensate for the lack of standard benchmark datasets, we have created and collected a set of datasets for both the graph-graph classification and graph-graph regression tasks.
Comprehensive experiments demonstrate that MGMN consistently outperforms state-of-the-art baseline models on both the graph-graph classification and graph-graph regression tasks.
arXiv Detail & Related papers (2020-07-08T19:48:19Z) - Graph Pooling with Node Proximity for Hierarchical Representation
Learning [80.62181998314547]
We propose a novel graph pooling strategy that leverages node proximity to improve the hierarchical representation learning of graph data with their multi-hop topology.
Results show that the proposed graph pooling strategy is able to achieve state-of-the-art performance on a collection of public graph classification benchmark datasets.
arXiv Detail & Related papers (2020-06-19T13:09:44Z) - Line Hypergraph Convolution Network: Applying Graph Convolution for
Hypergraphs [18.7475578342125]
We propose a novel technique to apply graph convolution on hypergraphs with variable hyperedge sizes.
We use the classical concept of line graph of a hypergraph for the first time in the hypergraph learning literature.
arXiv Detail & Related papers (2020-02-09T16:05:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.