GraphWorld: Fake Graphs Bring Real Insights for GNNs
- URL: http://arxiv.org/abs/2203.00112v1
- Date: Mon, 28 Feb 2022 22:00:02 GMT
- Title: GraphWorld: Fake Graphs Bring Real Insights for GNNs
- Authors: John Palowitch, Anton Tsitsulin, Brandon Mayer, Bryan Perozzi
- Abstract summary: GraphWorld allows a user to efficiently generate a world with millions of statistically diverse datasets.
We present insights from GraphWorld experiments regarding the performance characteristics of tens of thousands of GNN models over millions of benchmark datasets.
- Score: 4.856486822139849
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite advances in the field of Graph Neural Networks (GNNs), only a small
number (~5) of datasets are currently used to evaluate new models. This
continued reliance on a handful of datasets provides minimal insight into the
performance differences between models, and is especially challenging for
industrial practitioners who are likely to have datasets which look very
different from those used as academic benchmarks. In the course of our work on
GNN infrastructure and open-source software at Google, we have sought to
develop improved benchmarks that are robust, tunable, scalable, and
generalizable. In this work we introduce GraphWorld, a novel methodology and
system for benchmarking GNN models on an arbitrarily large population of
synthetic graphs for any conceivable GNN task. GraphWorld allows a user to
efficiently generate a world with millions of statistically diverse datasets.
It is accessible, scalable, and easy to use. GraphWorld can be run on a single
machine without specialized hardware, or it can be easily scaled up to run on
arbitrary clusters or cloud frameworks. Using GraphWorld, a user has
fine-grained control over graph generator parameters, and can benchmark
arbitrary GNN models with built-in hyperparameter tuning. We present insights
from GraphWorld experiments regarding the performance characteristics of tens
of thousands of GNN models over millions of benchmark datasets. We further show
that GraphWorld efficiently explores regions of benchmark dataset space
uncovered by standard benchmarks, revealing comparisons between models that
have not been historically obtainable. Using GraphWorld, we are also able to
study in detail the relationship between graph properties and task performance
metrics, which is nearly impossible with the classic collection of real-world
benchmarks.
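The abstract's core idea, sampling many graphs from a parameterized generator and recording their properties, can be sketched roughly as follows. This is a minimal illustration and not GraphWorld's actual code or API: it assumes a stochastic block model generator (the related-papers list below mentions the SBM in connection with GraphWorld), and every function name, parameter name, and parameter range here is hypothetical.

```python
import random


def sample_sbm(n_nodes, n_blocks, p_in, p_out, rng):
    """Sample an undirected stochastic block model graph as an edge list."""
    # Assign each node to a block uniformly at random.
    blocks = [rng.randrange(n_blocks) for _ in range(n_nodes)]
    edges = []
    for u in range(n_nodes):
        for v in range(u + 1, n_nodes):
            # Same-block pairs connect with p_in, cross-block pairs with p_out.
            p = p_in if blocks[u] == blocks[v] else p_out
            if rng.random() < p:
                edges.append((u, v))
    return blocks, edges


def edge_homophily(blocks, edges):
    """Fraction of edges whose endpoints share a block label."""
    if not edges:
        return 0.0
    same = sum(1 for u, v in edges if blocks[u] == blocks[v])
    return same / len(edges)


def sample_world(n_graphs, seed=0):
    """Draw generator parameters at random and sample one graph per draw."""
    rng = random.Random(seed)
    world = []
    for _ in range(n_graphs):
        # Hypothetical parameter ranges, chosen only to keep the example small.
        params = {
            "n_nodes": rng.randint(30, 60),
            "n_blocks": rng.randint(2, 4),
            "p_in": rng.uniform(0.1, 0.5),
            "p_out": rng.uniform(0.01, 0.1),
        }
        blocks, edges = sample_sbm(rng=rng, **params)
        world.append({
            "params": params,
            "blocks": blocks,
            "edges": edges,
            "homophily": edge_homophily(blocks, edges),
        })
    return world
```

Each entry pairs the sampled generator parameters with a measured graph property (here, edge homophily), which is the kind of record one would need in order to relate graph properties to model performance once a GNN is trained on each graph.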
Related papers
- DA-MoE: Addressing Depth-Sensitivity in Graph-Level Analysis through Mixture of Experts [70.21017141742763]
Graph neural networks (GNNs) are gaining popularity for processing graph-structured data.
Existing methods generally use a fixed number of GNN layers to generate representations for all graphs.
We propose the depth adaptive mixture of expert (DA-MoE) method, which incorporates two main improvements to GNN.
arXiv Detail & Related papers (2024-11-05T11:46:27Z)
- GraphStorm: all-in-one graph machine learning framework for industry applications [75.23076561638348]
GraphStorm is an end-to-end solution for scalable graph construction, graph model training and inference.
Every component in GraphStorm can operate on graphs with billions of nodes and can scale model training and inference to different hardware without changing any code.
GraphStorm has been used and deployed for over a dozen billion-scale industry applications after its release in May 2023.
arXiv Detail & Related papers (2024-06-10T04:56:16Z)
- Examining the Effects of Degree Distribution and Homophily in Graph Learning Models [19.060710813929354]
GraphWorld is a solution which generates diverse populations of synthetic graphs for benchmarking any GNN task.
Despite its success, the SBM imposed fundamental limitations on the kinds of graph structure GraphWorld could create.
In this work we examine how two additional synthetic graph generators can improve GraphWorld's evaluation.
arXiv Detail & Related papers (2023-07-17T22:35:46Z)
- Graphtester: Exploring Theoretical Boundaries of GNNs on Graph Datasets [10.590698823137755]
We provide a new tool called Graphtester for a comprehensive analysis of the theoretical capabilities of GNNs for various datasets, tasks, and scores.
We use Graphtester to analyze over 40 different graph datasets, determining upper bounds on the performance of various GNNs based on the number of layers.
We show that the tool can also be used for Graph Transformers using positional node encodings, thereby expanding its scope.
arXiv Detail & Related papers (2023-06-30T08:53:23Z)
- Graph Generative Model for Benchmarking Graph Neural Networks [73.11514658000547]
We introduce a novel graph generative model that learns and reproduces the distribution of real-world graphs in a privacy-controlled way.
Our model can successfully generate privacy-controlled, synthetic substitutes of large-scale real-world graphs that can be effectively used to benchmark GNN models.
arXiv Detail & Related papers (2022-07-10T06:42:02Z)
- NAS-Bench-Graph: Benchmarking Graph Neural Architecture Search [55.75621026447599]
We propose NAS-Bench-Graph, a tailored benchmark that supports unified, reproducible, and efficient evaluations for GraphNAS.
Specifically, we construct a unified, expressive yet compact search space, covering 26,206 unique graph neural network (GNN) architectures.
Based on our proposed benchmark, the performance of GNN architectures can be directly obtained by a look-up table without any further computation.
arXiv Detail & Related papers (2022-06-18T10:17:15Z)
- Node Feature Extraction by Self-Supervised Multi-scale Neighborhood Prediction [123.20238648121445]
We propose a new self-supervised learning framework, Graph Information Aided Node feature exTraction (GIANT).
GIANT makes use of the eXtreme Multi-label Classification (XMC) formalism, which is crucial for fine-tuning the language model based on graph information.
We demonstrate the superior performance of GIANT over the standard GNN pipeline on Open Graph Benchmark datasets.
arXiv Detail & Related papers (2021-10-29T19:55:12Z)
- Scalable Graph Neural Networks for Heterogeneous Graphs [12.44278942365518]
Graph neural networks (GNNs) are a popular class of parametric model for learning over graph-structured data.
Recent work has argued that GNNs primarily use the graph for feature smoothing, and have shown competitive results on benchmark tasks.
In this work, we ask whether these results can be extended to heterogeneous graphs, which encode multiple types of relationship between different entities.
arXiv Detail & Related papers (2020-11-19T06:03:35Z)
- Learning to Drop: Robust Graph Neural Network via Topological Denoising [50.81722989898142]
We propose PTDNet, a parameterized topological denoising network, to improve the robustness and generalization performance of Graph Neural Networks (GNNs).
PTDNet prunes task-irrelevant edges by penalizing the number of edges in the sparsified graph with parameterized networks.
We show that PTDNet can improve the performance of GNNs significantly and the performance gain becomes larger for more noisy datasets.
arXiv Detail & Related papers (2020-11-13T18:53:21Z)