Related papers: GraphWorld: Fake Graphs Bring Real Insights for GNNs

GraphWorld: Fake Graphs Bring Real Insights for GNNs

URL: http://arxiv.org/abs/2203.00112v1
Date: Mon, 28 Feb 2022 22:00:02 GMT
Title: GraphWorld: Fake Graphs Bring Real Insights for GNNs
Authors: John Palowitch, Anton Tsitsulin, Brandon Mayer, Bryan Perozzi
Abstract summary: GraphWorld allows a user to efficiently generate a world with millions of statistically diverse datasets. We present insights from GraphWorld experiments regarding the performance characteristics of tens of thousands of GNN models over millions of benchmark datasets.
Score: 4.856486822139849
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Despite advances in the field of Graph Neural Networks (GNNs), only a small number (~5) of datasets are currently used to evaluate new models. This continued reliance on a handful of datasets provides minimal insight into the performance differences between models, and is especially challenging for industrial practitioners who are likely to have datasets which look very different from those used as academic benchmarks. In the course of our work on GNN infrastructure and open-source software at Google, we have sought to develop improved benchmarks that are robust, tunable, scalable,and generalizable. In this work we introduce GraphWorld, a novel methodology and system for benchmarking GNN models on an arbitrarily-large population of synthetic graphs for any conceivable GNN task. GraphWorld allows a user to efficiently generate a world with millions of statistically diverse datasets. It is accessible, scalable, and easy to use. GraphWorld can be run on a single machine without specialized hardware, or it can be easily scaled up to run on arbitrary clusters or cloud frameworks. Using GraphWorld, a user has fine-grained control over graph generator parameters, and can benchmark arbitrary GNN models with built-in hyperparameter tuning. We present insights from GraphWorld experiments regarding the performance characteristics of tens of thousands of GNN models over millions of benchmark datasets. We further show that GraphWorld efficiently explores regions of benchmark dataset space uncovered by standard benchmarks, revealing comparisons between models that have not been historically obtainable. Using GraphWorld, we also are able to study in-detail the relationship between graph properties and task performance metrics, which is nearly impossible with the classic collection of real-world benchmarks.

Related papers

Revisiting Graph Neural Networks on Graph-level Tasks: Comprehensive Experiments, Analysis, and Improvements [54.006506479865344]
We propose a unified evaluation framework for graph-level Graph Neural Networks (GNNs) This framework provides a standardized setting to evaluate GNNs across diverse datasets. We also propose a novel GNN model with enhanced expressivity and generalization capabilities.
arXiv Detail & Related papers (2025-01-01T08:48:53Z)
DA-MoE: Addressing Depth-Sensitivity in Graph-Level Analysis through Mixture of Experts [70.21017141742763]
Graph neural networks (GNNs) are gaining popularity for processing graph-structured data. Existing methods generally use a fixed number of GNN layers to generate representations for all graphs. We propose the depth adaptive mixture of expert (DA-MoE) method, which incorporates two main improvements to GNN.
arXiv Detail & Related papers (2024-11-05T11:46:27Z)
GraphStorm: all-in-one graph machine learning framework for industry applications [75.23076561638348]
GraphStorm is an end-to-end solution for scalable graph construction, graph model training and inference. Every component in GraphStorm can operate on graphs with billions of nodes and can scale model training and inference to different hardware without changing any code. GraphStorm has been used and deployed for over a dozen billion-scale industry applications after its release in May 2023.
arXiv Detail & Related papers (2024-06-10T04:56:16Z)
Examining the Effects of Degree Distribution and Homophily in Graph Learning Models [19.060710813929354]
GraphWorld is a solution which generates diverse populations of synthetic graphs for benchmarking any GNN task. Despite its success, the SBM imposed fundamental limitations on the kinds of graph structure GraphWorld could create. In this work we examine how two additional synthetic graph generators can improve GraphWorld's evaluation.
arXiv Detail & Related papers (2023-07-17T22:35:46Z)
Graphtester: Exploring Theoretical Boundaries of GNNs on Graph Datasets [10.590698823137755]
We provide a new tool called Graphtester for a comprehensive analysis of the theoretical capabilities of GNNs for various datasets, tasks, and scores. We use Graphtester to analyze over 40 different graph datasets, determining upper bounds on the performance of various GNNs based on the number of layers. We show that the tool can also be used for Graph Transformers using positional node encodings, thereby expanding its scope.
arXiv Detail & Related papers (2023-06-30T08:53:23Z)
Graph Generative Model for Benchmarking Graph Neural Networks [73.11514658000547]
We introduce a novel graph generative model that learns and reproduces the distribution of real-world graphs in a privacy-controlled way. Our model can successfully generate privacy-controlled, synthetic substitutes of large-scale real-world graphs that can be effectively used to benchmark GNN models.
arXiv Detail & Related papers (2022-07-10T06:42:02Z)
NAS-Bench-Graph: Benchmarking Graph Neural Architecture Search [55.75621026447599]
We propose NAS-Bench-Graph, a tailored benchmark that supports unified, reproducible, and efficient evaluations for GraphNAS. Specifically, we construct a unified, expressive yet compact search space, covering 26,206 unique graph neural network (GNN) architectures. Based on our proposed benchmark, the performance of GNN architectures can be directly obtained by a look-up table without any further computation.
arXiv Detail & Related papers (2022-06-18T10:17:15Z)
Node Feature Extraction by Self-Supervised Multi-scale Neighborhood Prediction [123.20238648121445]
We propose a new self-supervised learning framework, Graph Information Aided Node feature exTraction (GIANT) GIANT makes use of the eXtreme Multi-label Classification (XMC) formalism, which is crucial for fine-tuning the language model based on graph information. We demonstrate the superior performance of GIANT over the standard GNN pipeline on Open Graph Benchmark datasets.
arXiv Detail & Related papers (2021-10-29T19:55:12Z)
Scalable Graph Neural Networks for Heterogeneous Graphs [12.44278942365518]
Graph neural networks (GNNs) are a popular class of parametric model for learning over graph-structured data. Recent work has argued that GNNs primarily use the graph for feature smoothing, and have shown competitive results on benchmark tasks. In this work, we ask whether these results can be extended to heterogeneous graphs, which encode multiple types of relationship between different entities.
arXiv Detail & Related papers (2020-11-19T06:03:35Z)
Learning to Drop: Robust Graph Neural Network via Topological Denoising [50.81722989898142]
We propose PTDNet, a parameterized topological denoising network, to improve the robustness and generalization performance of Graph Neural Networks (GNNs) PTDNet prunes task-irrelevant edges by penalizing the number of edges in the sparsified graph with parameterized networks. We show that PTDNet can improve the performance of GNNs significantly and the performance gain becomes larger for more noisy datasets.
arXiv Detail & Related papers (2020-11-13T18:53:21Z)

This list is automatically generated from the titles and abstracts of the papers in this site.