Examining the Effects of Degree Distribution and Homophily in Graph
Learning Models
- URL: http://arxiv.org/abs/2307.08881v1
- Date: Mon, 17 Jul 2023 22:35:46 GMT
- Authors: Mustafa Yasir, John Palowitch, Anton Tsitsulin, Long Tran-Thanh, Bryan
Perozzi
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Despite a surge in interest in GNN development, homogeneity in benchmarking
datasets still presents a fundamental issue to GNN research. GraphWorld is a
recent solution which uses the Stochastic Block Model (SBM) to generate diverse
populations of synthetic graphs for benchmarking any GNN task. Despite its
success, the SBM imposed fundamental limitations on the kinds of graph
structure GraphWorld could create.
In this work we examine how two additional synthetic graph generators can
improve GraphWorld's evaluation: LFR, a well-established model in the graph
clustering literature, and CABAM, a recent adaptation of the Barabási-Albert
model tailored for GNN benchmarking. By integrating these generators, we
significantly expand the coverage of graph space within the GraphWorld
framework while preserving key graph properties observed in real-world
networks. To demonstrate their effectiveness, we generate 300,000 graphs to
benchmark 11 GNN models on a node classification task. We find GNN performance
variations in response to homophily, degree distribution and feature signal.
Based on these findings, we classify models by their sensitivity to the new
generators under these properties. Additionally, we release the extensions made
to GraphWorld on the GitHub repository, offering further evaluation of GNN
performance on new graphs.
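The generator families discussed above can be illustrated with a short sketch. The snippet below is not GraphWorld's implementation; it uses networkx's stochastic block model as a stand-in, together with an illustrative `edge_homophily` helper, to show how the within- vs. cross-block edge probabilities control the homophily that the benchmarked GNNs respond to.

```python
# Illustrative sketch only: networkx's SBM as a stand-in for the
# GraphWorld-style generators described in the abstract.
import networkx as nx

def edge_homophily(G, label):
    """Fraction of edges whose endpoints share the same node label."""
    same = sum(1 for u, v in G.edges() if G.nodes[u][label] == G.nodes[v][label])
    return same / G.number_of_edges()

# Two communities of 50 nodes: dense within blocks, sparse across them,
# so the resulting graph is homophilous by construction.
sizes = [50, 50]
probs = [[0.10, 0.01],
         [0.01, 0.10]]
sbm = nx.stochastic_block_model(sizes, probs, seed=0)

# nx.stochastic_block_model stores each node's community in the "block" attribute.
h = edge_homophily(sbm, "block")
print(f"nodes={sbm.number_of_nodes()} edges={sbm.number_of_edges()} homophily={h:.2f}")
```

Lowering the diagonal of `probs` toward the off-diagonal values sweeps the graph from homophilous to heterophilous, which is the kind of axis along which the paper's 300,000-graph benchmark varies.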
Related papers
- Graph Structure Prompt Learning: A Novel Methodology to Improve Performance of Graph Neural Networks [13.655670509818144]
We propose a novel Graph structure Prompt Learning method (GPL) to enhance the training of Graph Neural Networks (GNNs).
GPL employs task-independent graph structure losses to encourage GNNs to learn intrinsic graph characteristics while simultaneously solving downstream tasks.
In experiments on eleven real-world datasets, GNNs trained with GPL significantly outperform their original performance on node classification, graph classification, and edge-level tasks.
arXiv Detail & Related papers (2024-07-16T03:59:18Z)
- Spectral Greedy Coresets for Graph Neural Networks [61.24300262316091]
The ubiquity of large-scale graphs in node-classification tasks hinders the real-world applications of Graph Neural Networks (GNNs).
This paper studies graph coresets for GNNs and avoids the interdependence issue by selecting ego-graphs based on their spectral embeddings.
Our spectral greedy graph coreset (SGGC) scales to graphs with millions of nodes, obviates the need for model pre-training, and applies to low-homophily graphs.
arXiv Detail & Related papers (2024-05-27T17:52:12Z)
- Graph Generative Model for Benchmarking Graph Neural Networks [73.11514658000547]
We introduce a novel graph generative model that learns and reproduces the distribution of real-world graphs in a privacy-controlled way.
Our model can successfully generate privacy-controlled, synthetic substitutes of large-scale real-world graphs that can be effectively used to benchmark GNN models.
arXiv Detail & Related papers (2022-07-10T06:42:02Z)
- GraphWorld: Fake Graphs Bring Real Insights for GNNs [4.856486822139849]
GraphWorld allows a user to efficiently generate a world with millions of statistically diverse datasets.
We present insights from GraphWorld experiments regarding the performance characteristics of tens of thousands of GNN models over millions of benchmark datasets.
arXiv Detail & Related papers (2022-02-28T22:00:02Z)
- Graph Neural Networks for Graphs with Heterophily: A Survey [98.45621222357397]
We provide a comprehensive review of graph neural networks (GNNs) for heterophilic graphs.
Specifically, we propose a systematic taxonomy that essentially governs existing heterophilic GNN models.
We discuss the correlation between graph heterophily and various graph research domains, aiming to facilitate the development of more effective GNNs.
arXiv Detail & Related papers (2022-02-14T23:07:47Z)
- Deep Graph-level Anomaly Detection by Glocal Knowledge Distillation [61.39364567221311]
Graph-level anomaly detection (GAD) describes the problem of detecting graphs that are abnormal in their structure and/or the features of their nodes.
One of the challenges in GAD is to devise graph representations that enable the detection of both locally- and globally-anomalous graphs.
We introduce a novel deep anomaly detection approach for GAD that learns rich global and local normal pattern information by joint random distillation of graph and node representations.
arXiv Detail & Related papers (2021-12-19T05:04:53Z)
- Imbalanced Graph Classification via Graph-of-Graph Neural Networks [16.589373163769853]
Graph Neural Networks (GNNs) have achieved unprecedented success in learning graph representations to identify categorical labels of graphs.
We introduce a novel framework, Graph-of-Graph Neural Networks (G$2$GNN), which alleviates the graph imbalance issue by deriving extra supervision globally from neighboring graphs and locally from graphs themselves.
Our proposed G$2$GNN outperforms numerous baselines by roughly 5% in both F1-macro and F1-micro scores.
arXiv Detail & Related papers (2021-12-01T02:25:47Z)
- Node Feature Extraction by Self-Supervised Multi-scale Neighborhood Prediction [123.20238648121445]
We propose a new self-supervised learning framework, Graph Information Aided Node feature exTraction (GIANT).
GIANT makes use of the eXtreme Multi-label Classification (XMC) formalism, which is crucial for fine-tuning the language model based on graph information.
We demonstrate the superior performance of GIANT over the standard GNN pipeline on Open Graph Benchmark datasets.
arXiv Detail & Related papers (2021-10-29T19:55:12Z)
- Beyond Low-Pass Filters: Adaptive Feature Propagation on Graphs [6.018995094882323]
Graph neural networks (GNNs) have been extensively studied for prediction tasks on graphs.
Most GNNs assume local homophily, i.e., strong similarities in local neighborhoods.
We propose a flexible GNN model, which is capable of handling any graphs without being restricted by their underlying homophily.
arXiv Detail & Related papers (2021-03-26T00:35:36Z)
- Learning to Drop: Robust Graph Neural Network via Topological Denoising [50.81722989898142]
We propose PTDNet, a parameterized topological denoising network, to improve the robustness and generalization performance of Graph Neural Networks (GNNs).
PTDNet prunes task-irrelevant edges by penalizing the number of edges in the sparsified graph with parameterized networks.
We show that PTDNet can improve the performance of GNNs significantly and the performance gain becomes larger for more noisy datasets.
arXiv Detail & Related papers (2020-11-13T18:53:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.