The Split Matters: Flat Minima Methods for Improving the Performance of
GNNs
- URL: http://arxiv.org/abs/2306.09121v1
- Date: Thu, 15 Jun 2023 13:29:09 GMT
- Title: The Split Matters: Flat Minima Methods for Improving the Performance of
GNNs
- Authors: Nicolas Lell and Ansgar Scherp
- Abstract summary: We investigate flat minima methods and combinations of those methods for training graph neural networks (GNNs)
We conduct experiments on small and large citation, co-purchase, and protein datasets with different train-test splits.
Results show that flat minima methods can improve the performance of GNN models by over 2 points when the train-test split is randomized.
- Score: 2.9443230571766854
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: When training a Neural Network, it is optimized using the available training
data with the hope that it generalizes well to new or unseen testing data. At
the same loss value, a flat minimum in the loss landscape is presumed to
generalize better than a sharp minimum. Methods for determining flat minima
have mostly been researched for independent and identically distributed
(i.i.d.) data such as images. Graphs are inherently non-i.i.d. since their vertices
are edge-connected. We investigate flat minima methods and combinations of
those methods for training graph neural networks (GNNs). We use GCN and GAT as
well as extend Graph-MLP to work with more layers and larger graphs. We conduct
experiments on small and large citation, co-purchase, and protein datasets with
different train-test splits in both the transductive and inductive training
procedures. Results show that flat minima methods can improve the performance of
GNN models by over 2 points when the train-test split is randomized. Following
Shchur et al., randomized splits are essential for a fair evaluation of GNNs,
as other (fixed) splits like 'Planetoid' are biased. Overall, we provide
important insights for improving and fairly evaluating flat minima methods on
GNNs. We recommend that practitioners always use weight averaging techniques, in
particular EWA when using early stopping. While weight averaging techniques are
only sometimes the best performing method, they are less sensitive to
hyperparameters, need no additional training, and keep the original model
unchanged. All source code is available at
https://github.com/Foisunt/FMMs-in-GNNs.
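The abstract recommends weight averaging, in particular EWA with early stopping, but does not spell out the update rule. The following is a minimal sketch of exponential weight averaging in PyTorch, assuming EWA keeps an exponential moving average of the model parameters and that the averaged copy is the one evaluated for early stopping; the class name, decay value, and training-loop helpers are illustrative and not taken from the paper's repository.

```python
# Hypothetical sketch of exponential weight averaging (EWA) for a GNN.
# The decay value and all names are illustrative assumptions, not the
# paper's actual implementation.
import copy
import torch


class EWA:
    """Maintains an exponential moving average of a model's parameters."""

    def __init__(self, model: torch.nn.Module, decay: float = 0.99):
        self.decay = decay
        # Keep the averaged weights in a separate copy so the original
        # model stays unchanged.
        self.avg_model = copy.deepcopy(model)
        for p in self.avg_model.parameters():
            p.requires_grad_(False)

    @torch.no_grad()
    def update(self, model: torch.nn.Module):
        # avg <- decay * avg + (1 - decay) * current
        # Buffers (e.g., normalization statistics) are not averaged in this sketch.
        for p_avg, p in zip(self.avg_model.parameters(), model.parameters()):
            p_avg.mul_(self.decay).add_(p, alpha=1.0 - self.decay)


# Usage inside a standard training loop (model, optimizer, helpers omitted):
# ewa = EWA(model, decay=0.99)
# for epoch in range(num_epochs):
#     train_one_epoch(model, optimizer)
#     ewa.update(model)
#     val_acc = evaluate(ewa.avg_model)  # early stopping on the averaged weights
```

Because the averaged weights live in a separate copy, the original model stays unchanged and no extra training passes are needed, which matches the properties highlighted in the abstract.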
Related papers
- Classifying Nodes in Graphs without GNNs [50.311528896010785]
We propose a fully GNN-free approach for node classification that does not require GNNs at train or test time.
Our method consists of three key components: smoothness constraints, pseudo-labeling iterations and neighborhood-label histograms.
arXiv Detail & Related papers (2024-02-08T18:59:30Z)
- Efficient Heterogeneous Graph Learning via Random Projection [58.4138636866903]
Heterogeneous Graph Neural Networks (HGNNs) are powerful tools for deep learning on heterogeneous graphs.
Recent pre-computation-based HGNNs use one-time message passing to transform a heterogeneous graph into regular-shaped tensors.
We propose a hybrid pre-computation-based HGNN, named Random Projection Heterogeneous Graph Neural Network (RpHGNN)
arXiv Detail & Related papers (2023-10-23T01:25:44Z)
- GRAPES: Learning to Sample Graphs for Scalable Graph Neural Networks [2.4175455407547015]
Graph neural networks learn to represent nodes by aggregating information from their neighbors.
Several existing methods address the resulting scalability problem by sampling a small subset of nodes, scaling GNNs to much larger graphs.
We introduce GRAPES, an adaptive sampling method that learns to identify the set of nodes crucial for training a GNN.
arXiv Detail & Related papers (2023-10-05T09:08:47Z)
- Sharpness-Aware Graph Collaborative Filtering [31.133543641102914]
Graph Neural Networks (GNNs) have achieved impressive performance in collaborative filtering.
GNNs tend to yield inferior performance when the distributions of training and test data are not aligned well.
We propose an effective training schema, called gSAM, under the principle that flatter minima have a better filtering ability than sharper ones (a generic SAM-style update is sketched after this list).
arXiv Detail & Related papers (2023-07-18T01:02:20Z)
- Graph Ladling: Shockingly Simple Parallel GNN Training without Intermediate Communication [100.51884192970499]
GNNs are a powerful family of neural networks for learning over graphs.
Scaling GNNs either by deepening or widening suffers from prevalent issues of unhealthy gradients, over-smoothing, and information squashing.
We propose not to deepen or widen current GNNs, but instead present a data-centric perspective of model soups tailored for GNNs.
arXiv Detail & Related papers (2023-06-18T03:33:46Z)
- Towards Sparsification of Graph Neural Networks [9.568566305616656]
We use two state-of-the-art model compression methods, train-and-prune and sparse training, for the sparsification of weight layers in GNNs.
We evaluate and compare the efficiency of both methods in terms of accuracy, training sparsity, and training FLOPs on real-world graphs.
arXiv Detail & Related papers (2022-09-11T01:39:29Z)
- Neural Graph Matching for Pre-training Graph Neural Networks [72.32801428070749]
Graph neural networks (GNNs) have shown powerful capacity at modeling structural data.
We present a novel Graph Matching based GNN Pre-Training framework, called GMPT.
The proposed method can be applied to fully self-supervised pre-training and coarse-grained supervised pre-training.
arXiv Detail & Related papers (2022-03-03T09:53:53Z)
- Scalable Consistency Training for Graph Neural Networks via Self-Ensemble Self-Distillation [13.815063206114713]
We introduce a novel consistency training method to improve the accuracy of graph neural networks (GNNs).
For a target node, we generate different neighborhood expansions and distill the knowledge of the averaged predictions into the GNN.
Our method approximates the expected prediction of the possible neighborhood samples and practically only requires a few samples.
arXiv Detail & Related papers (2021-10-12T19:24:42Z)
- Shift-Robust GNNs: Overcoming the Limitations of Localized Graph Training Data [52.771780951404565]
Shift-Robust GNN (SR-GNN) is designed to account for distributional differences between biased training data and the graph's true inference distribution.
We show that SR-GNN outperforms other GNN baselines in accuracy, eliminating at least 40% of the negative effects introduced by biased training data.
arXiv Detail & Related papers (2021-08-02T18:00:38Z)
- Combining Label Propagation and Simple Models Out-performs Graph Neural Networks [52.121819834353865]
We show that for many standard transductive node classification benchmarks, we can exceed or match the performance of state-of-the-art GNNs.
We call this overall procedure Correct and Smooth (C&S)
Our approach exceeds or nearly matches the performance of state-of-the-art GNNs on a wide variety of benchmarks.
arXiv Detail & Related papers (2020-10-27T02:10:52Z)
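For reference, the sharpness-aware entry above ("Sharpness-Aware Graph Collaborative Filtering") builds on SAM, one of the best-known flat minima methods. Below is a minimal sketch of the generic two-step SAM update in PyTorch, not the gSAM schema itself, which the excerpt does not describe; loss_fn is assumed to be a closure that recomputes the training loss for the current mini-batch, and rho is the neighborhood radius, both illustrative.

```python
# Generic SAM-style update: ascend to a worst-case point within a rho-ball
# around the current weights, then descend using the gradient taken there.
# All names and the rho value are illustrative assumptions.
import torch


@torch.no_grad()
def sam_step(model, loss_fn, base_optimizer, rho=0.05):
    # 1) Gradient of the loss at the current weights w.
    with torch.enable_grad():
        loss = loss_fn(model)
        loss.backward()

    grads = [p.grad for p in model.parameters() if p.grad is not None]
    grad_norm = torch.norm(torch.stack([g.norm(p=2) for g in grads]), p=2)
    scale = rho / (grad_norm + 1e-12)

    # 2) Perturb the weights towards the worst case inside the rho-ball: w + eps.
    perturbations = []
    for p in model.parameters():
        if p.grad is None:
            continue
        eps = p.grad * scale
        p.add_(eps)
        perturbations.append((p, eps))
    model.zero_grad()

    # 3) Gradient at the perturbed weights w + eps.
    with torch.enable_grad():
        loss_fn(model).backward()

    # 4) Undo the perturbation and update w with the sharpness-aware gradient.
    for p, eps in perturbations:
        p.sub_(eps)
    base_optimizer.step()
    base_optimizer.zero_grad()
    return float(loss)
```

The extra forward and backward pass at the perturbed weights is what distinguishes SAM-style methods from weight averaging, which adds no training cost.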