Evaluating Graph Neural Networks for Link Prediction: Current Pitfalls
and New Benchmarking
- URL: http://arxiv.org/abs/2306.10453v3
- Date: Sat, 18 Nov 2023 19:03:50 GMT
- Title: Evaluating Graph Neural Networks for Link Prediction: Current Pitfalls
and New Benchmarking
- Authors: Juanhui Li, Harry Shomer, Haitao Mao, Shenglai Zeng, Yao Ma, Neil
Shah, Jiliang Tang, Dawei Yin
- Abstract summary: Link prediction attempts to predict whether an unseen edge exists based on only a portion of edges of a graph.
A flurry of methods have been introduced in recent years that attempt to make use of graph neural networks (GNNs) for this task.
New and diverse datasets have also been created to better evaluate the effectiveness of these new models.
- Score: 66.83273589348758
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Link prediction attempts to predict whether an unseen edge exists based on
only a portion of edges of a graph. A flurry of methods have been introduced in
recent years that attempt to make use of graph neural networks (GNNs) for this
task. Furthermore, new and diverse datasets have also been created to better
evaluate the effectiveness of these new models. However, multiple pitfalls
currently exist that hinder our ability to properly evaluate these new methods.
These pitfalls mainly include: (1) lower-than-actual performance reported for multiple
baselines, (2) the lack of a unified data split and evaluation metric on some
datasets, and (3) an unrealistic evaluation setting that uses easy negative
samples. To overcome these challenges, we first conduct a fair comparison
across prominent methods and datasets, utilizing the same dataset and
hyperparameter search settings. We then create a more practical evaluation
setting based on a Heuristic Related Sampling Technique (HeaRT), which samples
hard negative samples via multiple heuristics. The new evaluation setting helps
promote new challenges and opportunities in link prediction by aligning the
evaluation with real-world situations. Our implementation and data are
available at https://github.com/Juanhui28/HeaRT
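Pitfall (3) is the one HeaRT targets most directly, and the gap between the two settings is easy to picture in code. The following is a minimal sketch of the general idea, not the authors' implementation (see the repository above for that): random negatives are drawn as arbitrary node pairs, while HeaRT-style hard negatives corrupt one endpoint of each positive edge and keep the candidates that a heuristic such as common-neighbor count ranks highest. The paper combines multiple heuristics; this sketch uses a single one for brevity.

```python
import random
import networkx as nx

def easy_negatives(G: nx.Graph, k: int):
    """Standard setting: k uniformly random non-edges (mostly trivial to reject)."""
    nodes = list(G.nodes)
    negatives = set()
    while len(negatives) < k:
        u, v = random.sample(nodes, 2)
        if not G.has_edge(u, v):
            negatives.add((u, v))
    return list(negatives)

def hard_negatives(G: nx.Graph, u, v, k: int):
    """HeaRT-style idea (our sketch): corrupt one endpoint of the positive
    edge (u, v) and keep the k candidates a heuristic scores highest."""
    def common_neighbors(a, b):
        return len(set(G[a]) & set(G[b]))
    candidates = [(u, w) for w in G.nodes
                  if w not in (u, v) and not G.has_edge(u, w)]
    candidates.sort(key=lambda pair: common_neighbors(*pair), reverse=True)
    return candidates[:k]
```

Because such negatives share an endpoint, and often much of their neighborhood, with the true edge, a model can no longer separate them by trivial cues such as node degree alone, which is what brings the evaluation closer to real-world ranking.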
Related papers
- New Perspectives on the Evaluation of Link Prediction Algorithms for
Dynamic Graphs [12.987894327817159]
We introduce novel visualization methods that can yield insight into prediction performance and the dynamics of temporal networks.
We validate empirically, on datasets extracted from recent benchmarks, that the error is typically not evenly distributed across different data segments.
arXiv Detail & Related papers (2023-11-30T11:57:07Z)
- Towards Mitigating more Challenging Spurious Correlations: A Benchmark & New Datasets [43.64631697043496]
Deep neural networks often exploit non-predictive features that are spuriously correlated with class labels.
Despite the growing body of recent works on remedying spurious correlations, the lack of a standardized benchmark hinders reproducible evaluation.
We present SpuCo, a Python package with modular implementations of state-of-the-art solutions, enabling easy and reproducible evaluation.
arXiv Detail & Related papers (2023-06-21T00:59:06Z)
- From Spectral Graph Convolutions to Large Scale Graph Convolutional Networks [0.0]
Graph Convolutional Networks (GCNs) have proven to be a powerful concept, successfully applied to a wide variety of tasks.
We study the theory that paved the way to the definition of GCNs, including the relevant parts of classical graph theory.
arXiv Detail & Related papers (2022-07-12T16:57:08Z)
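The layer the entry above builds on is the widely used propagation rule of Kipf & Welling, H' = sigma(D^-1/2 (A + I) D^-1/2 H W). A self-contained NumPy sketch of one such layer:

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN layer: ReLU(D^-1/2 (A + I) D^-1/2 H W).

    A: (n, n) adjacency matrix, H: (n, d_in) node features,
    W: (d_in, d_out) weight matrix.
    """
    A_hat = A + np.eye(A.shape[0])                 # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))  # degrees include the self-loop
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]  # symmetric normalization
    return np.maximum(A_norm @ H @ W, 0.0)         # ReLU activation
```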
- A Closer Look at Debiased Temporal Sentence Grounding in Videos: Dataset, Metric, and Approach [53.727460222955266]
Temporal Sentence Grounding in Videos (TSGV) aims to ground a natural language sentence in an untrimmed video.
Recent studies have found that current benchmark datasets may have obvious moment annotation biases.
We introduce a new evaluation metric, "dR@n,IoU@m", that discounts the basic recall scores to alleviate the inflated evaluation results caused by biased datasets.
arXiv Detail & Related papers (2022-03-10T08:58:18Z)
- SCE: Scalable Network Embedding from Sparsest Cut [20.08464038805681]
Large-scale network embedding learns a latent representation for each node in an unsupervised manner.
A key to the success of such contrastive learning methods is how positive and negative samples are drawn.
In this paper, we propose SCE for unsupervised network embedding, using only negative samples for training.
arXiv Detail & Related papers (2020-06-30T03:18:15Z)
- Evaluating Models' Local Decision Boundaries via Contrast Sets [119.38387782979474]
We propose a new annotation paradigm for NLP that helps to close systematic gaps in the test data.
We demonstrate the efficacy of contrast sets by creating them for 10 diverse NLP datasets.
Although our contrast sets are not explicitly adversarial, model performance is significantly lower on them than on the original test sets.
arXiv Detail & Related papers (2020-04-06T14:47:18Z)
- Frustratingly Simple Few-Shot Object Detection [98.42824677627581]
We find that fine-tuning only the last layer of existing detectors on rare classes is crucial to the few-shot object detection task.
Such a simple approach outperforms the meta-learning methods by roughly 2-20 points on current benchmarks.
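Concretely, the recipe freezes the pretrained detector and optimizes only its final prediction heads on the novel classes. A hypothetical PyTorch sketch of that freezing step (the module names `box_classifier` and `box_regressor` are illustrative, not taken from the paper's code):

```python
import torch

def freeze_all_but_heads(detector: torch.nn.Module,
                         head_keys=("box_classifier", "box_regressor")):
    """Freeze every parameter except the final box classification/regression heads."""
    for name, param in detector.named_parameters():
        param.requires_grad = any(key in name for key in head_keys)
    trainable = [p for p in detector.parameters() if p.requires_grad]
    return torch.optim.SGD(trainable, lr=1e-3, momentum=0.9)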
arXiv Detail & Related papers (2020-03-16T00:29:14Z)
- PushNet: Efficient and Adaptive Neural Message Passing [1.9121961872220468]
Message passing neural networks have recently evolved into a state-of-the-art approach to representation learning on graphs.
Existing methods perform synchronous message passing along all edges in multiple subsequent rounds.
We consider a novel asynchronous message passing approach where information is pushed only along the most relevant edges until convergence.
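This push-until-convergence scheme is reminiscent of the local push updates used for approximate personalized PageRank; the sketch below illustrates that generic pattern (our interpretation, not PushNet's actual update rule): mass is pushed from a node to its neighbors only while the remaining residual is large enough to matter.

```python
from collections import deque

def push_propagate(adj, source, alpha=0.15, eps=1e-4):
    """Asynchronous push-style propagation from `source`.

    adj: dict mapping each node to a list of neighbors.
    Only nodes holding enough residual mass are (re)processed,
    instead of synchronously updating every edge each round.
    """
    estimate, residual = {}, {source: 1.0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        r = residual.pop(u, 0.0)
        estimate[u] = estimate.get(u, 0.0) + alpha * r
        if not adj[u]:
            continue
        share = (1.0 - alpha) * r / len(adj[u])
        for v in adj[u]:
            residual[v] = residual.get(v, 0.0) + share
            # push further only along edges where significant mass remains
            if residual[v] >= eps and v not in queue:
                queue.append(v)
    return estimate
```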
arXiv Detail & Related papers (2020-03-04T18:15:30Z)
- Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches is to update the prototype of each class with the mean of the most confident query examples.
We propose to meta-learn the confidence of each query sample in order to assign optimal weights to unlabeled queries.
We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
arXiv Detail & Related papers (2020-02-27T10:22:17Z)
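Stripped of the meta-learning component, the transductive update in the entry above can be sketched as: score each unlabeled query against the current prototypes, convert distances into soft confidences, and recompute prototypes as confidence-weighted means. A NumPy illustration of that generic scheme (the temperature here is fixed, whereas the paper meta-learns the confidence):

```python
import numpy as np

def refine_prototypes(support_means, queries, temperature=1.0, steps=3):
    """support_means: (C, d) per-class means of labeled support examples.
    queries: (Q, d) embeddings of unlabeled query examples."""
    prototypes = support_means.copy()
    for _ in range(steps):
        # squared distance from every query to every prototype: (Q, C)
        d2 = ((queries[:, None, :] - prototypes[None, :, :]) ** 2).sum(-1)
        conf = np.exp(-d2 / temperature)
        conf /= conf.sum(axis=1, keepdims=True)   # soft class assignments
        # blend support means (weight 1) with confidence-weighted queries
        prototypes = (support_means + conf.T @ queries) / (1.0 + conf.sum(axis=0))[:, None]
    return prototypes
```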
- Benchmarking Network Embedding Models for Link Prediction: Are We Making Progress? [84.43405961569256]
We shed light on the state-of-the-art of network embedding methods for link prediction.
We show, using a consistent evaluation pipeline, that only thin progress has been made in recent years.
We argue that standardized evaluation tools can repair this situation and boost future progress in this field.
arXiv Detail & Related papers (2020-02-25T16:59:09Z)