TUDataset: A collection of benchmark datasets for learning with graphs
- URL: http://arxiv.org/abs/2007.08663v1
- Date: Thu, 16 Jul 2020 21:46:33 GMT
- Title: TUDataset: A collection of benchmark datasets for learning with graphs
- Authors: Christopher Morris, Nils M. Kriege, Franka Bause, Kristian Kersting,
Petra Mutzel, Marion Neumann
- Abstract summary: We introduce the TUDataset for graph classification and regression.
The collection consists of over 120 datasets of varying sizes from a wide range of applications.
All datasets are available at www.graphlearning.io.
- Score: 21.16723995518478
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, there has been an increasing interest in (supervised) learning with
graph data, especially using graph neural networks. However, the development of
meaningful benchmark datasets and standardized evaluation procedures is
lagging, consequently hindering advancements in this area. To address this, we
introduce the TUDataset for graph classification and regression. The collection
consists of over 120 datasets of varying sizes from a wide range of
applications. We provide Python-based data loaders, kernel and graph neural
network baseline implementations, and evaluation tools. Here, we give an
overview of the datasets, standardized evaluation procedures, and provide
baseline experiments. All datasets are available at www.graphlearning.io. The
experiments are fully reproducible from the code available at
www.github.com/chrsmrrs/tudataset.
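The Python-based data loaders mentioned above read datasets stored in the collection's plain-text flat-file layout (an edge list `DS_A.txt`, a node-to-graph mapping `DS_graph_indicator.txt`, and per-graph labels `DS_graph_labels.txt`). As an illustration only, here is a minimal, hypothetical reader for that layout; the function name and the toy data are invented for this sketch, and the file layout is assumed from the collection's documented conventions rather than taken from this paper:

```python
from collections import defaultdict

def parse_tu_dataset(a_txt, indicator_txt, labels_txt):
    """Minimal sketch of a TUDataset-style flat-file reader.

    a_txt:         contents of DS_A.txt (one "row, col" edge per line, 1-indexed nodes)
    indicator_txt: contents of DS_graph_indicator.txt (graph id of node i on line i)
    labels_txt:    contents of DS_graph_labels.txt (class label of graph i on line i)
    Returns a list of (edge_list, label) pairs, one per graph.
    """
    # Map each node id to the graph it belongs to.
    node_to_graph = {}
    for i, line in enumerate(indicator_txt.strip().splitlines(), start=1):
        node_to_graph[i] = int(line)

    # Group edges by graph via the graph id of their first endpoint.
    edges = defaultdict(list)
    for line in a_txt.strip().splitlines():
        u, v = (int(x) for x in line.split(","))
        edges[node_to_graph[u]].append((u, v))

    labels = [int(label) for label in labels_txt.strip().splitlines()]
    return [(edges[g], labels[g - 1]) for g in sorted(set(node_to_graph.values()))]

# Toy example: a 2-node graph with label 1 and a 3-cycle with label -1.
a = "1, 2\n2, 1\n3, 4\n4, 5\n5, 3"
indicator = "1\n1\n2\n2\n2"
labels = "1\n-1"
graphs = parse_tu_dataset(a, indicator, labels)
```

In practice one would use the loaders shipped with the collection (or a library that wraps them) rather than hand-rolling a parser; the sketch only shows why a single global, 1-indexed node numbering plus a graph-indicator file suffices to reconstruct the individual graphs.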
Related papers
- Temporal Graph Benchmark for Machine Learning on Temporal Graphs [54.52243310226456]
Temporal Graph Benchmark (TGB) is a collection of challenging and diverse benchmark datasets.
We benchmark each dataset and find that the performance of common models can vary drastically across datasets.
TGB provides an automated machine learning pipeline for reproducible and accessible temporal graph research.
arXiv Detail & Related papers (2023-07-03T13:58:20Z)
- Exploring Data Redundancy in Real-world Image Classification through Data Selection [20.389636181891515]
Deep learning models often require large amounts of data for training, leading to increased costs.
We present two data valuation metrics based on Synaptic Intelligence and gradient norms, respectively, to study redundancy in real-world image data.
Online and offline data selection algorithms are then proposed via clustering and grouping based on the examined data values.
arXiv Detail & Related papers (2023-06-25T03:31:05Z)
- Diving into Unified Data-Model Sparsity for Class-Imbalanced Graph Representation Learning [30.23894624193583]
Training Graph Neural Networks (GNNs) on non-Euclidean graph data often incurs relatively high time costs.
We develop a unified data-model dynamic sparsity framework named Graph Decantation (GraphDec) to address the challenges of training on massive, class-imbalanced graph data.
arXiv Detail & Related papers (2022-10-01T01:47:00Z)
- A Robust Stacking Framework for Training Deep Graph Models with Multifaceted Node Features [61.92791503017341]
Graph Neural Networks (GNNs) with numerical node features and graph structure as inputs have demonstrated superior performance on various supervised learning tasks with graph data.
However, the best models for such feature types in standard supervised learning settings with IID (non-graph) data are not easily incorporated into a GNN.
Here we propose a robust stacking framework that fuses graph-aware propagation with arbitrary models intended for IID data.
arXiv Detail & Related papers (2022-06-16T22:46:33Z)
- Graph-level Neural Networks: Current Progress and Future Directions [61.08696673768116]
Graph-level Neural Networks (GLNNs, deep learning-based graph-level learning methods) have attracted attention for their strength in modeling high-dimensional data.
We propose a systematic taxonomy covering GLNNs upon deep neural networks, graph neural networks, and graph pooling.
arXiv Detail & Related papers (2022-05-31T06:16:55Z)
- Neural Graph Matching for Pre-training Graph Neural Networks [72.32801428070749]
Graph neural networks (GNNs) have shown powerful capacity for modeling structural data.
We present a novel Graph Matching based GNN Pre-Training framework, called GMPT.
The proposed method can be applied to fully self-supervised pre-training and coarse-grained supervised pre-training.
arXiv Detail & Related papers (2022-03-03T09:53:53Z)
- Graph Contrastive Learning with Augmentations [109.23158429991298]
We propose a graph contrastive learning (GraphCL) framework for learning unsupervised representations of graph data.
We show that our framework can produce graph representations of similar or better generalizability, transferability, and robustness compared to state-of-the-art methods.
arXiv Detail & Related papers (2020-10-22T20:13:43Z)
- Wiki-CS: A Wikipedia-Based Benchmark for Graph Neural Networks [0.0]
We present Wiki-CS, a novel dataset derived from Wikipedia for benchmarking Graph Neural Networks.
The dataset consists of nodes corresponding to Computer Science articles, with edges based on hyperlinks and 10 classes representing different branches of the field.
We use the dataset to evaluate semi-supervised node classification and single-relation link prediction models.
arXiv Detail & Related papers (2020-07-06T17:25:47Z)
- Open Graph Benchmark: Datasets for Machine Learning on Graphs [86.96887552203479]
We present the Open Graph Benchmark (OGB) to facilitate scalable, robust, and reproducible graph machine learning (ML) research.
OGB datasets are large-scale, encompass multiple important graph ML tasks, and cover a diverse range of domains.
For each dataset, we provide a unified evaluation protocol using meaningful application-specific data splits and evaluation metrics.
arXiv Detail & Related papers (2020-05-02T03:09:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.