TUDataset: A collection of benchmark datasets for learning with graphs
- URL: http://arxiv.org/abs/2007.08663v1
- Date: Thu, 16 Jul 2020 21:46:33 GMT
- Title: TUDataset: A collection of benchmark datasets for learning with graphs
- Authors: Christopher Morris, Nils M. Kriege, Franka Bause, Kristian Kersting,
Petra Mutzel, Marion Neumann
- Abstract summary: We introduce the TUDataset for graph classification and regression.
The collection consists of over 120 datasets of varying sizes from a wide range of applications.
All datasets are available at www.graphlearning.io.
- Score: 21.16723995518478
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, there has been an increasing interest in (supervised) learning with
graph data, especially using graph neural networks. However, the development of
meaningful benchmark datasets and standardized evaluation procedures is
lagging, consequently hindering advancements in this area. To address this, we
introduce the TUDataset for graph classification and regression. The collection
consists of over 120 datasets of varying sizes from a wide range of
applications. We provide Python-based data loaders, kernel and graph neural
network baseline implementations, and evaluation tools. Here, we give an
overview of the datasets, standardized evaluation procedures, and provide
baseline experiments. All datasets are available at www.graphlearning.io. The
experiments are fully reproducible from the code available at
www.github.com/chrsmrrs/tudataset.
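The Python-based data loaders mentioned above read datasets stored in the collection's plain-text flat-file layout (an edge list `DS_A.txt`, a node-to-graph mapping `DS_graph_indicator.txt`, and per-graph labels `DS_graph_labels.txt`). As an illustration only, here is a minimal, hypothetical reader for that layout; the function name and the toy data are invented for this sketch, and the file layout is assumed from the collection's documented conventions rather than taken from this paper:

```python
from collections import defaultdict

def parse_tu_dataset(a_txt, indicator_txt, labels_txt):
    """Minimal sketch of a TUDataset-style flat-file reader.

    a_txt:         contents of DS_A.txt (one "row, col" edge per line, 1-indexed nodes)
    indicator_txt: contents of DS_graph_indicator.txt (graph id of node i on line i)
    labels_txt:    contents of DS_graph_labels.txt (class label of graph i on line i)
    Returns a list of (edge_list, label) pairs, one per graph.
    """
    # Map each node id to the graph it belongs to.
    node_to_graph = {}
    for i, line in enumerate(indicator_txt.strip().splitlines(), start=1):
        node_to_graph[i] = int(line)

    # Group edges by graph via the graph id of their first endpoint.
    edges = defaultdict(list)
    for line in a_txt.strip().splitlines():
        u, v = (int(x) for x in line.split(","))
        edges[node_to_graph[u]].append((u, v))

    labels = [int(label) for label in labels_txt.strip().splitlines()]
    return [(edges[g], labels[g - 1]) for g in sorted(set(node_to_graph.values()))]

# Toy example: a 2-node graph with label 1 and a 3-cycle with label -1.
a = "1, 2\n2, 1\n3, 4\n4, 5\n5, 3"
indicator = "1\n1\n2\n2\n2"
labels = "1\n-1"
graphs = parse_tu_dataset(a, indicator, labels)
```

In practice one would use the loaders shipped with the collection (or a library that wraps them) rather than hand-rolling a parser; the sketch only shows why a single global, 1-indexed node numbering plus a graph-indicator file suffices to reconstruct the individual graphs.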
Related papers
- Temporal Graph Benchmark for Machine Learning on Temporal Graphs [54.52243310226456]
Temporal Graph Benchmark (TGB) is a collection of challenging and diverse benchmark datasets.
We benchmark each dataset and find that the performance of common models can vary drastically across datasets.
TGB provides an automated machine learning pipeline for reproducible and accessible temporal graph research.
arXiv Detail & Related papers (2023-07-03T13:58:20Z)
- Exploring Data Redundancy in Real-world Image Classification through Data Selection [20.389636181891515]
Deep learning models often require large amounts of data for training, leading to increased costs.
We present two data valuation metrics based on Synaptic Intelligence and gradient norms, respectively, to study redundancy in real-world image data.
Online and offline data selection algorithms are then proposed via clustering and grouping based on the examined data values.
arXiv Detail & Related papers (2023-06-25T03:31:05Z)
- Diving into Unified Data-Model Sparsity for Class-Imbalanced Graph Representation Learning [30.23894624193583]
Training Graph Neural Networks (GNNs) on non-Euclidean graph data often incurs relatively high time costs.
We develop a unified data-model dynamic sparsity framework named Graph Decantation (GraphDec) to address the challenges of training on massive, class-imbalanced graph data.
arXiv Detail & Related papers (2022-10-01T01:47:00Z)
- A Robust Stacking Framework for Training Deep Graph Models with Multifaceted Node Features [61.92791503017341]
Graph Neural Networks (GNNs) with numerical node features and graph structure as inputs have demonstrated superior performance on various supervised learning tasks with graph data.
However, the best models for such feature types in standard supervised learning settings with IID (non-graph) data are not easily incorporated into a GNN.
Here we propose a robust stacking framework that fuses graph-aware propagation with arbitrary models intended for IID data.
arXiv Detail & Related papers (2022-06-16T22:46:33Z)
- Graph-level Neural Networks: Current Progress and Future Directions [61.08696673768116]
Graph-level Neural Networks (GLNNs, deep learning-based graph-level learning methods) have attracted attention for their strength in modeling high-dimensional data.
We propose a systematic taxonomy covering GLNNs upon deep neural networks, graph neural networks, and graph pooling.
arXiv Detail & Related papers (2022-05-31T06:16:55Z)
- Neural Graph Matching for Pre-training Graph Neural Networks [72.32801428070749]
Graph neural networks (GNNs) have shown powerful capacity for modeling structural data.
We present a novel Graph Matching based GNN Pre-Training framework, called GMPT.
The proposed method can be applied to fully self-supervised pre-training and coarse-grained supervised pre-training.
arXiv Detail & Related papers (2022-03-03T09:53:53Z)
- Graph Contrastive Learning with Augmentations [109.23158429991298]
We propose a graph contrastive learning (GraphCL) framework for learning unsupervised representations of graph data.
We show that our framework can produce graph representations of similar or better generalizability, transferability, and robustness compared to state-of-the-art methods.
arXiv Detail & Related papers (2020-10-22T20:13:43Z)
- Wiki-CS: A Wikipedia-Based Benchmark for Graph Neural Networks [0.0]
We present Wiki-CS, a novel dataset derived from Wikipedia for benchmarking Graph Neural Networks.
The dataset consists of nodes corresponding to Computer Science articles, with edges based on hyperlinks and 10 classes representing different branches of the field.
We use the dataset to evaluate semi-supervised node classification and single-relation link prediction models.
arXiv Detail & Related papers (2020-07-06T17:25:47Z)
- Open Graph Benchmark: Datasets for Machine Learning on Graphs [86.96887552203479]
We present the Open Graph Benchmark (OGB) to facilitate scalable, robust, and reproducible graph machine learning (ML) research.
OGB datasets are large-scale, encompass multiple important graph ML tasks, and cover a diverse range of domains.
For each dataset, we provide a unified evaluation protocol using meaningful application-specific data splits and evaluation metrics.
arXiv Detail & Related papers (2020-05-02T03:09:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.