OGB-LSC: A Large-Scale Challenge for Machine Learning on Graphs
- URL: http://arxiv.org/abs/2103.09430v1
- Date: Wed, 17 Mar 2021 04:08:03 GMT
- Title: OGB-LSC: A Large-Scale Challenge for Machine Learning on Graphs
- Authors: Weihua Hu, Matthias Fey, Hongyu Ren, Maho Nakata, Yuxiao Dong, Jure
Leskovec
- Abstract summary: OGB Large-Scale Challenge (OGB-LSC) is a collection of three real-world datasets for advancing the state-of-the-art in large-scale graph ML.
OGB-LSC provides dedicated baseline experiments, scaling up expressive graph ML models to the massive datasets.
- Score: 69.23600404232883
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Enabling effective and efficient machine learning (ML) over large-scale graph
data (e.g., graphs with billions of edges) can have a huge impact on both
industrial and scientific applications. However, community efforts to advance
large-scale graph ML have been severely limited by the lack of a suitable
public benchmark. For KDD Cup 2021, we present OGB Large-Scale Challenge
(OGB-LSC), a collection of three real-world datasets for advancing the
state-of-the-art in large-scale graph ML. OGB-LSC provides graph datasets that
are orders of magnitude larger than existing ones and covers three core graph
learning tasks -- link prediction, graph regression, and node classification.
Furthermore, OGB-LSC provides dedicated baseline experiments, scaling up
expressive graph ML models to the massive datasets. We show that the expressive
models significantly outperform simple scalable baselines, indicating an
opportunity for dedicated efforts to further improve graph ML at scale. Our
datasets and baseline code are released and maintained as part of our OGB
initiative (Hu et al., 2020). We hope OGB-LSC at KDD Cup 2021 can empower the
community to discover innovative solutions for large-scale graph ML.
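The abstract names three core tasks: link prediction, graph regression, and node classification. As a minimal illustration of how such tasks are typically scored, the sketch below implements the standard metrics in plain Python. The mapping of metrics to the OGB-LSC datasets (MAE for graph regression on PCQM4M, accuracy for node classification on MAG240M, mean reciprocal rank for link prediction) reflects common practice and is an assumption here, not something stated in this abstract; the inputs are toy data.

```python
# Toy sketch of the evaluation metrics typically used for the three
# OGB-LSC task types. Metric-to-task mapping is assumed, not quoted
# from the abstract; all numbers below are illustrative.

def mae(y_true, y_pred):
    """Mean absolute error, the usual metric for graph-level regression."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def accuracy(y_true, y_pred):
    """Fraction of correct labels, the usual metric for node classification."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def mrr(ranks):
    """Mean reciprocal rank, a common metric for link prediction.
    `ranks` are the 1-based positions of the true entity in each
    candidate ranking (rank 1 = perfect prediction)."""
    return sum(1.0 / r for r in ranks) / len(ranks)

print(mae([1.0, 2.0], [1.5, 1.5]))           # 0.5
print(accuracy([0, 1, 1, 0], [0, 1, 0, 0]))  # 0.75
print(mrr([1, 2, 4]))                        # (1 + 0.5 + 0.25) / 3
```

In the real challenge these computations are handled by the official per-dataset evaluators, which also fix the data splits; the functions above only show the shape of the metrics.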
Related papers
- What Can We Learn from State Space Models for Machine Learning on Graphs? [11.38076877943004]
We propose Graph State Space Convolution (GSSC) as a principled extension of State Space Models (SSMs) to graph-structured data.
By leveraging global permutation-equivariant set aggregation and factorizable graph kernels, GSSC preserves all three advantages of SSMs.
Our findings highlight the potential of GSSC as a powerful and scalable model for graph machine learning.
arXiv Detail & Related papers (2024-06-09T15:03:36Z)
- LLaGA: Large Language and Graph Assistant [73.71990472543027]
Large Language and Graph Assistant (LLaGA) is an innovative model to handle the complexities of graph-structured data.
LLaGA excels in versatility, generalizability and interpretability, allowing it to perform consistently well across different datasets and tasks.
Our experiments show that LLaGA delivers outstanding performance across four datasets and three tasks using one single model.
arXiv Detail & Related papers (2024-02-13T02:03:26Z)
- Graph Transformers for Large Graphs [57.19338459218758]
This work advances representation learning on single large-scale graphs with a focus on identifying model characteristics and critical design constraints.
A key innovation of this work lies in the creation of a fast neighborhood sampling technique coupled with a local attention mechanism.
We report a 3x speedup and 16.8% performance gain on ogbn-products and snap-patents, while we also scale LargeGT on ogbn-100M with a 5.9% performance improvement.
arXiv Detail & Related papers (2023-12-18T11:19:23Z)
- IGB: Addressing The Gaps In Labeling, Features, Heterogeneity, and Size of Public Graph Datasets for Deep Learning Research [14.191338008898963]
Graph neural networks (GNNs) have shown high potential for a variety of real-world, challenging applications.
One of the major obstacles in GNN research is the lack of large-scale flexible datasets.
We introduce the Illinois Graph Benchmark (IGB), a research dataset tool that developers can use to train, scrutinize, and evaluate GNN models.
arXiv Detail & Related papers (2023-02-27T05:21:35Z)
- Graph Generative Model for Benchmarking Graph Neural Networks [73.11514658000547]
We introduce a novel graph generative model that learns and reproduces the distribution of real-world graphs in a privacy-controlled way.
Our model can successfully generate privacy-controlled, synthetic substitutes of large-scale real-world graphs that can be effectively used to benchmark GNN models.
arXiv Detail & Related papers (2022-07-10T06:42:02Z)
- Scaling R-GCN Training with Graph Summarization [71.06855946732296]
Training of Relation Graph Convolutional Networks (R-GCN) does not scale well with the size of the graph.
In this work, we experiment with the use of graph summarization techniques to compress the graph.
We obtain reasonable results on the AIFB, MUTAG and AM datasets.
arXiv Detail & Related papers (2022-03-05T00:28:43Z)
- Large-scale graph representation learning with very deep GNNs and self-supervision [17.887767916020774]
We show how to deploy graph neural networks (GNNs) at scale using the Open Graph Benchmark Large-Scale Challenge (OGB-LSC).
Our models achieved an award-level (top-3) performance on both the MAG240M and PCQM4M benchmarks.
arXiv Detail & Related papers (2021-07-20T11:35:25Z)
- Open Graph Benchmark: Datasets for Machine Learning on Graphs [86.96887552203479]
We present the Open Graph Benchmark (OGB) to facilitate scalable, robust, and reproducible graph machine learning (ML) research.
OGB datasets are large-scale, encompass multiple important graph ML tasks, and cover a diverse range of domains.
For each dataset, we provide a unified evaluation protocol using meaningful application-specific data splits and evaluation metrics.
arXiv Detail & Related papers (2020-05-02T03:09:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.