OGB-LSC: A Large-Scale Challenge for Machine Learning on Graphs
- URL: http://arxiv.org/abs/2103.09430v1
- Date: Wed, 17 Mar 2021 04:08:03 GMT
- Title: OGB-LSC: A Large-Scale Challenge for Machine Learning on Graphs
- Authors: Weihua Hu, Matthias Fey, Hongyu Ren, Maho Nakata, Yuxiao Dong, Jure
Leskovec
- Abstract summary: OGB Large-Scale Challenge (OGB-LSC) is a collection of three real-world datasets for advancing the state-of-the-art in large-scale graph ML.
OGB-LSC provides dedicated baseline experiments, scaling up expressive graph ML models to the massive datasets.
- Score: 69.23600404232883
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Enabling effective and efficient machine learning (ML) over large-scale graph
data (e.g., graphs with billions of edges) can have a huge impact on both
industrial and scientific applications. However, community efforts to advance
large-scale graph ML have been severely limited by the lack of a suitable
public benchmark. For KDD Cup 2021, we present OGB Large-Scale Challenge
(OGB-LSC), a collection of three real-world datasets for advancing the
state-of-the-art in large-scale graph ML. OGB-LSC provides graph datasets that
are orders of magnitude larger than existing ones and covers three core graph
learning tasks -- link prediction, graph regression, and node classification.
Furthermore, OGB-LSC provides dedicated baseline experiments, scaling up
expressive graph ML models to the massive datasets. We show that the expressive
models significantly outperform simple scalable baselines, indicating an
opportunity for dedicated efforts to further improve graph ML at scale. Our
datasets and baseline code are released and maintained as part of our OGB
initiative (Hu et al., 2020). We hope OGB-LSC at KDD Cup 2021 can empower the
community to discover innovative solutions for large-scale graph ML.
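The abstract names three core tasks: link prediction, graph regression, and node classification. As a minimal illustration of how such tasks are typically scored, the sketch below implements the standard metrics in plain Python. The mapping of metrics to the OGB-LSC datasets (MAE for graph regression on PCQM4M, accuracy for node classification on MAG240M, mean reciprocal rank for link prediction) reflects common practice and is an assumption here, not something stated in this abstract; the inputs are toy data.

```python
# Toy sketch of the evaluation metrics typically used for the three
# OGB-LSC task types. Metric-to-task mapping is assumed, not quoted
# from the abstract; all numbers below are illustrative.

def mae(y_true, y_pred):
    """Mean absolute error, the usual metric for graph-level regression."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def accuracy(y_true, y_pred):
    """Fraction of correct labels, the usual metric for node classification."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def mrr(ranks):
    """Mean reciprocal rank, a common metric for link prediction.
    `ranks` are the 1-based positions of the true entity in each
    candidate ranking (rank 1 = perfect prediction)."""
    return sum(1.0 / r for r in ranks) / len(ranks)

print(mae([1.0, 2.0], [1.5, 1.5]))           # 0.5
print(accuracy([0, 1, 1, 0], [0, 1, 0, 0]))  # 0.75
print(mrr([1, 2, 4]))                        # (1 + 0.5 + 0.25) / 3
```

In the real challenge these computations are handled by the official per-dataset evaluators, which also fix the data splits; the functions above only show the shape of the metrics.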
Related papers
- What Can We Learn from State Space Models for Machine Learning on Graphs? [11.38076877943004]
We propose Graph State Space Convolution (GSSC) as a principled extension of State Space Models (SSMs) to graph-structured data.
By leveraging global permutation-equivariant set aggregation and factorizable graph kernels, GSSC preserves all three advantages of SSMs.
Our findings highlight the potential of GSSC as a powerful and scalable model for graph machine learning.
arXiv Detail & Related papers (2024-06-09T15:03:36Z)
- LLaGA: Large Language and Graph Assistant [73.71990472543027]
Large Language and Graph Assistant (LLaGA) is an innovative model to handle the complexities of graph-structured data.
LLaGA excels in versatility, generalizability and interpretability, allowing it to perform consistently well across different datasets and tasks.
Our experiments show that LLaGA delivers outstanding performance across four datasets and three tasks using one single model.
arXiv Detail & Related papers (2024-02-13T02:03:26Z)
- Graph Transformers for Large Graphs [57.19338459218758]
This work advances representation learning on single large-scale graphs with a focus on identifying model characteristics and critical design constraints.
A key innovation of this work lies in the creation of a fast neighborhood sampling technique coupled with a local attention mechanism.
We report a 3x speedup and 16.8% performance gain on ogbn-products and snap-patents, while we also scale LargeGT on ogbn-100M with a 5.9% performance improvement.
arXiv Detail & Related papers (2023-12-18T11:19:23Z)
- IGB: Addressing The Gaps In Labeling, Features, Heterogeneity, and Size of Public Graph Datasets for Deep Learning Research [14.191338008898963]
Graph neural networks (GNNs) have shown high potential for a variety of real-world, challenging applications.
One of the major obstacles in GNN research is the lack of large-scale flexible datasets.
We introduce the Illinois Graph Benchmark (IGB), a research dataset tool that developers can use to train, scrutinize, and evaluate GNN models.
arXiv Detail & Related papers (2023-02-27T05:21:35Z)
- Graph Generative Model for Benchmarking Graph Neural Networks [73.11514658000547]
We introduce a novel graph generative model that learns and reproduces the distribution of real-world graphs in a privacy-controlled way.
Our model can successfully generate privacy-controlled, synthetic substitutes of large-scale real-world graphs that can be effectively used to benchmark GNN models.
arXiv Detail & Related papers (2022-07-10T06:42:02Z)
- Scaling R-GCN Training with Graph Summarization [71.06855946732296]
Training of Relation Graph Convolutional Networks (R-GCN) does not scale well with the size of the graph.
In this work, we experiment with the use of graph summarization techniques to compress the graph.
We obtain reasonable results on the AIFB, MUTAG and AM datasets.
arXiv Detail & Related papers (2022-03-05T00:28:43Z)
- Large-scale graph representation learning with very deep GNNs and self-supervision [17.887767916020774]
We show how to deploy graph neural networks (GNNs) at scale using the Open Graph Benchmark Large-Scale Challenge (OGB-LSC).
Our models achieved an award-level (top-3) performance on both the MAG240M and PCQM4M benchmarks.
arXiv Detail & Related papers (2021-07-20T11:35:25Z)
- Open Graph Benchmark: Datasets for Machine Learning on Graphs [86.96887552203479]
We present the Open Graph Benchmark (OGB) to facilitate scalable, robust, and reproducible graph machine learning (ML) research.
OGB datasets are large-scale, encompass multiple important graph ML tasks, and cover a diverse range of domains.
For each dataset, we provide a unified evaluation protocol using meaningful application-specific data splits and evaluation metrics.
arXiv Detail & Related papers (2020-05-02T03:09:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.