GraphNet: A Large-Scale Computational Graph Dataset for Tensor Compiler Research
- URL: http://arxiv.org/abs/2510.24035v1
- Date: Tue, 28 Oct 2025 03:36:05 GMT
- Title: GraphNet: A Large-Scale Computational Graph Dataset for Tensor Compiler Research
- Authors: Xinqi Li, Yiqun Liu, Shan Jiang, Enrong Zheng, Huaijin Zheng, Wenhao Dai, Haodong Deng, Dianhai Yu, Yanjun Ma,
- Abstract summary: GraphNet is a dataset of 2.7K real-world deep learning computational graphs with rich metadata. We benchmark the default tensor compilers, CINN for PaddlePaddle and TorchInductor for PyTorch, on computer vision (CV) and natural language processing (NLP) samples to demonstrate the practicality of GraphNet.
- Score: 8.344163643041753
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce GraphNet, a dataset of 2.7K real-world deep learning computational graphs with rich metadata, spanning six major task categories across multiple deep learning frameworks. To evaluate tensor compiler performance on these samples, we propose the benchmark metric Speedup Score S(t), which jointly considers runtime speedup and execution correctness under tunable tolerance levels, offering a reliable measure of general optimization capability. Furthermore, we extend S(t) to the Error-aware Speedup Score ES(t), which incorporates error information and helps compiler developers identify key performance bottlenecks. In this report, we benchmark the default tensor compilers, CINN for PaddlePaddle and TorchInductor for PyTorch, on computer vision (CV) and natural language processing (NLP) samples to demonstrate the practicality of GraphNet. The full construction pipeline with graph extraction and compiler evaluation tools is available at https://github.com/PaddlePaddle/GraphNet .
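As a rough illustration of a tolerance-aware speedup metric like S(t), here is a toy sketch. The abstract does not give the formula, so the geometric-mean aggregation, the fixed penalty for samples that exceed the tolerance, and every name used (`SampleResult`, `toy_speedup_score`, `penalty`) are assumptions made for illustration, not the paper's definition.

```python
import math
from dataclasses import dataclass


@dataclass
class SampleResult:
    """Hypothetical per-graph measurement; field names are illustrative."""
    eager_ms: float       # baseline runtime without the compiler
    compiled_ms: float    # runtime with the compiler enabled
    max_abs_error: float  # largest output deviation from the baseline


def toy_speedup_score(results, tolerance, penalty=0.1):
    """Geometric mean of per-sample speedups.

    Assumption: a sample whose error exceeds `tolerance` contributes a
    fixed `penalty` factor instead of its measured speedup.
    """
    if not results:
        raise ValueError("results must not be empty")
    log_sum = 0.0
    for r in results:
        if r.max_abs_error <= tolerance:
            factor = r.eager_ms / r.compiled_ms
        else:
            factor = penalty
        log_sum += math.log(factor)
    return math.exp(log_sum / len(results))


results = [
    SampleResult(eager_ms=10.0, compiled_ms=5.0, max_abs_error=1e-6),
    SampleResult(eager_ms=8.0, compiled_ms=8.0, max_abs_error=1e-3),
]
loose = toy_speedup_score(results, tolerance=1e-2)   # both samples count as correct
strict = toy_speedup_score(results, tolerance=1e-4)  # second sample is penalized
```

With this toy definition, tightening the tolerance can only lower or preserve the score as long as the penalty is below every measured speedup, which is one way a single number can reflect both runtime gain and correctness.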
Related papers
- Semi-supervised Instruction Tuning for Large Language Models on Text-Attributed Graphs [62.544129365882014]
We propose a novel Semi-supervised Instruction Tuning pipeline for Graph Learning, named SIT-Graph. SIT-Graph is model-agnostic and can be seamlessly integrated into any graph instruction tuning method that utilizes LLMs as the predictor. Extensive experiments demonstrate that when incorporated into state-of-the-art graph instruction tuning methods, SIT-Graph significantly enhances their performance on text-attributed graph benchmarks.
arXiv Detail & Related papers (2026-01-19T08:10:53Z)
- Slicing Input Features to Accelerate Deep Learning: A Case Study with Graph Neural Networks [0.24578723416255746]
This paper introduces SliceGCN, a feature-sliced distributed large-scale graph learning method.
It aims to avoid the accuracy loss typically associated with mini-batch training and to reduce inter-GPU communication.
Experiments were conducted on six node classification datasets, yielding some interesting analytical results.
arXiv Detail & Related papers (2024-08-21T10:18:41Z)
- TpuGraphs: A Performance Prediction Dataset on Large Tensor Computational Graphs [24.790481918123103]
This paper introduces TpuGraphs, a performance prediction dataset on full tensor programs.
Each graph in the dataset represents the main computation of a machine learning workload.
TpuGraphs provides 25x more graphs than the largest graph property prediction dataset.
arXiv Detail & Related papers (2023-08-25T17:04:35Z)
- INR-Arch: A Dataflow Architecture and Compiler for Arbitrary-Order Gradient Computations in Implicit Neural Representation Processing [66.00729477511219]
Given a function represented as a computation graph, traditional architectures face challenges in efficiently computing its nth-order gradient.
We introduce INR-Arch, a framework that transforms the computation graph of an nth-order gradient into a hardware-optimized dataflow architecture.
We present results that demonstrate 1.8-4.8x and 1.5-3.6x speedup compared to CPU and GPU baselines respectively.
arXiv Detail & Related papers (2023-08-11T04:24:39Z)
- Sparsity exploitation via discovering graphical models in multi-variate time-series forecasting [1.2762298148425795]
We propose a decoupled training method, which includes a graph-generating module and a GNN forecasting module.
First, we use Graphical Lasso (or GraphLASSO) to directly exploit the sparsity pattern from data to build graph structures.
Second, we fit these graph structures and the input data into a Graph Convolutional Recurrent Network (GCRN) to train a forecasting model.
arXiv Detail & Related papers (2023-06-29T16:48:00Z)
- NAS-Bench-Graph: Benchmarking Graph Neural Architecture Search [55.75621026447599]
We propose NAS-Bench-Graph, a tailored benchmark that supports unified, reproducible, and efficient evaluations for GraphNAS.
Specifically, we construct a unified, expressive yet compact search space, covering 26,206 unique graph neural network (GNN) architectures.
Based on our proposed benchmark, the performance of GNN architectures can be directly obtained by a look-up table without any further computation.
arXiv Detail & Related papers (2022-06-18T10:17:15Z)
- Node Feature Extraction by Self-Supervised Multi-scale Neighborhood Prediction [123.20238648121445]
We propose a new self-supervised learning framework, Graph Information Aided Node feature exTraction (GIANT).
GIANT makes use of the eXtreme Multi-label Classification (XMC) formalism, which is crucial for fine-tuning the language model based on graph information.
We demonstrate the superior performance of GIANT over the standard GNN pipeline on Open Graph Benchmark datasets.
arXiv Detail & Related papers (2021-10-29T19:55:12Z)
- GNNAutoScale: Scalable and Expressive Graph Neural Networks via Historical Embeddings [51.82434518719011]
GNNAutoScale (GAS) is a framework for scaling arbitrary message-passing GNNs to large graphs.
GAS prunes entire sub-trees of the computation graph by utilizing historical embeddings from prior training iterations.
GAS reaches state-of-the-art performance on large-scale graphs.
arXiv Detail & Related papers (2021-06-10T09:26:56Z)
- Efficient Graph Deep Learning in TensorFlow with tf_geometric [53.237754811019464]
We introduce tf_geometric, an efficient and friendly library for graph deep learning.
tf_geometric provides kernel libraries for building Graph Neural Networks (GNNs) as well as implementations of popular GNNs.
The kernel libraries provide the infrastructure for building efficient GNNs, including graph data structures, a graph map-reduce framework, and a graph mini-batch strategy.
arXiv Detail & Related papers (2021-01-27T17:16:36Z)
- Scalable Graph Neural Networks for Heterogeneous Graphs [12.44278942365518]
Graph neural networks (GNNs) are a popular class of parametric model for learning over graph-structured data.
Recent work has argued that GNNs primarily use the graph for feature smoothing, and have shown competitive results on benchmark tasks.
In this work, we ask whether these results can be extended to heterogeneous graphs, which encode multiple types of relationship between different entities.
arXiv Detail & Related papers (2020-11-19T06:03:35Z)
- Scaling Graph Neural Networks with Approximate PageRank [64.92311737049054]
We present the PPRGo model, which utilizes an efficient approximation of information diffusion in GNNs.
In addition to being faster, PPRGo is inherently scalable, and can be trivially parallelized for large datasets like those found in industry settings.
We show that training PPRGo and predicting labels for all nodes in this graph takes under 2 minutes on a single machine, far outpacing other baselines on the same graph.
arXiv Detail & Related papers (2020-07-03T09:30:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.