Selective Parallel Loading of Large-Scale Compressed Graphs with ParaGrapher
- URL: http://arxiv.org/abs/2404.19735v2
- Date: Mon, 17 Jun 2024 17:33:18 GMT
- Title: Selective Parallel Loading of Large-Scale Compressed Graphs with ParaGrapher
- Authors: Mohsen Koohi Esfahani, Marco D'Antonio, Syed Ibtisam Tauhidi, Thai Son Mai, Hans Vandierendonck
- Abstract summary: ParaGrapher is a high-performance API and library for loading large-scale and compressed graphs.
We present the design of ParaGrapher and a performance model of graph decompression.
Our evaluation shows that ParaGrapher delivers up to 3.2 times speedup in loading and up to 5.2 times speedup in end-to-end execution.
- Score: 3.298283787389057
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Comprehensive evaluation is one of the bases of experimental science. In High-Performance Graph Processing, a thorough evaluation of contributions becomes more achievable by supporting common input formats over different frameworks. However, each framework creates its specific format, which may not support reading large-scale real-world graph datasets. This shows a demand for high-performance libraries capable of loading graphs to (i) accelerate designing new graph algorithms, (ii) evaluate the contributions on a wide range of graph algorithms, and (iii) facilitate easy and fast comparison over different graph frameworks. To that end, we present ParaGrapher, a high-performance API and library for loading large-scale and compressed graphs. ParaGrapher supports different types of requests for accessing graphs in shared- and distributed-memory and out-of-core graph processing. We explain the design of ParaGrapher and present a performance model of graph decompression, which is used for evaluation of ParaGrapher over three storage types. Our evaluation shows that by decompressing compressed graphs in WebGraph format, ParaGrapher delivers up to 3.2 times speedup in loading and up to 5.2 times speedup in end-to-end execution in comparison to the binary and textual formats. ParaGrapher is available online at https://blogs.qub.ac.uk/DIPSA/ParaGrapher/.
Related papers
- GraphGen+: Advancing Distributed Subgraph Generation and Graph Learning On Industrial Graphs [9.024357901512928]
Graph-based computations are crucial in a wide range of applications, where graphs can scale to trillions of edges.
Existing solutions face significant trade-offs: online subgraph generation is limited to a single machine, resulting in severe performance bottlenecks.
We propose GraphGen+, an integrated framework that synchronizes distributed subgraph generation with in-memory graph learning.
arXiv Detail & Related papers (2025-03-08T13:29:42Z) - InstructG2I: Synthesizing Images from Multimodal Attributed Graphs [50.852150521561676]
We propose a graph context-conditioned diffusion model called InstructG2I.
InstructG2I first exploits the graph structure and multimodal information to conduct informative neighbor sampling.
A Graph-QFormer encoder adaptively encodes the graph nodes into an auxiliary set of graph prompts to guide the denoising process.
arXiv Detail & Related papers (2024-10-09T17:56:15Z) - GraphSnapShot: Graph Machine Learning Acceleration with Fast Storage and Retrieval [19.225957670728622]
GraphSnapShot is a framework for fast cache, storage, retrieval and computation for graph learning.
In experiments, GraphSnapShot achieves up to 30% training acceleration and 73% memory reduction.
arXiv Detail & Related papers (2024-06-25T20:00:32Z) - G-Retriever: Retrieval-Augmented Generation for Textual Graph Understanding and Question Answering [61.93058781222079]
We develop a flexible question-answering framework targeting real-world textual graphs.
We introduce the first retrieval-augmented generation (RAG) approach for general textual graphs.
G-Retriever performs RAG over a graph by formulating this task as a Prize-Collecting Steiner Tree optimization problem.
arXiv Detail & Related papers (2024-02-12T13:13:04Z) - Faster Optimization in S-Graphs Exploiting Hierarchy [8.17925295907622]
We present an improved version of S-Graphs exploiting the hierarchy to reduce the graph size by marginalizing redundant robot poses.
We show accuracy similar to the baseline with a 39.81% reduction in computation time.
arXiv Detail & Related papers (2023-08-22T07:35:15Z) - Hybrid Graph: A Unified Graph Representation with Datasets and Benchmarks for Complex Graphs [27.24150788635981]
We introduce the concept of hybrid graphs and present the Hybrid Graph Benchmark (HGB).
HGB contains 23 real-world hybrid graph datasets across various domains such as biology, social media, and e-commerce.
We provide an evaluation framework and a supporting framework to facilitate the training and evaluation of Graph Neural Networks (GNNs) on HGB.
arXiv Detail & Related papers (2023-06-08T11:15:34Z) - Distributed Graph Embedding with Information-Oriented Random Walks [16.290803469068145]
Graph embedding maps graph nodes to low-dimensional vectors, and is widely adopted in machine learning tasks.
We present a general-purpose, distributed, information-centric random walk-based graph embedding framework, DistGER, which can scale to embed billion-edge graphs.
DistGER exhibits 2.33x-129x acceleration, 45% reduction in cross-machine communication, and > 10% effectiveness improvement in downstream tasks.
arXiv Detail & Related papers (2023-03-28T03:11:21Z) - CGMN: A Contrastive Graph Matching Network for Self-Supervised Graph Similarity Learning [65.1042892570989]
We propose a contrastive graph matching network (CGMN) for self-supervised graph similarity learning.
We employ two strategies, namely cross-view interaction and cross-graph interaction, for effective node representation learning.
We transform node representations into graph-level representations via pooling operations for graph similarity computation.
arXiv Detail & Related papers (2022-05-30T13:20:26Z) - Scaling R-GCN Training with Graph Summarization [71.06855946732296]
Training of Relation Graph Convolutional Networks (R-GCN) does not scale well with the size of the graph.
In this work, we experiment with the use of graph summarization techniques to compress the graph.
We obtain reasonable results on the AIFB, MUTAG and AM datasets.
arXiv Detail & Related papers (2022-03-05T00:28:43Z) - Edge but not Least: Cross-View Graph Pooling [76.71497833616024]
This paper presents a cross-view graph pooling (Co-Pooling) method to better exploit crucial graph structure information.
Through cross-view interaction, edge-view pooling and node-view pooling seamlessly reinforce each other to learn more informative graph-level representations.
arXiv Detail & Related papers (2021-09-24T08:01:23Z) - High-Order Relation Construction and Mining for Graph Matching [36.880853889521845]
Iterated line graphs are introduced for the first time to describe high-order information.
We present a new graph matching method, called High-order Graph Matching Network (HGMN).
By imposing practical constraints, HGMN is made scalable to large-scale graphs.
arXiv Detail & Related papers (2020-10-09T03:30:02Z) - Multilevel Graph Matching Networks for Deep Graph Similarity Learning [79.3213351477689]
We propose a multi-level graph matching network (MGMN) framework for computing the graph similarity between any pair of graph-structured objects.
To compensate for the lack of standard benchmark datasets, we have created and collected a set of datasets for both the graph-graph classification and graph-graph regression tasks.
Comprehensive experiments demonstrate that MGMN consistently outperforms state-of-the-art baseline models on both the graph-graph classification and graph-graph regression tasks.
arXiv Detail & Related papers (2020-07-08T19:48:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.