GRAPE for Fast and Scalable Graph Processing and random walk-based
Embedding
- URL: http://arxiv.org/abs/2110.06196v3
- Date: Sun, 7 May 2023 17:43:19 GMT
- Title: GRAPE for Fast and Scalable Graph Processing and random walk-based
Embedding
- Authors: Luca Cappelletti, Tommaso Fontana, Elena Casiraghi, Vida Ravanmehr,
Tiffany J. Callahan, Carlos Cano, Marcin P. Joachimiak, Christopher J.
Mungall, Peter N. Robinson, Justin Reese and Giorgio Valentini
- Abstract summary: We present GRAPE, a software resource for graph processing and embedding.
It can scale with big graphs by using specialized and smart data structures, algorithms, and a fast parallel implementation of random walk-based methods.
- Score: 0.5035217505850539
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Graph Representation Learning (GRL) methods opened new avenues for addressing
complex, real-world problems represented by graphs. However, many graphs used
in these applications comprise millions of nodes and billions of edges and are
beyond the capabilities of current methods and software implementations. We
present GRAPE, a software resource for graph processing and embedding that can
scale with big graphs by using specialized and smart data structures,
algorithms, and a fast parallel implementation of random walk-based methods.
Compared with state-of-the-art software resources, GRAPE shows an improvement
of orders of magnitude in empirical space and time complexity, as well as
competitive edge- and node-label prediction performance. GRAPE comprises about
1.7 million well-documented lines of Python and Rust code and provides 69 node
embedding methods, 25 inference models, a collection of efficient graph
processing utilities and over 80,000 graphs from the literature and other
sources. Standardized interfaces allow seamless integration of third-party
libraries, while ready-to-use and modular pipelines permit an easy-to-use
evaluation of GRL methods, therefore also positioning GRAPE as a software
resource to perform a fair comparison between methods and libraries for graph
processing and embedding.
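For readers who want to see what a "random walk-based method" does concretely, below is a minimal, library-agnostic sketch of the idea: sample uniform random walks over an adjacency list, count windowed node co-occurrences, and factorize the resulting positive PMI matrix into node vectors (a DeepWalk-style pipeline). This is only an illustration of the technique, not GRAPE's implementation; the toy graph, walk parameters, and SVD-based factorization are assumptions chosen to keep the sketch self-contained in NumPy.

```python
# Library-agnostic sketch of random-walk-based node embedding:
# walks -> windowed co-occurrence counts -> positive PMI -> truncated SVD.
# Illustrative only; not GRAPE's implementation.
import numpy as np

def random_walks(adj, walk_length=10, walks_per_node=20, seed=0):
    """Sample uniform random walks over an adjacency-list graph."""
    rng = np.random.default_rng(seed)
    walks = []
    for start in adj:
        for _ in range(walks_per_node):
            walk = [start]
            for _ in range(walk_length - 1):
                neighbors = adj[walk[-1]]
                if not neighbors:
                    break
                walk.append(neighbors[rng.integers(len(neighbors))])
            walks.append(walk)
    return walks

def embed(adj, dim=2, window=2):
    """DeepWalk-style embedding: walk co-occurrences -> PPMI -> truncated SVD."""
    nodes = sorted(adj)
    index = {n: i for i, n in enumerate(nodes)}
    counts = np.zeros((len(nodes), len(nodes)))
    for walk in random_walks(adj):
        for i, u in enumerate(walk):
            for v in walk[max(0, i - window): i + window + 1]:
                if u != v:
                    counts[index[u], index[v]] += 1
    total = counts.sum()
    row = counts.sum(axis=1, keepdims=True)
    col = counts.sum(axis=0, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(counts * total / (row * col))
    pmi[~np.isfinite(pmi)] = 0.0          # zero out log(0) and 0/0 entries
    ppmi = np.maximum(pmi, 0.0)           # positive PMI
    U, s, _ = np.linalg.svd(ppmi)
    return nodes, U[:, :dim] * np.sqrt(s[:dim])

# Toy graph: two triangles joined by the edge (2, 3).
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
       3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
nodes, vectors = embed(adj)
print(dict(zip(nodes, np.round(vectors, 2))))
```

In practice the sampled walks more commonly feed a skip-gram model, as in DeepWalk or Node2Vec; generating and consuming the walks at scale is the step that resources such as GRAPE parallelize.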
Related papers
- FedGraph: A Research Library and Benchmark for Federated Graph Learning [40.257355007504074]
We introduce FedGraph, a research library built for practical distributed deployment and benchmarking in federated graph learning.
FedGraph supports a range of state-of-the-art graph learning methods and includes built-in profiling tools to evaluate system performance.
We demonstrate the first privacy-preserving federated learning system to run on graphs with 100 million nodes.
arXiv Detail & Related papers (2024-10-08T20:18:18Z)
- GraphStorm: all-in-one graph machine learning framework for industry applications [75.23076561638348]
GraphStorm is an end-to-end solution for scalable graph construction, graph model training and inference.
Every component in GraphStorm can operate on graphs with billions of nodes and can scale model training and inference to different hardware without changing any code.
GraphStorm has been used and deployed for over a dozen billion-scale industry applications since its release in May 2023.
arXiv Detail & Related papers (2024-06-10T04:56:16Z)
- MGNet: Learning Correspondences via Multiple Graphs [78.0117352211091]
Correspondence learning aims to find correct correspondences within an initial set that has an uneven distribution and a low inlier rate.
Recent advances usually use graph neural networks (GNNs) to build a single type of graph or stack local graphs into the global one to complete the task.
We propose MGNet to effectively combine multiple complementary graphs.
arXiv Detail & Related papers (2024-01-10T07:58:44Z)
- Distributed Graph Embedding with Information-Oriented Random Walks [16.290803469068145]
Graph embedding maps graph nodes to low-dimensional vectors, and is widely adopted in machine learning tasks.
We present a general-purpose, distributed, information-centric random walk-based graph embedding framework, DistGER, which can scale to embed billion-edge graphs.
DistGER exhibits 2.33x-129x acceleration, a 45% reduction in cross-machine communication, and a >10% effectiveness improvement in downstream tasks.
arXiv Detail & Related papers (2023-03-28T03:11:21Z)
- Node Feature Extraction by Self-Supervised Multi-scale Neighborhood Prediction [123.20238648121445]
We propose a new self-supervised learning framework, Graph Information Aided Node feature exTraction (GIANT).
GIANT makes use of the eXtreme Multi-label Classification (XMC) formalism, which is crucial for fine-tuning the language model based on graph information.
We demonstrate the superior performance of GIANT over the standard GNN pipeline on Open Graph Benchmark datasets.
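The XMC framing mentioned above can be made concrete with a small sketch: treat each node's neighbor set as its label set, so that learning node features amounts to predicting a very wide multi-label target. The toy graph, random stand-in features, and least-squares scorer below are purely illustrative assumptions; GIANT itself fine-tunes a language model on node text against such neighborhood labels, which is not reproduced here.

```python
# Hedged sketch of neighborhood prediction framed as multi-label classification.
# Illustrative only; not GIANT's pipeline (no text, no language model, no XMC solver).
import numpy as np

edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]   # toy undirected graph
num_nodes = 5

# XMC-style target matrix: the "labels" of node u are the indices of its neighbors.
Y = np.zeros((num_nodes, num_nodes), dtype=float)
for u, v in edges:
    Y[u, v] = Y[v, u] = 1.0

# Stand-in node features; GIANT instead derives them from each node's text.
rng = np.random.default_rng(0)
X = rng.normal(size=(num_nodes, 8))

# A linear least-squares scorer as a crude stand-in for the multi-label classifier:
# rows of X @ W approximate each node's neighborhood indicator vector.
W, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(np.round(X @ W, 2))
```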
arXiv Detail & Related papers (2021-10-29T19:55:12Z)
- A Robust and Generalized Framework for Adversarial Graph Embedding [73.37228022428663]
We propose a robust framework for adversarial graph embedding, named AGE.
AGE generates fake neighbor nodes as enhanced negative samples drawn from an implicit distribution.
Based on this framework, we propose three models to handle three types of graph data.
arXiv Detail & Related papers (2021-05-22T07:05:48Z)
- Learnable Graph Matching: Incorporating Graph Partitioning with Deep Feature Learning for Multiple Object Tracking [58.30147362745852]
Data association across frames is at the core of Multiple Object Tracking (MOT) task.
Existing methods mostly ignore the context information among tracklets and intra-frame detections.
We propose a novel learnable graph matching method to address these issues.
arXiv Detail & Related papers (2021-03-30T08:58:45Z)
- CogDL: A Comprehensive Library for Graph Deep Learning [55.694091294633054]
We present CogDL, a library for graph deep learning that allows researchers and practitioners to conduct experiments, compare methods, and build applications with ease and efficiency.
In CogDL, we propose a unified design for the training and evaluation of GNN models for various graph tasks, making it unique among existing graph learning libraries.
We develop efficient sparse operators for CogDL, making it the most competitive graph library in terms of efficiency.
arXiv Detail & Related papers (2021-03-01T12:35:16Z)
- Accelerating Graph Sampling for Graph Machine Learning using GPUs [2.9383911860380127]
NextDoor is a system designed to perform graph sampling on GPU resources.
NextDoor employs a new approach to graph sampling that we call transit-parallelism.
We implement several graph sampling applications, and show that NextDoor runs them orders of magnitude faster than existing systems.
arXiv Detail & Related papers (2020-09-14T19:03:33Z)
- Graph topology inference benchmarks for machine learning [16.857405938139525]
We introduce several benchmarks specifically designed to reveal the relative merits and limitations of graph inference methods.
We also contrast some of the most prominent techniques in the literature.
arXiv Detail & Related papers (2020-07-16T09:40:32Z)
- SIGN: Scalable Inception Graph Neural Networks [4.5158585619109495]
We propose a new, efficient and scalable graph deep learning architecture that sidesteps the need for graph sampling.
Our architecture allows using different local graph operators to best suit the task at hand.
We obtain state-of-the-art results on ogbn-papers100M, the largest public graph dataset, with over 110 million nodes and 1.5 billion edges.
arXiv Detail & Related papers (2020-04-23T14:46:10Z)
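SIGN sidesteps graph sampling by precomputing multi-hop diffused feature matrices once, after which training reduces to a plain feed-forward model over fixed inputs. The sketch below shows that precomputation with a symmetrically normalized adjacency matrix; the operator choice, hop count, and toy data are illustrative assumptions rather than the paper's exact configuration.

```python
# Hedged sketch of SIGN-style feature precomputation: [X, AX, A^2 X, ...]
# is computed once, so no neighborhood sampling is needed during training.
import numpy as np

def sign_features(adj, feats, hops=3):
    """Return X, AX, ..., A^hops X concatenated column-wise (A symmetrically normalized)."""
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.where(deg > 0, deg, 1.0) ** -0.5
    d_inv_sqrt[deg == 0] = 0.0
    a_norm = d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    blocks, x = [feats], feats
    for _ in range(hops):
        x = a_norm @ x
        blocks.append(x)
    return np.concatenate(blocks, axis=1)

# Toy example: 4 nodes on a path, 2-dimensional features.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
feats = np.arange(8, dtype=float).reshape(4, 2)
print(sign_features(adj, feats, hops=2).shape)  # (4, 6): X, AX, A^2 X
```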
This list is automatically generated from the titles and abstracts of the papers on this site.