GRAPE for Fast and Scalable Graph Processing and random walk-based
Embedding
- URL: http://arxiv.org/abs/2110.06196v3
- Date: Sun, 7 May 2023 17:43:19 GMT
- Title: GRAPE for Fast and Scalable Graph Processing and random walk-based
Embedding
- Authors: Luca Cappelletti, Tommaso Fontana, Elena Casiraghi, Vida Ravanmehr,
Tiffany J. Callahan, Carlos Cano, Marcin P. Joachimiak, Christopher J.
Mungall, Peter N. Robinson, Justin Reese and Giorgio Valentini
- Abstract summary: We present GRAPE, a software resource for graph processing and embedding.
It can scale with big graphs by using specialized and smart data structures, algorithms, and a fast parallel implementation of random walk-based methods.
- Score: 0.5035217505850539
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Graph Representation Learning (GRL) methods opened new avenues for addressing
complex, real-world problems represented by graphs. However, many graphs used
in these applications comprise millions of nodes and billions of edges and are
beyond the capabilities of current methods and software implementations. We
present GRAPE, a software resource for graph processing and embedding that can
scale with big graphs by using specialized and smart data structures,
algorithms, and a fast parallel implementation of random walk-based methods.
Compared with state-of-the-art software resources, GRAPE shows an improvement
of orders of magnitude in empirical space and time complexity, as well as
competitive edge- and node-label prediction performance. GRAPE comprises about
1.7 million well-documented lines of Python and Rust code and provides 69 node
embedding methods, 25 inference models, a collection of efficient graph
processing utilities and over 80,000 graphs from the literature and other
sources. Standardized interfaces allow seamless integration of third-party
libraries, while ready-to-use and modular pipelines permit an easy-to-use
evaluation of GRL methods, therefore also positioning GRAPE as a software
resource to perform a fair comparison between methods and libraries for graph
processing and embedding.
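For readers who want to see what a "random walk-based method" does concretely, below is a minimal, library-agnostic sketch of the idea: sample uniform random walks over an adjacency list, count windowed node co-occurrences, and factorize the resulting positive PMI matrix into node vectors (a DeepWalk-style pipeline). This is only an illustration of the technique, not GRAPE's implementation; the toy graph, walk parameters, and SVD-based factorization are assumptions chosen to keep the sketch self-contained in NumPy.

```python
# Library-agnostic sketch of random-walk-based node embedding:
# walks -> windowed co-occurrence counts -> positive PMI -> truncated SVD.
# Illustrative only; not GRAPE's implementation.
import numpy as np

def random_walks(adj, walk_length=10, walks_per_node=20, seed=0):
    """Sample uniform random walks over an adjacency-list graph."""
    rng = np.random.default_rng(seed)
    walks = []
    for start in adj:
        for _ in range(walks_per_node):
            walk = [start]
            for _ in range(walk_length - 1):
                neighbors = adj[walk[-1]]
                if not neighbors:
                    break
                walk.append(neighbors[rng.integers(len(neighbors))])
            walks.append(walk)
    return walks

def embed(adj, dim=2, window=2):
    """DeepWalk-style embedding: walk co-occurrences -> PPMI -> truncated SVD."""
    nodes = sorted(adj)
    index = {n: i for i, n in enumerate(nodes)}
    counts = np.zeros((len(nodes), len(nodes)))
    for walk in random_walks(adj):
        for i, u in enumerate(walk):
            for v in walk[max(0, i - window): i + window + 1]:
                if u != v:
                    counts[index[u], index[v]] += 1
    total = counts.sum()
    row = counts.sum(axis=1, keepdims=True)
    col = counts.sum(axis=0, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(counts * total / (row * col))
    pmi[~np.isfinite(pmi)] = 0.0          # zero out log(0) and 0/0 entries
    ppmi = np.maximum(pmi, 0.0)           # positive PMI
    U, s, _ = np.linalg.svd(ppmi)
    return nodes, U[:, :dim] * np.sqrt(s[:dim])

# Toy graph: two triangles joined by the edge (2, 3).
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
       3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
nodes, vectors = embed(adj)
print(dict(zip(nodes, np.round(vectors, 2))))
```

In practice the sampled walks more commonly feed a skip-gram model, as in DeepWalk or Node2Vec; generating and consuming the walks at scale is the step that resources such as GRAPE parallelize.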
Related papers
- FedGraph: A Research Library and Benchmark for Federated Graph Learning [40.257355007504074]
We introduce FedGraph, a research library built for practical distributed deployment and benchmarking in federated graph learning.
FedGraph supports a range of state-of-the-art graph learning methods and includes built-in profiling tools to evaluate system performance.
We demonstrate the first privacy-preserving federated learning system to run on graphs with 100 million nodes.
arXiv Detail & Related papers (2024-10-08T20:18:18Z)
- GraphStorm: all-in-one graph machine learning framework for industry applications [75.23076561638348]
GraphStorm is an end-to-end solution for scalable graph construction, graph model training and inference.
Every component in GraphStorm can operate on graphs with billions of nodes and can scale model training and inference to different hardware without changing any code.
GraphStorm has been used and deployed for over a dozen billion-scale industry applications since its release in May 2023.
arXiv Detail & Related papers (2024-06-10T04:56:16Z)
- MGNet: Learning Correspondences via Multiple Graphs [78.0117352211091]
Correspondence learning aims to find correct correspondences within an initial set that has an uneven distribution and a low inlier rate.
Recent advances usually use graph neural networks (GNNs) to build a single type of graph or stack local graphs into the global one to complete the task.
We propose MGNet to effectively combine multiple complementary graphs.
arXiv Detail & Related papers (2024-01-10T07:58:44Z)
- Distributed Graph Embedding with Information-Oriented Random Walks [16.290803469068145]
Graph embedding maps graph nodes to low-dimensional vectors, and is widely adopted in machine learning tasks.
We present a general-purpose, distributed, information-centric random walk-based graph embedding framework, DistGER, which can scale to embed billion-edge graphs.
DistGER exhibits 2.33x-129x acceleration, a 45% reduction in cross-machine communication, and a >10% effectiveness improvement in downstream tasks.
arXiv Detail & Related papers (2023-03-28T03:11:21Z)
- Node Feature Extraction by Self-Supervised Multi-scale Neighborhood Prediction [123.20238648121445]
We propose a new self-supervised learning framework, Graph Information Aided Node feature exTraction (GIANT).
GIANT makes use of the eXtreme Multi-label Classification (XMC) formalism, which is crucial for fine-tuning the language model based on graph information.
We demonstrate the superior performance of GIANT over the standard GNN pipeline on Open Graph Benchmark datasets.
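The XMC framing mentioned above can be made concrete with a small sketch: treat each node's neighbor set as its label set, so that learning node features amounts to predicting a very wide multi-label target. The toy graph, random stand-in features, and least-squares scorer below are purely illustrative assumptions; GIANT itself fine-tunes a language model on node text against such neighborhood labels, which is not reproduced here.

```python
# Hedged sketch of neighborhood prediction framed as multi-label classification.
# Illustrative only; not GIANT's pipeline (no text, no language model, no XMC solver).
import numpy as np

edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]   # toy undirected graph
num_nodes = 5

# XMC-style target matrix: the "labels" of node u are the indices of its neighbors.
Y = np.zeros((num_nodes, num_nodes), dtype=float)
for u, v in edges:
    Y[u, v] = Y[v, u] = 1.0

# Stand-in node features; GIANT instead derives them from each node's text.
rng = np.random.default_rng(0)
X = rng.normal(size=(num_nodes, 8))

# A linear least-squares scorer as a crude stand-in for the multi-label classifier:
# rows of X @ W approximate each node's neighborhood indicator vector.
W, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(np.round(X @ W, 2))
```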
arXiv Detail & Related papers (2021-10-29T19:55:12Z)
- A Robust and Generalized Framework for Adversarial Graph Embedding [73.37228022428663]
We propose a robust framework for adversarial graph embedding, named AGE.
AGE generates fake neighbor nodes as enhanced negative samples drawn from an implicit distribution.
Based on this framework, we propose three models to handle three types of graph data.
arXiv Detail & Related papers (2021-05-22T07:05:48Z)
- Learnable Graph Matching: Incorporating Graph Partitioning with Deep Feature Learning for Multiple Object Tracking [58.30147362745852]
Data association across frames is at the core of Multiple Object Tracking (MOT) task.
Existing methods mostly ignore the context information among tracklets and intra-frame detections.
We propose a novel learnable graph matching method to address these issues.
arXiv Detail & Related papers (2021-03-30T08:58:45Z)
- CogDL: A Comprehensive Library for Graph Deep Learning [55.694091294633054]
We present CogDL, a library for graph deep learning that allows researchers and practitioners to conduct experiments, compare methods, and build applications with ease and efficiency.
In CogDL, we propose a unified design for the training and evaluation of GNN models for various graph tasks, making it unique among existing graph learning libraries.
We develop efficient sparse operators for CogDL, making it the most competitive graph library in terms of efficiency.
arXiv Detail & Related papers (2021-03-01T12:35:16Z)
- Accelerating Graph Sampling for Graph Machine Learning using GPUs [2.9383911860380127]
NextDoor is a system designed to perform graph sampling on GPU resources.
NextDoor employs a new approach to graph sampling that we call transit-parallelism.
We implement several graph sampling applications, and show that NextDoor runs them orders of magnitude faster than existing systems.
arXiv Detail & Related papers (2020-09-14T19:03:33Z)
- Graph topology inference benchmarks for machine learning [16.857405938139525]
We introduce several benchmarks specifically designed to reveal the relative merits and limitations of graph inference methods.
We also contrast some of the most prominent techniques in the literature.
arXiv Detail & Related papers (2020-07-16T09:40:32Z)
- SIGN: Scalable Inception Graph Neural Networks [4.5158585619109495]
We propose a new, efficient and scalable graph deep learning architecture that sidesteps the need for graph sampling.
Our architecture allows using different local graph operators to best suit the task at hand.
We obtain state-of-the-art results on ogbn-papers100M, the largest public graph dataset, with over 110 million nodes and 1.5 billion edges.
arXiv Detail & Related papers (2020-04-23T14:46:10Z)
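SIGN sidesteps graph sampling by precomputing multi-hop diffused feature matrices once, after which training reduces to a plain feed-forward model over fixed inputs. The sketch below shows that precomputation with a symmetrically normalized adjacency matrix; the operator choice, hop count, and toy data are illustrative assumptions rather than the paper's exact configuration.

```python
# Hedged sketch of SIGN-style feature precomputation: [X, AX, A^2 X, ...]
# is computed once, so no neighborhood sampling is needed during training.
import numpy as np

def sign_features(adj, feats, hops=3):
    """Return X, AX, ..., A^hops X concatenated column-wise (A symmetrically normalized)."""
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.where(deg > 0, deg, 1.0) ** -0.5
    d_inv_sqrt[deg == 0] = 0.0
    a_norm = d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    blocks, x = [feats], feats
    for _ in range(hops):
        x = a_norm @ x
        blocks.append(x)
    return np.concatenate(blocks, axis=1)

# Toy example: 4 nodes on a path, 2-dimensional features.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
feats = np.arange(8, dtype=float).reshape(4, 2)
print(sign_features(adj, feats, hops=2).shape)  # (4, 6): X, AX, A^2 X
```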
This list is automatically generated from the titles and abstracts of the papers on this site.