Related papers: GraphMineSuite: Enabling High-Performance and Programmable Graph Mining Algorithms with Set Algebra

URL: http://arxiv.org/abs/2103.03653v1
Date: Fri, 5 Mar 2021 13:26:18 GMT
Title: GraphMineSuite: Enabling High-Performance and Programmable Graph Mining Algorithms with Set Algebra
Authors: Maciej Besta, Zur Vonarburg-Shmaria, Yannick Schaffner, Leonardo Schwarz, Grzegorz Kwasniewski, Lukas Gianinazzi, Jakub Beranek, Kacper Janda, Tobias Holenstein, Sebastian Leisinger, Peter Tatkowski, Esref Ozdemir, Adrian Balla, Marcin Copik, Philipp Lindenberger, Pavel Kalvoda, Marek Konieczny, Onur Mutlu, Torsten Hoefler
Abstract summary: GraphMineSuite (GMS) is a benchmarking suite for graph mining algorithms. GMS comes with a benchmark specification based on an extensive literature review, prescribing representative problems, algorithms, and datasets.
Score: 9.814439564341761
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We propose GraphMineSuite (GMS): the first benchmarking suite for graph mining that facilitates evaluating and constructing high-performance graph mining algorithms. First, GMS comes with a benchmark specification based on extensive literature review, prescribing representative problems, algorithms, and datasets. Second, GMS offers a carefully designed software platform for seamless testing of different fine-grained elements of graph mining algorithms, such as graph representations or algorithm subroutines. The platform includes parallel implementations of more than 40 considered baselines, and it facilitates developing complex and fast mining algorithms. High modularity is possible by harnessing set algebra operations such as set intersection and difference, which enables breaking complex graph mining algorithms into simple building blocks that can be separately experimented with. GMS is supported with a broad concurrency analysis for portability in performance insights, and a novel performance metric to assess the throughput of graph mining algorithms, enabling more insightful evaluation. As use cases, we harness GMS to rapidly redesign and accelerate state-of-the-art baselines of core graph mining problems: degeneracy reordering (by up to >2x), maximal clique listing (by up to >9x), k-clique listing (by 1.1x), and subgraph isomorphism (by up to 2.5x), also obtaining better theoretical performance bounds.
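To illustrate the abstract's point that set intersection and difference can serve as building blocks of mining algorithms, here is a minimal sketch of maximal clique listing (Bron-Kerbosch with pivoting) written only in terms of set operations. It is not taken from the GMS platform and does not reflect its API; the adjacency-set representation and the names `Graph` and `maximal_cliques` are assumptions made for illustration.

```python
from typing import Dict, Iterator, Set

# Hypothetical representation: each vertex maps to the set of its neighbours.
Graph = Dict[int, Set[int]]


def maximal_cliques(graph: Graph) -> Iterator[Set[int]]:
    """Yield every maximal clique (Bron-Kerbosch with pivoting)."""

    def expand(r: Set[int], p: Set[int], x: Set[int]) -> Iterator[Set[int]]:
        # r: current clique, p: candidates to extend it, x: already explored.
        if not p and not x:
            yield set(r)
            return
        # Pivot: the vertex of P union X with the most neighbours in P.
        pivot = max(p | x, key=lambda v: len(p & graph[v]))
        # Set difference: only non-neighbours of the pivot need branching.
        for v in list(p - graph[pivot]):
            # Set intersection restricts P and X to the neighbours of v.
            yield from expand(r | {v}, p & graph[v], x & graph[v])
            p = p - {v}
            x = x | {v}

    yield from expand(set(), set(graph), set())


if __name__ == "__main__":
    # A triangle 0-1-2 with a pendant edge 2-3.
    g = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
    for clique in maximal_cliques(g):
        print(sorted(clique))
```

Because every step is an intersection, difference, or union, the set type (here Python's built-in `set`) could in principle be swapped for another representation, such as a sorted array or bitvector, without touching the algorithm's logic.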

Related papers

NodeRAG: Structuring Graph-based RAG with Heterogeneous Nodes [25.173078967881803]
Retrieval-augmented generation (RAG) empowers large language models to access external and private corpora. Current graph-based RAG approaches seldom prioritize the design of graph structures. Inadequately designed graphs not only impede the seamless integration of diverse graph algorithms but also result in workflow inconsistencies. We propose NodeRAG, a graph-centric framework introducing heterogeneous graph structures.
arXiv Detail & Related papers (2025-04-15T18:24:00Z)
RGL: A Graph-Centric, Modular Framework for Efficient Retrieval-Augmented Generation on Graphs [58.10503898336799]
We introduce the RAG-on-Graphs Library (RGL), a modular framework that seamlessly integrates the complete RAG pipeline. RGL addresses key challenges by supporting a variety of graph formats and integrating optimized implementations for essential components. Our evaluations demonstrate that RGL not only accelerates the prototyping process but also enhances the performance and applicability of graph-based RAG systems.
arXiv Detail & Related papers (2025-03-25T03:21:48Z)
SimTeG: A Frustratingly Simple Approach Improves Textual Graph Learning [131.04781590452308]
We present SimTeG, a frustratingly Simple approach for Textual Graph learning. We first perform supervised parameter-efficient fine-tuning (PEFT) on a pre-trained LM on the downstream task. We then generate node embeddings using the last hidden states of the fine-tuned LM.
arXiv Detail & Related papers (2023-08-03T07:00:04Z)
NAS-Bench-Graph: Benchmarking Graph Neural Architecture Search [55.75621026447599]
We propose NAS-Bench-Graph, a tailored benchmark that supports unified, reproducible, and efficient evaluations for GraphNAS. Specifically, we construct a unified, expressive yet compact search space, covering 26,206 unique graph neural network (GNN) architectures. Based on our proposed benchmark, the performance of GNN architectures can be directly obtained by a look-up table without any further computation.
arXiv Detail & Related papers (2022-06-18T10:17:15Z)
End-to-end Mapping in Heterogeneous Systems Using Graph Representation Learning [13.810753108848582]
We propose a unified, end-to-end, programmable graph representation learning framework. It is capable of mining the complexity of high-level programs down to the universal intermediate representation, extracting the specific computational patterns and predicting which code segments would run best on a specific core. In the evaluation, we demonstrate a maximum speedup of 6.42x compared to the thread-based execution, and 2.02x compared to the state-of-the-art technique.
arXiv Detail & Related papers (2022-04-25T22:13:13Z)
Node Feature Extraction by Self-Supervised Multi-scale Neighborhood Prediction [123.20238648121445]
We propose a new self-supervised learning framework, Graph Information Aided Node feature exTraction (GIANT). GIANT makes use of the eXtreme Multi-label Classification (XMC) formalism, which is crucial for fine-tuning the language model based on graph information. We demonstrate the superior performance of GIANT over the standard GNN pipeline on Open Graph Benchmark datasets.
arXiv Detail & Related papers (2021-10-29T19:55:12Z)
Boosting Graph Embedding on a Single GPU [3.093890460224435]
We present GOSH, a GPU-based tool for embedding large-scale graphs with minimum hardware constraints. It employs a novel graph coarsening algorithm to enhance the impact of updates and minimize the work for embedding. It also incorporates a decomposition schema that enables any arbitrarily large graph to be embedded with a single GPU.
arXiv Detail & Related papers (2021-10-19T15:25:04Z)
GRAPE for Fast and Scalable Graph Processing and random walk-based Embedding [0.5035217505850539]
We present GRAPE, a software resource for graph processing and embedding. It can scale with big graphs by using specialized and smart data structures, algorithms, and a fast parallel implementation of random walk-based methods.
arXiv Detail & Related papers (2021-10-12T17:49:46Z)
Understanding Coarsening for Embedding Large-Scale Graphs [3.6739949215165164]
Proper analysis of graphs with Machine Learning (ML) algorithms has the potential to yield far-reaching insights into many areas of research and industry. The irregular structure of graph data constitutes an obstacle for running ML tasks on graphs. We analyze the impact of the coarsening quality on the embedding performance both in terms of speed and accuracy.
arXiv Detail & Related papers (2020-09-10T15:06:33Z)
Inverse Graph Identification: Can We Identify Node Labels Given Graph Labels? [89.13567439679709]
Graph Identification (GI) has long been researched in graph learning and is essential in certain applications. This paper defines a novel problem dubbed Inverse Graph Identification (IGI). We propose a simple yet effective method that performs node-level message passing using a Graph Attention Network (GAT) under the protocol of GI.
arXiv Detail & Related papers (2020-07-12T12:06:17Z)
MC2G: An Efficient Algorithm for Matrix Completion with Social and Item Similarity Graphs [85.89744949820376]
MC2G is an algorithm that performs matrix completion in the presence of social and item similarity graphs. It is based on spectral clustering and local refinement steps. We show via extensive experiments on both synthetic and real datasets that MC2G outperforms other state-of-the-art matrix completion algorithms.
arXiv Detail & Related papers (2020-06-08T06:11:37Z)
MPLP++: Fast, Parallel Dual Block-Coordinate Ascent for Dense Graphical Models [96.1052289276254]
This work introduces a new MAP-solver, based on the popular Dual Block-Coordinate Ascent principle. Surprisingly, by making a small change to the low-performing solver, we derive the new solver MPLP++ that significantly outperforms all existing solvers by a large margin.
arXiv Detail & Related papers (2020-04-16T16:20:53Z)
GLSearch: Maximum Common Subgraph Detection via Learning to Search [33.9052190473029]
We propose GLSearch, a Graph Neural Network (GNN) based learning-to-search model. Our model is built upon the branch and bound algorithm, which selects one pair of nodes from the two input graphs to expand at a time. GLSearch can potentially be extended to solve many other problems with constraints on graphs.
arXiv Detail & Related papers (2020-02-08T10:03:40Z)

This list is automatically generated from the titles and abstracts of the papers on this site.