GraphMineSuite: Enabling High-Performance and Programmable Graph Mining
Algorithms with Set Algebra
- URL: http://arxiv.org/abs/2103.03653v1
- Date: Fri, 5 Mar 2021 13:26:18 GMT
- Title: GraphMineSuite: Enabling High-Performance and Programmable Graph Mining
Algorithms with Set Algebra
- Authors: Maciej Besta, Zur Vonarburg-Shmaria, Yannick Schaffner, Leonardo
Schwarz, Grzegorz Kwasniewski, Lukas Gianinazzi, Jakub Beranek, Kacper Janda,
Tobias Holenstein, Sebastian Leisinger, Peter Tatkowski, Esref Ozdemir,
Adrian Balla, Marcin Copik, Philipp Lindenberger, Pavel Kalvoda, Marek
Konieczny, Onur Mutlu, Torsten Hoefler
- Abstract summary: GraphMineSuite (GMS) is a benchmarking suite for graph mining algorithms.
GMS comes with a benchmark specification based on extensive review, literature prescribing representative problems, algorithms, and datasets.
- Score: 9.814439564341761
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose GraphMineSuite (GMS): the first benchmarking suite for graph
mining that facilitates evaluating and constructing high-performance graph
mining algorithms. First, GMS comes with a benchmark specification based on
extensive literature review, prescribing representative problems, algorithms,
and datasets. Second, GMS offers a carefully designed software platform for
seamless testing of different fine-grained elements of graph mining algorithms,
such as graph representations or algorithm subroutines. The platform includes
parallel implementations of more than 40 considered baselines, and it
facilitates developing complex and fast mining algorithms. High modularity is
possible by harnessing set algebra operations such as set intersection and
difference, which enables breaking complex graph mining algorithms into simple
building blocks that can be separately experimented with. GMS is supported with
a broad concurrency analysis for portability in performance insights, and a
novel performance metric to assess the throughput of graph mining algorithms,
enabling more insightful evaluation. As use cases, we harness GMS to rapidly
redesign and accelerate state-of-the-art baselines of core graph mining
problems: degeneracy reordering (by up to >2x), maximal clique listing (by up
to >9x), k-clique listing (by 1.1x), and subgraph isomorphism (by up to 2.5x),
also obtaining better theoretical performance bounds.
Related papers
- SimTeG: A Frustratingly Simple Approach Improves Textual Graph Learning [131.04781590452308]
We present SimTeG, a frustratingly Simple approach for Textual Graph learning.
We first perform supervised parameter-efficient fine-tuning (PEFT) on a pre-trained LM on the downstream task.
We then generate node embeddings using the last hidden states of finetuned LM.
arXiv Detail & Related papers (2023-08-03T07:00:04Z) - NAS-Bench-Graph: Benchmarking Graph Neural Architecture Search [55.75621026447599]
We propose NAS-Bench-Graph, a tailored benchmark that supports unified, reproducible, and efficient evaluations for GraphNAS.
Specifically, we construct a unified, expressive yet compact search space, covering 26,206 unique graph neural network (GNN) architectures.
Based on our proposed benchmark, the performance of GNN architectures can be directly obtained by a look-up table without any further computation.
arXiv Detail & Related papers (2022-06-18T10:17:15Z) - End-to-end Mapping in Heterogeneous Systems Using Graph Representation
Learning [13.810753108848582]
We propose a unified, end-to-end, programmable graph representation learning framework.
It is capable of mining the complexity of high-level programs down to the universal intermediate representation, extracting the specific computational patterns and predicting which code segments would run best on a specific core.
In the evaluation, we demonstrate a maximum speedup of 6.42x compared to the thread-based execution, and 2.02x compared to the state-of-the-art technique.
arXiv Detail & Related papers (2022-04-25T22:13:13Z) - Node Feature Extraction by Self-Supervised Multi-scale Neighborhood
Prediction [123.20238648121445]
We propose a new self-supervised learning framework, Graph Information Aided Node feature exTraction (GIANT)
GIANT makes use of the eXtreme Multi-label Classification (XMC) formalism, which is crucial for fine-tuning the language model based on graph information.
We demonstrate the superior performance of GIANT over the standard GNN pipeline on Open Graph Benchmark datasets.
arXiv Detail & Related papers (2021-10-29T19:55:12Z) - Boosting Graph Embedding on a Single GPU [3.093890460224435]
We present GOSH, a GPU-based tool for embedding large-scale graphs with minimum hardware constraints.
It employs a novel graph coarsening algorithm to enhance the impact of updates and minimize the work for embedding.
It also incorporates a decomposition schema that enables any arbitrarily large graph to be embedded with a single GPU.
arXiv Detail & Related papers (2021-10-19T15:25:04Z) - GRAPE for Fast and Scalable Graph Processing and random walk-based
Embedding [0.5035217505850539]
We present GRAPE, a software resource for graph processing and embedding.
It can scale with big graphs by using specialized and smart data structures, algorithms, and a fast parallel implementation of random walk-based methods.
arXiv Detail & Related papers (2021-10-12T17:49:46Z) - Understanding Coarsening for Embedding Large-Scale Graphs [3.6739949215165164]
Proper analysis of graphs with Machine Learning (ML) algorithms has the potential to yield far-reaching insights into many areas of research and industry.
The irregular structure of graph data constitutes an obstacle for running ML tasks on graphs.
We analyze the impact of the coarsening quality on the embedding performance both in terms of speed and accuracy.
arXiv Detail & Related papers (2020-09-10T15:06:33Z) - Inverse Graph Identification: Can We Identify Node Labels Given Graph
Labels? [89.13567439679709]
Graph Identification (GI) has long been researched in graph learning and is essential in certain applications.
This paper defines a novel problem dubbed Inverse Graph Identification (IGI)
We propose a simple yet effective method that makes the node-level message passing process using Graph Attention Network (GAT) under the protocol of GI.
arXiv Detail & Related papers (2020-07-12T12:06:17Z) - MC2G: An Efficient Algorithm for Matrix Completion with Social and Item
Similarity Graphs [85.89744949820376]
MC2G is an algorithm that performs matrix completion in the presence of social and item similarity graphs.
It is based on spectral clustering and local refinement steps.
We show via extensive experiments on both synthetic and real datasets that MC2G outperforms other state-of-the-art matrix completion algorithms.
arXiv Detail & Related papers (2020-06-08T06:11:37Z) - MPLP++: Fast, Parallel Dual Block-Coordinate Ascent for Dense Graphical
Models [96.1052289276254]
This work introduces a new MAP-solver, based on the popular Dual Block-Coordinate Ascent principle.
Surprisingly, by making a small change to the low-performing solver, we derive the new solver MPLP++ that significantly outperforms all existing solvers by a large margin.
arXiv Detail & Related papers (2020-04-16T16:20:53Z) - GLSearch: Maximum Common Subgraph Detection via Learning to Search [33.9052190473029]
We propose GLSearch, a Graph Neural Network (GNN) based learning to search model.
Our model is built upon the branch and bound bound, which selects one pair of nodes from the two input graphs to expand at a time.
Our GLSearch can be potentially extended to solve many other problems with constraints on graphs.
arXiv Detail & Related papers (2020-02-08T10:03:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.