Advising OpenMP Parallelization via a Graph-Based Approach with
Transformers
- URL: http://arxiv.org/abs/2305.11999v1
- Date: Tue, 16 May 2023 16:56:10 GMT
- Title: Advising OpenMP Parallelization via a Graph-Based Approach with
Transformers
- Authors: Tal Kadosh, Nadav Schneider, Niranjan Hasabnis, Timothy Mattson, Yuval
Pinter, and Gal Oren
- Abstract summary: We propose a novel approach, called OMPify, to detect and predict the OpenMP pragmas and shared-memory attributes in parallel code.
OMPify is built on a Transformer model that leverages a graph-based representation of source code.
Our results demonstrate that OMPify outperforms existing approaches, including the general-purpose and popular ChatGPT and the targeted PragFormer model.
- Score: 2.393682571484038
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: There is an ever-present need for shared memory parallelization schemes to
exploit the full potential of multi-core architectures. The most common
parallelization API addressing this need today is OpenMP. Nevertheless, writing
parallel code manually is complex and effort-intensive. Thus, many
deterministic source-to-source (S2S) compilers have emerged, intending to
automate the process of translating serial to parallel code. However, recent
studies have shown that these compilers are impractical in many scenarios. In
this work, we combine the latest advancements in the field of AI and natural
language processing (NLP) with the vast amount of open-source code to address
the problem of automatic parallelization. Specifically, we propose a novel
approach, called OMPify, to detect and predict the OpenMP pragmas and
shared-memory attributes in parallel code, given its serial version. OMPify is
built on a Transformer model that leverages a graph-based representation of
source code, exploiting the inherent structure of code. We evaluated our tool
by predicting the parallelization pragmas and attributes of a large corpus
(Open-OMP-Plus) of over 54,000 serial code snippets written in C and C++. Our
results demonstrate that OMPify outperforms existing approaches, namely the
general-purpose and popular ChatGPT and the targeted PragFormer model, in terms
of F1 score and accuracy. Specifically, OMPify achieves up to 90% accuracy on
commonly used OpenMP benchmarks such as NAS, SPEC, and PolyBench. Additionally,
we performed an ablation study to assess the impact of
different model components and present interesting insights derived from the
study. Lastly, we also explored the potential of using data augmentation and
curriculum learning techniques to improve the model's robustness and
generalization capabilities.
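To make the prediction task concrete, here is a minimal, hypothetical sketch of the input and output OMPify targets; the dot-product loop is illustrative and not taken from the paper. Given a serial loop, the model advises whether an OpenMP pragma applies and which shared-memory attributes (for example, a reduction clause) it should carry.

    /* Hypothetical serial input: a dot product (not from the paper). */
    float dot(const float *a, const float *b, int n) {
        float sum = 0.0f;
        for (int i = 0; i < n; i++)
            sum += a[i] * b[i];
        return sum;
    }

    /* The kind of advice OMPify produces: an OpenMP pragma whose clauses
       encode the shared-memory attributes. Here 'sum' needs a reduction,
       and the loop index 'i' is implicitly private. */
    float dot_omp(const float *a, const float *b, int n) {
        float sum = 0.0f;
        #pragma omp parallel for reduction(+:sum)
        for (int i = 0; i < n; i++)
            sum += a[i] * b[i];
        return sum;
    }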
Related papers
- OMPar: Automatic Parallelization with AI-Driven Source-to-Source Compilation [4.266086505323998]
This paper introduces OMPar, an AI-driven tool designed to automate the parallelization of C/C++ code using OpenMP pragmas.
OMPar integrates Large Language Models (LLMs) through two key components: OMPify, which assesses loop parallelization potential, and MonoCoder-OMP, a new fine-tuned model which generates precise OpenMP pragmas.
arXiv Detail & Related papers (2024-09-23T07:39:01Z)
- CodeIP: A Grammar-Guided Multi-Bit Watermark for Large Language Models of Code [56.019447113206006]
Large Language Models (LLMs) have achieved remarkable progress in code generation.
CodeIP is a novel multi-bit watermarking technique that embeds additional information to preserve provenance details.
Experiments conducted on a real-world dataset across five programming languages demonstrate the effectiveness of CodeIP.
arXiv Detail & Related papers (2024-04-24T04:25:04Z)
- Parallel Decoding via Hidden Transfer for Lossless Large Language Model Acceleration [54.897493351694195]
We propose a novel parallel decoding approach, namely hidden transfer, which decodes multiple successive tokens simultaneously in a single forward pass.
In terms of acceleration metrics, we outperform all the single-model acceleration techniques, including Medusa and Self-Speculative decoding.
arXiv Detail & Related papers (2024-04-18T09:17:06Z)
- MPIrigen: MPI Code Generation through Domain-Specific Language Models [3.5352856644774806]
This study first investigates the performance of state-of-the-art language models in generating MPI-based parallel programs.
We introduce a dedicated downstream task of MPI-based program generation by fine-tuning MonoCoder on HPCorpusMPI.
The success of this tailored solution underscores the importance of domain-specific fine-tuning in optimizing language models for parallel computing code generation.
arXiv Detail & Related papers (2024-02-14T12:24:21Z)
- MPI-rical: Data-Driven MPI Distributed Parallelism Assistance with Transformers [3.2164100882807913]
Message Passing Interface (MPI) plays a crucial role in distributed memory parallelization across multiple nodes.
We develop MPI-RICAL, a data-driven programming-assistance tool that helps programmers write domain-decomposition-based distributed memory parallelization code.
We also introduce MPICodeCorpus, the first publicly available corpus of MPI-based parallel programs that is created by mining more than 15,000 open-source repositories on GitHub.
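As a rough sketch of the kind of code such an assistant targets, consider the following hypothetical C program: a 1D domain decomposition in which each rank computes a partial result that is then combined. The problem size and per-rank workload are illustrative and not drawn from MPICodeCorpus.

    #include <mpi.h>

    int main(int argc, char **argv) {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* 1D domain decomposition: each rank owns one chunk of [0, N). */
        const int N = 1 << 20;          /* illustrative problem size */
        int chunk = N / size;           /* assume size divides N, for brevity */
        int lo = rank * chunk, hi = lo + chunk;

        double local = 0.0, global = 0.0;
        for (int i = lo; i < hi; i++)
            local += (double)i;         /* stand-in for real per-rank work */

        /* Combine per-rank partial sums on every rank. */
        MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

        MPI_Finalize();
        return 0;
    }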
arXiv Detail & Related papers (2023-05-16T13:50:24Z)
- Learning to Parallelize with OpenMP by Augmented Heterogeneous AST Representation [7.750212995537728]
We propose a novel graph-based learning approach called Graph2Par that utilizes a heterogeneous augmented abstract syntax tree (Augmented-AST) representation for code.
We create an OMP_Serial dataset with 18,598 parallelizable and 13,972 non-parallelizable loops to train the machine learning models.
Our results show that our proposed approach detects parallelizable code regions with 85% accuracy and outperforms the state-of-the-art token-based machine learning approach.
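To illustrate the binary distinction such a dataset encodes, consider a hypothetical pair of C loops (not taken from OMP_Serial): the first has fully independent iterations, while the second carries a dependence from one iteration to the next.

    /* Parallelizable: each iteration writes a distinct element of c. */
    void add(const float *a, const float *b, float *c, int n) {
        for (int i = 0; i < n; i++)
            c[i] = a[i] + b[i];
    }

    /* Not parallelizable as written: iteration i reads the value written
       by iteration i - 1, a loop-carried dependence. */
    void prefix(float *a, const float *b, int n) {
        for (int i = 1; i < n; i++)
            a[i] = a[i - 1] + b[i];
    }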
arXiv Detail & Related papers (2023-05-09T21:57:15Z)
- QParallel: Explicit Parallelism for Programming Quantum Computers [62.10004571940546]
We present a language extension for parallel quantum programming.
QParallel removes ambiguities concerning parallelism in current quantum programming languages.
We introduce a tool that guides programmers in the placement of parallel regions by identifying the subroutines that profit most from parallelization.
arXiv Detail & Related papers (2022-10-07T16:35:16Z)
- Learning to Parallelize in a Shared-Memory Environment with Transformers [3.340971990034025]
OpenMP is the most comprehensive API that implements shared memory parallelization schemes.
Many source-to-source (S2S) compilers have been created over the years, tasked with inserting OpenMP directives into code automatically.
In this work, we propose leveraging recent advances in ML techniques, specifically in natural language processing (NLP), to replace S2S compilers altogether.
arXiv Detail & Related papers (2022-04-27T10:39:52Z)
- PolyDL: Polyhedral Optimizations for Creation of High Performance DL primitives [55.79741270235602]
We present compiler algorithms to automatically generate high performance implementations of Deep Learning primitives.
We develop novel data reuse analysis algorithms using the polyhedral model.
We also show that such a hybrid compiler plus a minimal library-use approach results in state-of-the-art performance.
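As a generic illustration of the style of transformation the polyhedral model can derive (a textbook example, not PolyDL's actual output), consider tiling a matrix-multiply loop nest to improve data reuse in cache; the tile size T is illustrative.

    /* Naive matrix multiply over N x N arrays: poor cache reuse for large N. */
    void matmul(int N, double *A, double *B, double *C) {
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                for (int k = 0; k < N; k++)
                    C[i * N + j] += A[i * N + k] * B[k * N + j];
    }

    /* Tiled version: the loop nest is blocked so that T x T tiles stay in
       cache across iterations (T is an illustrative tile size). */
    #define T 64
    void matmul_tiled(int N, double *A, double *B, double *C) {
        for (int ii = 0; ii < N; ii += T)
            for (int jj = 0; jj < N; jj += T)
                for (int kk = 0; kk < N; kk += T)
                    for (int i = ii; i < ii + T && i < N; i++)
                        for (int j = jj; j < jj + T && j < N; j++)
                            for (int k = kk; k < kk + T && k < N; k++)
                                C[i * N + j] += A[i * N + k] * B[k * N + j];
    }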
arXiv Detail & Related papers (2020-06-02T06:44:09Z)
- A Transformer-based Approach for Source Code Summarization [86.08359401867577]
We learn code representation for summarization by modeling the pairwise relationship between code tokens.
We show that despite its simplicity, the approach outperforms the state-of-the-art techniques by a significant margin.
arXiv Detail & Related papers (2020-05-01T23:29:36Z)
- MPLP++: Fast, Parallel Dual Block-Coordinate Ascent for Dense Graphical Models [96.1052289276254]
This work introduces a new MAP-solver, based on the popular Dual Block-Coordinate Ascent principle.
Surprisingly, by making a small change to the low-performing solver, we derive the new solver MPLP++, which outperforms all existing solvers by a large margin.
arXiv Detail & Related papers (2020-04-16T16:20:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.