AUTOPARLLM: GNN-Guided Automatic Code Parallelization using Large
Language Models
- URL: http://arxiv.org/abs/2310.04047v2
- Date: Mon, 9 Oct 2023 02:35:47 GMT
- Authors: Quazi Ishtiaque Mahmud, Ali TehraniJamsaz, Hung D. Phan, Nesreen K.
Ahmed, and Ali Jannesari
- Abstract summary: AUTOPARLLM is a framework for automatically discovering parallelism and generating parallel versions of sequentially written programs.
We evaluate it on 11 applications of 2 well-known benchmark suites: NAS Parallel Benchmark and Rodinia Benchmark.
Our results show that AUTOPARLLM is indeed effective in improving the state-of-the-art LLM-based models for the task of parallel code generation.
- Score: 13.514916184776107
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Parallelizing sequentially written programs is a challenging task. Even
experienced developers need to spend considerable time finding parallelism
opportunities and then actually writing parallel versions of sequentially
written programs. To address this issue, we present AUTOPARLLM, a framework for
automatically discovering parallelism and generating the parallel version of
the sequentially written program. Our framework consists of two major
components: i) a heterogeneous Graph Neural Network (GNN) based parallelism
discovery and parallel pattern detection module, and ii) an LLM-based code
generator to generate the parallel counterpart of the sequential programs. We
use the GNN to learn the flow-aware characteristics of the programs to identify
parallel regions in sequential programs and then construct an enhanced prompt
using the GNN's results for the LLM-based generator to finally produce the
parallel counterparts of the sequential programs. We evaluate AUTOPARLLM on 11
applications of 2 well-known benchmark suites: NAS Parallel Benchmark and
Rodinia Benchmark. Our results show that AUTOPARLLM is indeed effective in
improving the state-of-the-art LLM-based models for the task of parallel code
generation in terms of multiple code generation metrics. AUTOPARLLM also
improves the average runtime of the parallel code generated by the
state-of-the-art LLMs by up to 3.4% and 2.9% for the NAS Parallel
Benchmark and Rodinia Benchmark, respectively. Additionally, to overcome the
issue that well-known metrics for translation evaluation have not been
optimized to evaluate the quality of the generated parallel code, we propose
OMPScore for evaluating the quality of the generated code. We show that
OMPScore exhibits a better correlation with human judgment than existing
metrics, with up to a 75% improvement in Spearman correlation.
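To make the two-stage design concrete, here is a minimal Python sketch of how a GNN verdict could be folded into an enhanced prompt. The gnn.predict and llm.complete interfaces, the prompt wording, and the pattern names are illustrative assumptions, not the paper's actual architecture or template.

```python
# Hypothetical sketch of the AUTOPARLLM two-stage flow. The GNN and LLM
# interfaces below are stand-ins; the paper's real model, prompt template,
# and pattern taxonomy are not reproduced here.

def build_enhanced_prompt(loop_src: str, parallelizable: bool, pattern: str) -> str:
    """Fold the GNN's parallelism verdict and detected pattern into the prompt."""
    if not parallelizable:
        return f"Keep this loop sequential; it is not safely parallelizable:\n{loop_src}"
    return (
        f"The following loop is parallelizable and matches a '{pattern}' "
        f"pattern. Generate an equivalent OpenMP version:\n{loop_src}"
    )

def autoparllm(loop_src: str, gnn, llm) -> str:
    # Stage 1: GNN-based parallelism discovery and pattern detection over a
    # flow-aware graph representation of the program.
    parallelizable, pattern = gnn.predict(loop_src)  # assumed interface
    # Stage 2: LLM code generation guided by the enhanced prompt.
    return llm.complete(build_enhanced_prompt(loop_src, parallelizable, pattern))
```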
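The abstract does not define OMPScore itself, but the evaluation it reports, rank correlation between metric scores and human judgments, can be illustrated with SciPy's spearmanr; the score values below are made up.

```python
from scipy.stats import spearmanr

# Made-up metric scores and human ratings for five generated OpenMP snippets.
metric_scores = [0.91, 0.42, 0.77, 0.30, 0.65]
human_ratings = [5, 2, 4, 1, 3]

rho, p = spearmanr(metric_scores, human_ratings)
print(f"Spearman rho = {rho:.2f}")  # 1.00 here: the metric ranks snippets exactly as humans do
```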
Related papers
- COrAL: Order-Agnostic Language Modeling for Efficient Iterative Refinement [80.18490952057125] (2024-10-12)
Iterative refinement has emerged as an effective paradigm for enhancing the capabilities of large language models (LLMs) on complex tasks.
We propose Context-Wise Order-Agnostic Language Modeling (COrAL) to make such refinement more efficient.
Our approach models multiple token dependencies within manageable context windows, enabling the model to perform iterative refinement internally.
- ParallelSpec: Parallel Drafter for Efficient Speculative Decoding [62.68430939686566] (2024-10-08)
We present ParallelSpec, an alternative to auto-regressive drafting strategies in state-of-the-art speculative decoding approaches.
In contrast to auto-regressive drafting in the speculative stage, we train a parallel drafter to serve as an efficient speculative model.
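As a rough, hedged illustration of the draft-then-verify idea underlying speculative decoding (simplified to greedy decoding, and not ParallelSpec's actual parallel drafter):

```python
# Hedged sketch of generic draft-then-verify speculative decoding with greedy
# models; `draft_model` and `target_model` are assumed to map a token list to
# the next token. ParallelSpec's parallel drafter is not shown here.

def speculative_step(prefix, draft_model, target_model, k=4):
    """Draft k tokens cheaply, keep the longest prefix the target agrees with."""
    ctx, draft = list(prefix), []
    for _ in range(k):                       # cheap drafter proposes k tokens
        draft.append(draft_model(ctx))
        ctx.append(draft[-1])
    # In practice the target scores all k positions in ONE forward pass;
    # the sequential loop below just makes the acceptance rule explicit.
    ctx, accepted = list(prefix), []
    for tok in draft:
        expected = target_model(ctx)
        accepted.append(expected)
        ctx.append(expected)
        if expected != tok:                  # first disagreement: stop here
            break
    return accepted
```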
- OMPar: Automatic Parallelization with AI-Driven Source-to-Source Compilation [4.266086505323998] (2024-09-23)
This paper introduces OMPar, an AI-driven tool designed to automate the parallelization of C/C++ code using OpenMP pragmas.
OMPar integrates Large Language Models (LLMs) through two key components: OMPify, which assesses loop parallelization potential, and MonoCoder-OMP, a new fine-tuned model which generates precise OpenMP pragmas.
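For readers unfamiliar with the target transformation, the toy snippet below shows the kind of source-to-source edit an OMPar-style tool aims to produce: inserting an OpenMP pragma above a loop it judges parallelizable. Real tools must first establish that the loop is free of cross-iteration dependences; the annotate_parallel_for helper is purely illustrative.

```python
# Toy illustration (not OMPar's implementation): annotate a C loop that is
# known to be dependence-free with an OpenMP work-sharing pragma.

SEQUENTIAL_LOOP = """\
for (int i = 0; i < n; i++)
    c[i] = a[i] + b[i];
"""

def annotate_parallel_for(c_loop: str) -> str:
    """Prepend the OpenMP pragma that distributes iterations across threads."""
    return "#pragma omp parallel for\n" + c_loop

print(annotate_parallel_for(SEQUENTIAL_LOOP))
```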
- MPIrigen: MPI Code Generation through Domain-Specific Language Models [3.5352856644774806] (2024-02-14)
This study first investigates the performance of state-of-the-art language models in generating MPI-based parallel programs.
We introduce a dedicated downstream task of MPI-based program generation by fine-tuning MonoCoder on HPCorpusMPI.
The success of this tailored solution underscores the importance of domain-specific fine-tuning in optimizing language models for parallel computing code generation.
- Fast Chain-of-Thought: A Glance of Future from Parallel Decoding Leads to Answers Faster [61.83949316226113] (2023-11-14)
FastCoT is a model-agnostic framework based on parallel decoding.
We show that FastCoT saves inference time by nearly 20% with only a negligible performance drop compared to the regular approach.
- Advising OpenMP Parallelization via a Graph-Based Approach with Transformers [2.393682571484038] (2023-05-16)
We propose a novel approach, called OMPify, to detect and predict the OpenMP pragmas and shared-memory attributes in parallel code.
OMPify is based on a Transformer-based model that leverages a graph-based representation of source code.
Our results demonstrate that OMPify outperforms existing approaches, including the general-purpose ChatGPT and the targeted PragFormer model.
- Learning to Parallelize with OpenMP by Augmented Heterogeneous AST Representation [7.750212995537728] (2023-05-09)
We propose a novel graph-based learning approach called Graph2Par that utilizes a heterogeneous augmented abstract syntax tree (Augmented-AST) representation for code.
We create an OMP_Serial dataset with 18,598 parallelizable and 13,972 non-parallelizable loops to train the machine learning models.
Our results show that our proposed approach detects parallelizable code regions with 85% accuracy and outperforms the state-of-the-art token-based machine learning approach.
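Graph2Par's exact augmented-AST schema is not given in this summary, but the general move, turning an AST into a graph with typed nodes and edges, can be sketched with Python's built-in ast module (Graph2Par itself targets C/C++ loops and adds further edge types beyond parent-child):

```python
import ast

def ast_to_graph(source: str):
    """Turn a Python AST into typed nodes plus parent-child edges.
    An 'augmented' AST would add extra edge types (e.g. token order,
    data flow) on top of this skeleton."""
    tree = ast.parse(source)
    nodes, edges = {}, []
    for i, node in enumerate(ast.walk(tree)):
        node._gid = i                        # tag each node with a graph id
        nodes[i] = type(node).__name__       # node type, e.g. 'For', 'Name'
    for node in ast.walk(tree):
        for child in ast.iter_child_nodes(node):
            edges.append((node._gid, child._gid, "ast_child"))
    return nodes, edges

nodes, edges = ast_to_graph("for i in range(n):\n    s += a[i]")
```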
- Inference with Reference: Lossless Acceleration of Large Language Models [97.04200102556551] (2023-04-10)
LLMA is an accelerator to speed up Large Language Model (LLM) inference with references.
It is motivated by the observation that there are abundant identical text spans between the decoding result by an LLM and the reference that is available in many real world scenarios.
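The span-copy idea can be sketched as follows; this simplifies LLMA's actual algorithm (no batched verification, and the match and copy lengths are made-up parameters):

```python
# Hedged sketch of copy-from-reference drafting in the spirit of LLMA: if the
# tail of what we've decoded so far also occurs in a reference document,
# propose the tokens that follow it there as a cheap draft, which the LLM
# then verifies. n_match and n_copy are illustrative, not LLMA's settings.

def propose_from_reference(prefix, reference, n_match=4, n_copy=8):
    tail = prefix[-n_match:]
    for i in range(len(reference) - n_match + 1):
        if reference[i:i + n_match] == tail:
            start = i + n_match
            return reference[start:start + n_copy]   # drafted continuation
    return []                                        # no match: decode normally
```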
- QParallel: Explicit Parallelism for Programming Quantum Computers [62.10004571940546] (2022-10-07)
We present a language extension for parallel quantum programming.
QParallel removes ambiguities concerning parallelism in current quantum programming languages.
We introduce a tool that guides programmers in the placement of parallel regions by identifying the subroutines that profit most from parallelization.
- MPLP++: Fast, Parallel Dual Block-Coordinate Ascent for Dense Graphical Models [96.1052289276254] (2020-04-16)
This work introduces a new MAP-solver, based on the popular Dual Block-Coordinate Ascent principle.
Surprisingly, by making a small change to the low-performing solver, we derive the new solver MPLP++, which outperforms all existing solvers by a large margin.
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.