MPI-rical: Data-Driven MPI Distributed Parallelism Assistance with
Transformers
- URL: http://arxiv.org/abs/2305.09438v3
- Date: Wed, 30 Aug 2023 14:56:16 GMT
- Title: MPI-rical: Data-Driven MPI Distributed Parallelism Assistance with
Transformers
- Authors: Nadav Schneider, Tal Kadosh, Niranjan Hasabnis, Timothy Mattson, Yuval
Pinter, Gal Oren
- Abstract summary: Message Passing Interface (MPI) plays a crucial role in distributed memory parallelization across multiple nodes.
We develop MPI-RICAL, a data-driven programming-assistance tool that assists programmers in writing domain decomposition based distributed memory parallelization code.
We also introduce MPICodeCorpus, the first publicly available corpus of MPI-based parallel programs that is created by mining more than 15,000 open-source repositories on GitHub.
- Score: 3.2164100882807913
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Message Passing Interface (MPI) plays a crucial role in distributed memory
parallelization across multiple nodes. However, parallelizing MPI code
manually, and specifically, performing domain decomposition, is a challenging,
error-prone task. In this paper, we address this problem by developing
MPI-RICAL, a novel data-driven, programming-assistance tool that assists
programmers in writing domain decomposition based distributed memory
parallelization code. Specifically, we train a supervised language model to
suggest MPI functions and their proper locations in the code on the fly. We
also introduce MPICodeCorpus, the first publicly available corpus of MPI-based
parallel programs that is created by mining more than 15,000 open-source
repositories on GitHub. Experimental results have been done on MPICodeCorpus
and more importantly, on a compiled benchmark of MPI-based parallel programs
for numerical computations that represent real-world scientific applications.
MPI-RICAL achieves F1 scores between 0.87 and 0.91 on these programs,
demonstrating its accuracy in suggesting correct MPI functions at appropriate
code locations. The source code used in this work, as well as other relevant
sources, is available at:
https://github.com/Scientific-Computing-Lab-NRCN/MPI-rical
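For context on the task the tool assists with, the sketch below shows the index arithmetic at the heart of a 1D domain decomposition: splitting n elements across ranks so that each rank owns a contiguous, balanced slice. This is an illustrative example of the pattern, not code from MPI-RICAL.

```python
def block_range(n, size, rank):
    """Return the [start, stop) slice of n elements owned by `rank`
    out of `size` ranks, giving the remainder to the first ranks."""
    base, rem = divmod(n, size)
    start = rank * base + min(rank, rem)
    stop = start + base + (1 if rank < rem else 0)
    return start, stop
```

For example, 10 elements over 4 ranks yields the slices (0,3), (3,6), (6,8), (8,10): contiguous, disjoint, and differing in size by at most one element.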
Related papers
- OMPar: Automatic Parallelization with AI-Driven Source-to-Source Compilation [4.266086505323998]
This paper introduces OMPar, an AI-driven tool designed to automate the parallelization of C/C++ code using OpenMP pragmas.
OMPar integrates Large Language Models (LLMs) through two key components: OMPify, which assesses loop parallelization potential, and MonoCoder-OMP, a new fine-tuned model which generates precise OpenMP pragmas.
arXiv Detail & Related papers (2024-09-23T07:39:01Z)
- Let the Code LLM Edit Itself When You Edit the Code [50.46536185784169]
This paper proposes Positional Integrity Encoding (PIE).
Results demonstrate that PIE reduces computational overhead by over 85% compared to the standard full recomputation approach.
arXiv Detail & Related papers (2024-07-03T14:34:03Z)
- MPIrigen: MPI Code Generation through Domain-Specific Language Models [3.5352856644774806]
This study first investigates the performance of state-of-the-art language models in generating MPI-based parallel programs.
We introduce a dedicated downstream task of MPI-based program generation by fine-tuning MonoCoder on HPCorpusMPI.
The success of this tailored solution underscores the importance of domain-specific fine-tuning in optimizing language models for parallel computing code generation.
arXiv Detail & Related papers (2024-02-14T12:24:21Z)
- Linear-time Minimum Bayes Risk Decoding with Reference Aggregation [52.1701152610258]
Minimum Bayes Risk (MBR) decoding is a text generation technique that has been shown to improve the quality of machine translations.
It requires the pairwise calculation of a utility metric, which has quadratic complexity.
We propose to approximate pairwise metric scores with scores calculated against aggregated reference representations.
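The idea in that abstract can be sketched as follows: instead of scoring every candidate against every pseudo-reference (quadratic), collapse the references into one aggregated representation and score each candidate once against it (linear). The toy utility below is a simple unigram-overlap score; the paper's actual metrics and aggregation details differ.

```python
from collections import Counter

def aggregate(refs):
    """Average unigram counts across pseudo-references into one
    aggregated reference representation."""
    agg = Counter()
    for r in refs:
        agg.update(r.split())
    n = len(refs)
    return {w: c / n for w, c in agg.items()}

def utility(candidate, agg):
    """Toy utility: overlap between candidate unigram counts and the
    aggregated reference counts."""
    counts = Counter(candidate.split())
    return sum(min(c, agg.get(w, 0.0)) for w, c in counts.items())

def mbr_with_aggregation(candidates, refs):
    """Pick the candidate with highest utility against the aggregate:
    one utility call per candidate instead of one per (candidate, ref) pair."""
    agg = aggregate(refs)
    return max(candidates, key=lambda h: utility(h, agg))
```

With candidates ["the cat sat", "a dog ran"] and references that mostly contain "the cat sat", the first candidate wins, at cost linear in the number of candidates.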
arXiv Detail & Related papers (2024-02-06T18:59:30Z)
- AUTOPARLLM: GNN-Guided Automatic Code Parallelization using Large Language Models [13.514916184776107]
AUTOPARLLM is a framework for automatically discovering parallelism and generating parallel versions of sequentially written programs.
We evaluate it on 11 applications of 2 well-known benchmark suites: NAS Parallel Benchmark and Rodinia Benchmark.
Our results show that AUTOPARLLM is indeed effective in improving the state-of-the-art LLM-based models for the task of parallel code generation.
arXiv Detail & Related papers (2023-10-06T06:51:16Z)
- Advising OpenMP Parallelization via a Graph-Based Approach with Transformers [2.393682571484038]
We propose a novel approach, called OMPify, to detect and predict the OpenMP pragmas and shared-memory attributes in parallel code.
OMPify is based on a Transformer-based model that leverages a graph-based representation of source code.
Our results demonstrate that OMPify outperforms existing approaches, including the general-purpose, popular ChatGPT and the targeted PragFormer model.
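The core question such tools answer is whether a loop's iterations are independent and therefore safe to parallelize. The toy oracle below checks this directly from per-iteration read/write sets; it is an illustration of the underlying dependence criterion, not OMPify's graph-based Transformer approach.

```python
def independent_iterations(n, writes, reads):
    """Toy dependence check for a loop of n iterations.
    `writes(i)` / `reads(i)` return the sets of memory locations that
    iteration i writes / reads. The loop is safe to parallelize if no
    location is written by two iterations, and no iteration reads a
    location written by a different iteration."""
    written_by = {}
    for i in range(n):
        for loc in writes(i):
            if loc in written_by and written_by[loc] != i:
                return False  # write-write conflict across iterations
            written_by[loc] = i
    for i in range(n):
        for loc in reads(i):
            if loc in written_by and written_by[loc] != i:
                return False  # cross-iteration read-after-write
    return True
```

Under this check, a loop like `a[i] = b[i] + 1` is independent, while `a[i+1] = a[i]` carries a dependence from one iteration to the next and is rejected.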
arXiv Detail & Related papers (2023-05-16T16:56:10Z)
- PARTIME: Scalable and Parallel Processing Over Time with Deep Neural Networks [68.96484488899901]
We present PARTIME, a library designed to speed up neural networks whenever data is continuously streamed over time.
PARTIME starts processing each data sample at the time in which it becomes available from the stream.
Experiments empirically compare PARTIME with classic non-parallel neural computations in online learning.
arXiv Detail & Related papers (2022-10-17T14:49:14Z)
- QParallel: Explicit Parallelism for Programming Quantum Computers [62.10004571940546]
We present a language extension for parallel quantum programming.
QParallel removes ambiguities concerning parallelism in current quantum programming languages.
We introduce a tool that guides programmers in the placement of parallel regions by identifying the subroutines that profit most from parallelization.
arXiv Detail & Related papers (2022-10-07T16:35:16Z)
- Learning to Parallelize in a Shared-Memory Environment with Transformers [3.340971990034025]
OpenMP is the most comprehensive API that implements shared memory parallelization schemes.
Many source-to-source (S2S) compilers have been created over the years, tasked with inserting OpenMP directives into code automatically.
In this work, we propose leveraging recent advances in ML techniques, specifically in natural language processing (NLP), to replace S2S compilers altogether.
arXiv Detail & Related papers (2022-04-27T10:39:52Z)
- Lossless Compression of Efficient Private Local Randomizers [55.657133416044104]
Locally Differentially Private (LDP) Reports are commonly used for collection of statistics and machine learning in the federated setting.
In many cases the best known LDP algorithms require sending prohibitively large messages from the client device to the server.
This has led to significant efforts on reducing the communication cost of LDP algorithms.
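For context, the classic one-bit LDP mechanism below (randomized response) illustrates the client-side randomization and server-side debiasing that such abstracts refer to; it is a standard textbook example, not the compression scheme the paper proposes.

```python
import math
import random

def randomized_response(bit, eps, rng=random.random):
    """One-bit randomized response: report truthfully with probability
    p = e^eps / (e^eps + 1), flip otherwise. Satisfies eps-LDP."""
    p = math.exp(eps) / (math.exp(eps) + 1.0)
    return bit if rng() < p else 1 - bit

def estimate_mean(reports, eps):
    """Debias the noisy reports into an unbiased estimate of the true
    fraction of 1-bits, using E[report] = (1-p) + (2p-1) * mean."""
    p = math.exp(eps) / (math.exp(eps) + 1.0)
    noisy = sum(reports) / len(reports)
    return (noisy - (1.0 - p)) / (2.0 * p - 1.0)
```

Each client sends a single bit, so the per-report communication cost is minimal; the efficiency question the paper addresses arises for richer LDP reports than this one-bit case.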
arXiv Detail & Related papers (2021-02-24T07:04:30Z)
- MPLP++: Fast, Parallel Dual Block-Coordinate Ascent for Dense Graphical Models [96.1052289276254]
This work introduces a new MAP-solver, based on the popular Dual Block-Coordinate Ascent principle.
Surprisingly, by making a small change to the low-performing solver, we derive the new solver MPLP++ that significantly outperforms all existing solvers by a large margin.
arXiv Detail & Related papers (2020-04-16T16:20:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.