ACPO: AI-Enabled Compiler-Driven Program Optimization
- URL: http://arxiv.org/abs/2312.09982v2
- Date: Mon, 11 Mar 2024 19:24:41 GMT
- Title: ACPO: AI-Enabled Compiler-Driven Program Optimization
- Authors: Amir H. Ashouri, Muhammad Asif Manzoor, Duc Minh Vu, Raymond Zhang,
Ziwen Wang, Angel Zhang, Bryan Chan, Tomasz S. Czajkowski and Yaoqing Gao
- Abstract summary: ACPO is a framework that provides LLVM with simple and comprehensive tools to benefit from employing ML models in different optimization passes.
We show that the ACPO model for Loop Unroll gains on average 4% over LLVM's O3 optimization when deployed on Polybench.
- Score: 1.879008610342411
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The key to performance optimization of a program is deciding
correctly when a compiler should apply a certain transformation. This is an
ideal opportunity to apply machine-learning models to speed up the tuning
process; while this realization has been around since the late 1990s, only
recent advances in ML have made a practical, end-to-end application of ML to
compilers possible.
This paper presents ACPO: \textbf{\underline{A}}I-Enabled
\textbf{\underline{C}}ompiler-driven \textbf{\underline{P}}rogram
\textbf{\underline{O}}ptimization; a novel framework to provide LLVM with
simple and comprehensive tools to benefit from employing ML models for
different optimization passes. We first showcase the high-level view, class
hierarchy, and functionalities of ACPO, and subsequently demonstrate a couple
of use cases by ML-enabling the Loop Unroll and Function Inlining passes,
describing how ACPO can be leveraged to optimize other passes.
Experimental results reveal that the ACPO model for Loop Unroll gains on
average 4\% over LLVM's O3 optimization when deployed on Polybench.
Furthermore, with the Inliner model added as well, ACPO provides up to 4.5\%
and 2.4\% improvement on Polybench and Cbench, respectively, compared with
LLVM's O3 optimization.
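The decision an ML-enabled Loop Unroll pass makes can be sketched as follows: extract features of the loop, ask a trained model for an unroll factor, and fall back to the default heuristic when the model abstains. This is an illustrative sketch only; the names (`LoopFeatures`, `UnrollModel`, `choose_unroll_factor`) and the stub policy are assumptions, not ACPO's actual API or learned model.

```python
from dataclasses import dataclass

@dataclass
class LoopFeatures:
    trip_count: int        # known or estimated iteration count
    body_size: int         # IR instructions in the loop body
    num_branches: int      # control flow inside the body
    is_innermost: bool

class UnrollModel:
    """Stand-in for a trained ML model served to the compiler."""
    def predict_factor(self, f: LoopFeatures) -> int:
        # A trained model would score candidate factors; this stub
        # mimics one plausible learned policy.
        if not f.is_innermost or f.num_branches > 2:
            return 1                      # leave complex loops alone
        if 0 < f.trip_count <= 8:
            return f.trip_count           # fully unroll short loops
        return 4 if f.body_size <= 32 else 2

def choose_unroll_factor(f: LoopFeatures, model: UnrollModel,
                         default: int = 1) -> int:
    factor = model.predict_factor(f)
    return factor if factor >= 1 else default

model = UnrollModel()
hot_loop = LoopFeatures(trip_count=6, body_size=12, num_branches=0,
                        is_innermost=True)
print(choose_unroll_factor(hot_loop, model))  # fully unrolls: 6
```

A real deployment would query the model from inside the compiler pass and validate the returned factor against legality and code-size constraints before applying it.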
Related papers
- Optimization-based Structural Pruning for Large Language Models without Back-Propagation [57.9629676017527]
We propose an optimization-based structural pruning on Large-Language Models (LLMs)
Our method learns the pruning masks in a probabilistic space directly by optimizing the loss of the pruned model.
Our method runs for 2.7 hours using around 35GB of memory for the 13B models on a single A100 GPU, and our pruned models outperform the state of the art w.r.t. perplexity.
arXiv Detail & Related papers (2024-06-15T09:31:03Z)
- REBEL: Reinforcement Learning via Regressing Relative Rewards [59.68420022466047]
We propose REBEL, a minimalist RL algorithm for the era of generative models.
In theory, we prove that fundamental RL algorithms like Natural Policy Gradient can be seen as variants of REBEL.
We find that REBEL provides a unified approach to language modeling and image generation with stronger or similar performance as PPO and DPO.
arXiv Detail & Related papers (2024-04-25T17:20:45Z)
- AffineQuant: Affine Transformation Quantization for Large Language Models [58.45460102764]
Post-Training Quantization (PTQ) has emerged as a subject of considerable interest due to its compression efficiency and cost-effectiveness in the context of training.
Existing PTQ methods for Large-scale Language Models (LLMs) limit the optimization scope to scaling transformations between pre- and post-quantization weights.
In this paper, we advocate for direct optimization using equivalent Affine transformations in PTQ (AffineQuant).
arXiv Detail & Related papers (2024-03-19T08:40:21Z)
- Compiler generated feedback for Large Language Models [3.86901256759401]
We introduce a novel paradigm in compiler optimization powered by Large Language Models with compiler feedback to optimize the code size of LLVM assembly.
The model takes unoptimized LLVM IR as input and produces optimized IR, the best optimization passes, and instruction counts of both unoptimized and optimized IRs.
arXiv Detail & Related papers (2024-03-18T23:25:13Z)
- Extreme Compression of Large Language Models via Additive Quantization [59.3122859349777]
AQLM is the first scheme that is optimal in terms of accuracy vs. model size when compressing to less than 3 bits per parameter.
We provide fast GPU and CPU implementations of AQLM for token generation.
arXiv Detail & Related papers (2024-01-11T18:54:44Z)
- CoLLiE: Collaborative Training of Large Language Models in an Efficient Way [59.09824823710863]
CoLLiE is an efficient library that facilitates collaborative training of large language models.
With its modular design and comprehensive functionality, CoLLiE offers a balanced blend of efficiency, ease of use, and customization.
arXiv Detail & Related papers (2023-12-01T08:02:16Z)
- MLGOPerf: An ML Guided Inliner to Optimize Performance [7.314201117946244]
This paper presents the first end-to-end framework capable of optimizing performance using LLVM's ML-Inliner.
It employs a secondary ML model to generate rewards used for training a retargeted Reinforcement Learning agent.
It does so by predicting the post-inlining speedup of a function under analysis, which enables a fast training framework for the primary model.
arXiv Detail & Related papers (2022-07-18T05:47:29Z)
- A Reinforcement Learning Environment for Polyhedral Optimizations [68.8204255655161]
We propose a shape-agnostic formulation for the space of legal transformations in the polyhedral model as a Markov Decision Process (MDP).
Instead of using transformations, the formulation is based on an abstract space of possible schedules.
Our generic MDP formulation enables using reinforcement learning to learn optimization policies over a wide range of loops.
arXiv Detail & Related papers (2021-04-28T12:41:52Z)
- MLGO: a Machine Learning Guided Compiler Optimizations Framework [0.0]
This work is the first full integration of machine learning in a complex compiler pass in a real-world setting.
We use two different ML algorithms to train the inlining-for-size model, and achieve up to 7% size reduction.
The same model generalizes well to a diversity of real-world targets, as well as to the same set of targets after months of active development.
arXiv Detail & Related papers (2021-01-13T00:02:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.