A Deep Learning Based Cost Model for Automatic Code Optimization
- URL: http://arxiv.org/abs/2104.04955v1
- Date: Sun, 11 Apr 2021 08:32:42 GMT
- Title: A Deep Learning Based Cost Model for Automatic Code Optimization
- Authors: Riyadh Baghdadi, Massinissa Merouani, Mohamed-Hicham Leghettas, Kamel
Abdous, Taha Arbaoui, Karima Benatchba, Saman Amarasinghe
- Abstract summary: We present a novel deep learning based cost model for automatic code optimization.
It was integrated into the Tiramisu compiler to select the best code transformations.
The proposed model achieves a mean absolute percentage error of only 16% when predicting speedups on full programs.
- Score: 0.24629531282150877
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Enabling compilers to automatically optimize code has been a longstanding
goal for the compiler community. Efficiently solving this problem requires
using precise cost models. These models predict whether applying a sequence of
code transformations reduces the execution time of the program. Building an
analytical cost model to do so is hard on modern x86 architectures due to the
complexity of the microarchitecture. In this paper, we present a novel deep
learning based cost model for automatic code optimization. This model was
integrated into a search method and implemented in the Tiramisu compiler to
select the best code transformations. The input of the proposed model is a set
of simple features representing the unoptimized code and a sequence of code
transformations. The model predicts the speedup expected when the code
transformations are applied. Unlike previous models, the proposed one works on
full programs and does not rely on any heavy feature engineering. The proposed
model achieves a mean absolute percentage error of only 16% when predicting speedups on
full programs. The proposed model enables Tiramisu to automatically find code
transformations that match or outperform those found by state-of-the-art
compilers, without requiring the heavy feature engineering those compilers
rely on.
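
To make the interface concrete, here is a minimal sketch, in PyTorch, of a cost model with the same inputs and output as described above: a vector of simple features for the unoptimized code plus a sequence of vectors encoding the candidate transformations, mapped to a predicted speedup. This is not the authors' implementation; the feature sizes, the LSTM over the transformation sequence, and the layer widths are illustrative assumptions, and the mape helper only shows how the 16% figure quoted above is computed.

import torch
import torch.nn as nn


class SpeedupCostModel(nn.Module):
    # Toy stand-in for a deep learning based cost model: program features plus
    # a sequence of code transformations in, predicted speedup out.
    def __init__(self, code_feat_dim=64, transform_dim=16, hidden_dim=128):
        super().__init__()
        # Encode the sequence of code transformations with an LSTM.
        self.transform_encoder = nn.LSTM(transform_dim, hidden_dim, batch_first=True)
        # Combine the code features with the transformation encoding and regress.
        self.regressor = nn.Sequential(
            nn.Linear(code_feat_dim + hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, code_features, transform_seq):
        # code_features: (batch, code_feat_dim)
        # transform_seq: (batch, seq_len, transform_dim)
        _, (h_n, _) = self.transform_encoder(transform_seq)
        combined = torch.cat([code_features, h_n[-1]], dim=-1)
        # Predict a log-speedup and exponentiate so the output is always positive.
        return torch.exp(self.regressor(combined)).squeeze(-1)


def mape(predicted, measured):
    # Mean absolute percentage error, the metric quoted in the abstract.
    return 100.0 * torch.mean(torch.abs(predicted - measured) / measured)


if __name__ == "__main__":
    model = SpeedupCostModel()
    feats = torch.randn(8, 64)          # toy program features
    transforms = torch.randn(8, 5, 16)  # toy sequence of 5 transformations
    measured = torch.rand(8) * 4 + 0.5  # toy measured speedups
    predicted = model(feats, transforms)
    print("MAPE: %.1f%%" % mape(predicted, measured).item())

Predicting a speedup ratio rather than an absolute execution time lets a search method compare alternative transformation sequences for the same program directly, which is how the abstract describes the model being used.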
Related papers
- Promises and Pitfalls of Generative Masked Language Modeling: Theoretical Framework and Practical Guidelines [74.42485647685272]
We focus on Generative Masked Language Models (GMLMs).
We train a model to fit conditional probabilities of the data distribution via masking, which are subsequently used as inputs to a Markov Chain to draw samples from the model.
We adapt the T5 model for iteratively-refined parallel decoding, achieving 2-3x speedup in machine translation with minimal sacrifice in quality.
arXiv Detail & Related papers (2024-07-22T18:00:00Z) - CompilerDream: Learning a Compiler World Model for General Code Optimization [58.87557583347996]
We introduce CompilerDream, a model-based reinforcement learning approach to general code optimization.
It comprises a compiler world model that accurately simulates the intrinsic properties of optimization passes and an agent trained on this model to produce effective optimization strategies.
It excels across diverse datasets, surpassing LLVM's built-in optimizations and other state-of-the-art methods in both settings of value prediction and end-to-end code optimization.
arXiv Detail & Related papers (2024-04-24T09:20:33Z) - LOOPer: A Learned Automatic Code Optimizer For Polyhedral Compilers [1.7529897611426233]
We introduce LOOPer, the first polyhedral autoscheduler that uses a deep-learning based cost model.
It supports the exploration of a large set of affine transformations, allowing the application of complex sequences of polyhedral transformations.
It also supports the optimization of programs with multiple loop nests and with rectangular and non-rectangular iteration domains.
arXiv Detail & Related papers (2024-03-18T07:22:31Z) - Extreme Compression of Large Language Models via Additive Quantization [59.3122859349777]
Our algorithm, called AQLM, generalizes the classic Additive Quantization (AQ) approach for information retrieval.
We provide fast GPU and CPU implementations of AQLM for token generation, which enable us to match or outperform optimized FP16 implementations for speed.
arXiv Detail & Related papers (2024-01-11T18:54:44Z) - Large Language Models for Compiler Optimization [22.52765975286403]
We present a transformer model trained from scratch to optimize LLVM assembly for code size.
We ask the model to predict the instruction counts before and after optimization, and the optimized code itself.
Our approach achieves a 3.0% improvement in reducing instruction counts over the compiler.
arXiv Detail & Related papers (2023-09-11T22:11:46Z) - ML-driven Hardware Cost Model for MLIR [1.2987894327817158]
We develop a machine learning-based cost model for high-level MLIR.
By treating the incoming MLIR as a text input, in the style of NLP models, we can apply well-known techniques from modern NLP research (a sketch of this idea appears after the list of related papers below).
We show that these models can provide reasonably good estimates with low error bounds for various hardware characteristics of interest.
arXiv Detail & Related papers (2023-02-14T11:32:47Z) - Re-parameterizing Your Optimizers rather than Architectures [119.08740698936633]
We propose a novel paradigm of incorporating model-specific prior knowledge into optimizers and using them to train generic (simple) models.
As an implementation, we propose a novel methodology to add prior knowledge by modifying the gradients according to a set of model-specific hyper-parameters.
We focus on a VGG-style plain model and showcase that such a simple model trained with a RepOptimizer, referred to as RepOpt-VGG, performs on par with the recent well-designed models.
arXiv Detail & Related papers (2022-05-30T16:55:59Z) - ED2LM: Encoder-Decoder to Language Model for Faster Document Re-ranking Inference [70.36083572306839]
This paper proposes a new training and inference paradigm for re-ranking.
We finetune a pretrained encoder-decoder model on document-to-query generation.
We show that this encoder-decoder architecture can be decomposed into a decoder-only language model during inference.
arXiv Detail & Related papers (2022-04-25T06:26:29Z) - MetaTune: Meta-Learning Based Cost Model for Fast and Efficient Auto-tuning Frameworks [0.0]
This paper proposes MetaTune, a meta-learning based cost model that more quickly and accurately predicts the performance of optimized codes with pre-trained model parameters.
The framework provides 8 to 13% better inference time on average for four CNN models with comparable or lower optimization time while outperforming transfer learning by 10% in cross-platform cases.
arXiv Detail & Related papers (2021-02-08T13:59:08Z) - A Learned Performance Model for Tensor Processing Units [5.733911161090224]
We demonstrate a method of learning performance models from a corpus of graph programs for Tensor Processing Unit (TPU) instances.
We show that our learned model outperforms a heavily-optimized analytical performance model on two tasks.
It helps an autotuner discover faster programs in a setting where access to TPUs is limited or expensive.
arXiv Detail & Related papers (2020-08-03T17:24:52Z) - PolyDL: Polyhedral Optimizations for Creation of High Performance DL primitives [55.79741270235602]
We present compiler algorithms to automatically generate high performance implementations of Deep Learning primitives.
We develop novel data reuse analysis algorithms using the polyhedral model.
We also show that such a hybrid compiler plus a minimal library-use approach results in state-of-the-art performance.
arXiv Detail & Related papers (2020-06-02T06:44:09Z)
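
As referenced in the ML-driven Hardware Cost Model for MLIR entry above, the following is a hedged sketch of what an "IR as text" cost model can look like: hash-tokenize the IR string, run a small Transformer encoder over the tokens, and regress a scalar hardware metric. None of this is code from the listed papers; the tokenizer, vocabulary size, and model dimensions are assumptions made to keep the example self-contained and runnable.

import torch
import torch.nn as nn


class TextIRCostModel(nn.Module):
    # NLP-style regressor over tokenized compiler IR text.
    def __init__(self, vocab_size=1000, embed_dim=64, num_heads=4, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=num_heads, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)
        self.head = nn.Linear(embed_dim, 1)  # one scalar hardware characteristic

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) integer token indices of the IR text
        hidden = self.encoder(self.embed(token_ids))
        return self.head(hidden.mean(dim=1)).squeeze(-1)  # mean-pool, then regress


def tokenize(ir_text, vocab_size=1000):
    # Toy whitespace tokenizer that hashes each token into a fixed vocabulary.
    return torch.tensor([[hash(tok) % vocab_size for tok in ir_text.split()]])


if __name__ == "__main__":
    ir = '%0 = "arith.addf"(%a, %b) : (f32, f32) -> f32'
    print("predicted cost:", TextIRCostModel()(tokenize(ir)).item())

Hash tokenization is only a stand-in for a learned, IR-aware vocabulary; it keeps the example dependency-free while preserving the core design choice of reusing standard NLP machinery on compiler IR.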