Leveraging Reinforcement Learning and Large Language Models for Code
Optimization
- URL: http://arxiv.org/abs/2312.05657v1
- Date: Sat, 9 Dec 2023 19:50:23 GMT
- Title: Leveraging Reinforcement Learning and Large Language Models for Code
Optimization
- Authors: Shukai Duan, Nikos Kanakaris, Xiongye Xiao, Heng Ping, Chenyu Zhou,
Nesreen K. Ahmed, Guixiang Ma, Mihai Capota, Theodore L. Willke, Shahin
Nazarian, Paul Bogdan
- Abstract summary: This paper introduces a new framework to decrease the complexity of code optimization.
The proposed framework builds on large language models (LLMs) and reinforcement learning (RL)
We run several experiments on the PIE dataset using a CodeT5 language model and RRHF, a new reinforcement learning algorithm.
- Score: 14.602997316032706
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Code optimization is a daunting task that requires a significant level of
expertise from experienced programmers. This level of expertise is not
sufficient when compared to the rapid development of new hardware
architectures. Towards advancing the whole code optimization process, recent
approaches rely on machine learning and artificial intelligence techniques.
This paper introduces a new framework to decrease the complexity of code
optimization. The proposed framework builds on large language models (LLMs) and
reinforcement learning (RL) and enables LLMs to receive feedback from their
environment (i.e., unit tests) during the fine-tuning process. We compare our
framework with existing state-of-the-art models and show that it is more
efficient with respect to speed and computational usage, as a result of the
decrement in training steps and its applicability to models with fewer
parameters. Additionally, our framework reduces the possibility of logical and
syntactical errors. Toward evaluating our approach, we run several experiments
on the PIE dataset using a CodeT5 language model and RRHF, a new reinforcement
learning algorithm. We adopt a variety of evaluation metrics with regards to
optimization quality, and speedup. The evaluation results demonstrate that the
proposed framework has similar results in comparison with existing models using
shorter training times and smaller pre-trained models. In particular, we
accomplish an increase of 5.6% and 2.2 over the baseline models concerning the
%OP T and SP metrics.
Related papers
- What Makes Large Language Models Reason in (Multi-Turn) Code Generation? [28.614888506962988]
Chain-of-thought has established itself as a popular vehicle for improving the outputs of large language models (LLMs)
We investigate the effects of a wide range of prompting strategies with a focus on automatic re-prompting over multiple turns and computational requirements.
Our study reveals strategies that consistently improve performance across all models with small and large sampling budgets.
arXiv Detail & Related papers (2024-10-10T16:53:10Z) - Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models [79.41139393080736]
Large language models (LLMs) have rapidly advanced and demonstrated impressive capabilities.
In-Context Learning (ICL) and.
Efficient Fine-Tuning (PEFT) are currently two mainstream methods for augmenting.
LLMs to downstream tasks.
We propose Reference Trustable Decoding (RTD), a paradigm that allows models to quickly adapt to new tasks without fine-tuning.
arXiv Detail & Related papers (2024-09-30T10:48:20Z) - Achieving Peak Performance for Large Language Models: A Systematic Review [0.0]
Large language models (LLMs) have achieved remarkable success in natural language processing (NLP)
As models grow into the trillion- parameter range, computational and memory costs increase significantly.
This makes it difficult for many researchers to access the resources needed to train or apply these models.
arXiv Detail & Related papers (2024-09-07T13:57:41Z) - LLMs-as-Instructors: Learning from Errors Toward Automating Model Improvement [93.38736019287224]
"LLMs-as-Instructors" framework autonomously enhances the training of smaller target models.
Inspired by the theory of "Learning from Errors", this framework employs an instructor LLM to meticulously analyze the specific errors within a target model.
Within this framework, we implement two strategies: "Learning from Error," which focuses solely on incorrect responses to tailor training data, and "Learning from Error by Contrast", which uses contrastive learning to analyze both correct and incorrect responses for a deeper understanding of errors.
arXiv Detail & Related papers (2024-06-29T17:16:04Z) - Multiplicative update rules for accelerating deep learning training and
increasing robustness [69.90473612073767]
We propose an optimization framework that fits to a wide range of machine learning algorithms and enables one to apply alternative update rules.
We claim that the proposed framework accelerates training, while leading to more robust models in contrast to traditionally used additive update rule.
arXiv Detail & Related papers (2023-07-14T06:44:43Z) - Learning Performance-Improving Code Edits [107.21538852090208]
We introduce a framework for adapting large language models (LLMs) to high-level program optimization.
First, we curate a dataset of performance-improving edits made by human programmers of over 77,000 competitive C++ programming submission pairs.
For prompting, we propose retrieval-based few-shot prompting and chain-of-thought, and for finetuning, these include performance-conditioned generation and synthetic data augmentation based on self-play.
arXiv Detail & Related papers (2023-02-15T18:59:21Z) - VeLO: Training Versatile Learned Optimizers by Scaling Up [67.90237498659397]
We leverage the same scaling approach behind the success of deep learning to learn versatiles.
We train an ingest for deep learning which is itself a small neural network that ingests and outputs parameter updates.
We open source our learned, meta-training code, the associated train test data, and an extensive benchmark suite with baselines at velo-code.io.
arXiv Detail & Related papers (2022-11-17T18:39:07Z) - Precise Learning of Source Code Contextual Semantics via Hierarchical
Dependence Structure and Graph Attention Networks [28.212889828892664]
We propose a novel source code model embedded with hierarchical dependencies.
We introduce the syntactic structural of the basic block, i.e., its corresponding AST, in source code model to provide sufficient information.
The results show that our model reduces the scale of parameters by 50% and achieves 4% improvement on accuracy on program classification task.
arXiv Detail & Related papers (2021-11-20T04:03:42Z) - Learning to Optimize: A Primer and A Benchmark [94.29436694770953]
Learning to optimize (L2O) is an emerging approach that leverages machine learning to develop optimization methods.
This article is poised to be the first comprehensive survey and benchmark of L2O for continuous optimization.
arXiv Detail & Related papers (2021-03-23T20:46:20Z) - MLComp: A Methodology for Machine Learning-based Performance Estimation
and Adaptive Selection of Pareto-Optimal Compiler Optimization Sequences [10.200899224740871]
We propose a novel Reinforcement Learning-based policy methodology for embedded software optimization.
We show that different Machine Learning models are automatically tested to choose the best-fitting one.
We also show that our framework can be trained efficiently for any target platform and application domain.
arXiv Detail & Related papers (2020-12-09T19:13:39Z) - A Learned Performance Model for Tensor Processing Units [5.733911161090224]
We demonstrate a method of learning performance models from a corpus of graph programs for Processing Unit (TPU) instances.
We show that our learned model outperforms a heavily-optimized analytical performance model on two tasks.
It helps an autotuner discover faster programs in a setting where access to TPUs is limited or expensive.
arXiv Detail & Related papers (2020-08-03T17:24:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.