Target-independent XLA optimization using Reinforcement Learning
- URL: http://arxiv.org/abs/2308.14364v1
- Date: Mon, 28 Aug 2023 07:23:03 GMT
- Title: Target-independent XLA optimization using Reinforcement Learning
- Authors: Milan Ganai, Haichen Li, Theodore Enns, Yida Wang, Randy Huang
- Abstract summary: This paper introduces deep Reinforcement Learning based search for optimal XLA HLO pass ordering.
We also propose enhancements to the deep RL algorithms to further improve optimal search performance.
Overall, in our experimentation we observe an average of $13.3\%$ improvement in operation count reduction on a benchmark of GPT-2 training graphs, relative to the compiler's default phase ordering.
- Score: 6.442130495735239
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: An important challenge in Machine Learning compilers like XLA is multi-pass
optimization and analysis. There has been recent interest chiefly in XLA
target-dependent optimization on the graph-level, subgraph-level, and
kernel-level phases. We specifically focus on target-independent optimization
XLA HLO pass ordering: our approach aims at finding the optimal sequence of
compiler optimization passes, which is decoupled from target-dependent
optimization. However, there is little domain specific study in pass ordering
for XLA HLO. To this end, we propose introducing deep Reinforcement Learning
(RL) based search for optimal XLA HLO pass ordering. We also propose
enhancements to the deep RL algorithms to further improve optimal search
performance and open the research direction for domain-specific guidance for
RL. We create an XLA Gym experimentation framework as a tool to enable RL
algorithms to interact with the compiler for passing optimizations and thereby
train agents. Overall, in our experimentation we observe an average of $13.3\%$
improvement in operation count reduction on a benchmark of GPT-2 training
graphs and $10.4\%$ improvement on a diverse benchmark including GPT-2, BERT,
and ResNet graphs using the proposed approach over the compiler's default phase
ordering.
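
To make the pass-ordering problem in the abstract concrete, here is a minimal sketch of a gym-style environment in which each action applies one optimization pass and the reward is the resulting drop in operation count. Everything in it is an assumption made for illustration: the three toy passes, the list-of-op-names "graph", and the `ToyPassOrderingEnv` class are hypothetical and are not the paper's XLA Gym or any real XLA API. Exhaustive enumeration stands in for the RL agent, which is only feasible because the toy search space has 27 orderings.

```python
from itertools import product

# Hypothetical toy passes: each maps a list of op names to a new list.
def merge_reshapes(ops):
    """Collapse runs of adjacent 'reshape' ops into a single reshape."""
    out = []
    for op in ops:
        if op == "reshape" and out and out[-1] == "reshape":
            continue
        out.append(op)
    return out

def fold_transposes(ops):
    """Cancel adjacent pairs of 'transpose' ops (assumed to be inverses)."""
    out = []
    for op in ops:
        if op == "transpose" and out and out[-1] == "transpose":
            out.pop()
        else:
            out.append(op)
    return out

def drop_copies(ops):
    """Remove every 'copy' op."""
    return [op for op in ops if op != "copy"]

PASSES = [merge_reshapes, fold_transposes, drop_copies]

class ToyPassOrderingEnv:
    """Gym-like interface: an action is an index into PASSES."""

    def __init__(self, graph, max_steps=3):
        self.initial = list(graph)
        self.max_steps = max_steps
        self.reset()

    def reset(self):
        self.ops = list(self.initial)
        self.steps = 0
        return tuple(self.ops)

    def step(self, action):
        before = len(self.ops)
        self.ops = PASSES[action](self.ops)
        self.steps += 1
        reward = before - len(self.ops)  # ops removed by this pass
        done = self.steps >= self.max_steps
        return tuple(self.ops), reward, done

def run_episode(env, actions):
    """Apply a fixed pass sequence and return the total op-count reduction."""
    env.reset()
    total = 0
    for action in actions:
        _, reward, done = env.step(action)
        total += reward
        if done:
            break
    return total

def best_ordering(env):
    """Enumerate all orderings; a stand-in for a learned RL policy."""
    best_actions, best_return = None, -1
    for actions in product(range(len(PASSES)), repeat=env.max_steps):
        ret = run_episode(env, actions)
        if ret > best_return:
            best_actions, best_return = actions, ret
    return best_actions, best_return

graph = ["transpose", "copy", "transpose", "reshape", "copy", "reshape", "mul"]
env = ToyPassOrderingEnv(graph)
default_return = run_episode(env, [0, 1, 2])  # an arbitrary "default" order
best_actions, best_return = best_ordering(env)
print(default_return, best_actions, best_return)
```

In this toy graph the copies block both other rewrites, so removing them first exposes more reductions than the arbitrary default order does. That order dependence is the property a pass-ordering agent is meant to exploit.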
Related papers
- Meta-Learning Objectives for Preference Optimization [39.15940594751445]
We show that it is possible to gain insights on the efficacy of preference optimization algorithms on simpler benchmarks.
We propose a novel family of PO algorithms based on mirror descent, which we call Mirror Preference Optimization (MPO).
arXiv Detail & Related papers (2024-11-10T19:11:48Z)
- A Reinforcement Learning Environment for Automatic Code Optimization in the MLIR Compiler [0.10923877073891444]
We introduce the first RL environment for the MLIR compiler, dedicated to facilitating MLIR compiler research.
We also propose a novel formulation of the action space as a product of simpler action subspaces, enabling more efficient and effective optimizations.
arXiv Detail & Related papers (2024-09-17T10:49:45Z)
- AIPO: Improving Training Objective for Iterative Preference Optimization [34.24211649396053]
We study iterative preference optimization with synthetic data.
We propose our training objective for iterative preference optimization, namely Agreement-aware Iterative Preference Optimization (AIPO).
arXiv Detail & Related papers (2024-09-13T14:03:49Z)
- LLM as a Complementary Optimizer to Gradient Descent: A Case Study in Prompt Tuning [69.95292905263393]
We show that gradient-based optimizers and high-level LLMs can effectively collaborate in a combined optimization framework.
In this paper, we show that the two are complementary to each other.
arXiv Detail & Related papers (2024-05-30T06:24:14Z)
- Unleashing the Potential of Large Language Models as Prompt Optimizers: Analogical Analysis with Gradient-based Model Optimizers [108.72225067368592]
We propose a novel perspective to investigate the design of large language model (LLM)-based prompt optimizers.
We identify two pivotal factors in model parameter learning: update direction and update method.
We develop a capable Gradient-inspired LLM-based Prompt Optimizer called GPO.
arXiv Detail & Related papers (2024-02-27T15:05:32Z)
- Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration [87.53543137162488]
We propose an easy-to-implement online reinforcement learning (online RL) framework called MEX.
MEX integrates estimation and planning components while balancing exploration and exploitation automatically.
It can outperform baselines by a stable margin in various MuJoCo environments with sparse rewards.
arXiv Detail & Related papers (2023-05-29T17:25:26Z)
- An Empirical Evaluation of Zeroth-Order Optimization Methods on AI-driven Molecule Optimization [78.36413169647408]
We study the effectiveness of various ZO optimization methods for optimizing molecular objectives.
We show the advantages of ZO sign-based gradient descent (ZO-signGD).
We demonstrate the potential effectiveness of ZO optimization methods on widely used benchmark tasks from the Guacamol suite.
arXiv Detail & Related papers (2022-10-27T01:58:10Z)
- Hybrid Decentralized Optimization: Leveraging Both First- and Zeroth-Order Optimizers for Faster Convergence [31.59453616577858]
We show that a distributed system can not only withstand noisier zeroth-order agents but can even benefit from integrating such agents into the optimization process.
Our results hold for both convex and non-convex optimization objectives, and zeroth-order agents can still contribute to joint optimization tasks.
arXiv Detail & Related papers (2022-10-14T10:54:11Z)
- Learning to Optimize: A Primer and A Benchmark [94.29436694770953]
Learning to optimize (L2O) is an emerging approach that leverages machine learning to develop optimization methods.
This article is poised to be the first comprehensive survey and benchmark of L2O for continuous optimization.
arXiv Detail & Related papers (2021-03-23T20:46:20Z)
- Bilevel Optimization: Convergence Analysis and Enhanced Design [63.64636047748605]
Bilevel optimization is a tool for many machine learning problems.
We propose a novel stochastic bilevel optimizer named stocBiO, built on a sample-efficient gradient estimator.
arXiv Detail & Related papers (2020-10-15T18:09:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.