Protean Compiler: An Agile Framework to Drive Fine-grain Phase Ordering
- URL: http://arxiv.org/abs/2602.06142v1
- Date: Thu, 05 Feb 2026 19:24:05 GMT
- Title: Protean Compiler: An Agile Framework to Drive Fine-grain Phase Ordering
- Authors: Amir H. Ashouri, Shayan Shirahmad Gale Bagi, Kavin Satheeskumar, Tejas Srikanth, Jonathan Zhao, Ibrahim Saidoun, Ziwen Wang, Bryan Chan, Tomasz S. Czajkowski
- Abstract summary: Protean Compiler is an agile framework that enables LLVM with built-in phase-ordering capabilities at a fine-grained scope. The framework also comprises a library of more than 140 handcrafted static feature collection methods at varying scopes. This paper showcases speedup gains of up to 4.1% on average and up to 15.7% on select Cbench applications with respect to LLVM's O3.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The phase ordering problem has been a long-standing challenge since the late 1970s, yet it remains open: the optimization space is vast and unbounded, making it an open-ended problem with no finite solution, although one can limit the scope by reducing the number and length of optimization sequences. Traditionally, such locally optimized decisions are made by hand-coded algorithms tuned for a small number of benchmarks, often requiring significant effort to retune when the benchmark suite changes. Over the past 20 years, machine learning has been employed to build performance models that improve the selection and ordering of compiler optimizations; however, these approaches are not baked seamlessly into the compiler and have never materialized at the fine-grained scope of code segments. This paper presents Protean Compiler, an agile framework that equips LLVM with built-in phase-ordering capabilities at a fine-grained scope. The framework also comprises a library of more than 140 handcrafted static feature collection methods at varying scopes, and the experimental results showcase speedups of up to 4.1% on average and up to 15.7% on select Cbench applications with respect to LLVM's O3, at the cost of only a few extra seconds of build time on Cbench. Additionally, Protean Compiler allows easy integration with third-party ML frameworks and Large Language Models; this two-step optimization yields 10.1% and 8.5% speedups with respect to O3 on Cbench's Susan and Jpeg applications. Protean Compiler is seamlessly integrated into LLVM and can be used as a new, enhanced, full-fledged compiler. We plan to release the project to the open-source community in the near future.
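The core difficulty the abstract describes, namely that the profitability of a pass depends on which passes ran before it, can be sketched in miniature. The pass names and the toy scoring model below are illustrative stand-ins, not Protean Compiler's actual method: a real setup would invoke the compiler (for example, `opt -passes=...`) and benchmark the resulting binary, and would prune the search with learned models over static features rather than enumerate every order.

```python
import itertools

# Hypothetical stand-in for "compile with this pass order and benchmark it".
# The base weights and the order-dependent bonus mimic the interaction
# effects that make phase ordering hard.
def evaluate(pipeline):
    score = 0.0
    for i, p in enumerate(pipeline):
        score += {"inline": 3.0, "gvn": 2.0, "licm": 1.5, "sroa": 1.0}[p]
        if p == "gvn" and "inline" in pipeline[:i]:
            score += 2.0  # gvn finds more redundancies after inlining
    return score

def best_order(passes):
    # Exhaustive search over all orderings; feasible only for tiny pass
    # sets, which is why real frameworks rely on learned cost models.
    return max(itertools.permutations(passes), key=evaluate)

order = best_order(["inline", "gvn", "licm", "sroa"])
```

Even this toy search recovers the order-sensitivity: the best pipeline runs `inline` before `gvn`, whereas a fixed pipeline in the wrong order leaves that gain on the table.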
Related papers
- OptiML: An End-to-End Framework for Program Synthesis and CUDA Kernel Optimization [21.882017397032964]
We present OptiML, an end-to-end framework that maps either natural-language intent or input code to performance-optimized kernels. A search-based stage (OptiML-X) then refines either synthesized or user-provided kernels using LLM-aware Monte Carlo Tree Search, guided by a hardware-driven reward derived from profiler feedback.
arXiv Detail & Related papers (2026-02-12T04:50:19Z) - nncase: An End-to-End Compiler for Efficient LLM Deployment on Heterogeneous Storage Architectures [7.460240094212613]
We present nncase, an end-to-end compilation framework designed to unify optimization across diverse targets. nncase integrates three key modules: Auto Vectorize for adapting to heterogeneous computing units, Auto Distribution for searching parallel strategies, and Auto Schedule for maximizing on-chip cache locality.
arXiv Detail & Related papers (2025-12-25T08:27:53Z) - QiMeng-NeuComBack: Self-Evolving Translation from IR to Assembly Code [52.66657751895655]
Large Language Models (LLMs) offer a compelling new paradigm: Neural Compilation. This paper introduces NeuComBack, a novel benchmark dataset specifically designed for IR-to-assembly compilation. We propose a self-evolving prompt optimization method that enables LLMs to evolve their internal prompt strategies.
arXiv Detail & Related papers (2025-11-03T03:20:26Z) - EoRA: Fine-tuning-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation [84.70637613266835]
EoRA is a fine-tuning-free method that augments compressed Large Language Models with low-rank matrices. EoRA consistently outperforms prior training-free low-rank methods in recovering the accuracy of compressed LLMs.
arXiv Detail & Related papers (2024-10-28T17:59:03Z) - ACPO: AI-Enabled Compiler Framework [1.752593459729982]
This paper presents ACPO: An AI-Enabled Compiler Framework. It provides LLVM with simple and comprehensive tools to benefit from employing ML models for different optimization passes. We show that ACPO can provide a combined speedup of 4.5% on Polybench and 2.4% on Cbench when compared with LLVM's O3.
arXiv Detail & Related papers (2023-12-15T17:49:24Z) - The Next 700 ML-Enabled Compiler Optimizations [0.9536052347069729]
We propose ML-Compiler-Bridge to enable ML model development within a traditional Python framework.
We evaluate it on both research and production use cases, for training and inference, over several optimization problems, multiple compilers and its versions, and gym infrastructures.
arXiv Detail & Related papers (2023-11-17T08:27:17Z) - SparseOptimizer: Sparsify Language Models through Moreau-Yosida
Regularization and Accelerate via Compiler Co-design [0.685316573653194]
This paper introduces SparseOptimizer, a novel deep learning optimizer that exploits Moreau-Yosida regularization to induce sparsity in large language models such as BERT, ALBERT and GPT.
SparseOptimizer's plug-and-play functionality eradicates the need for code modifications, making it a universally adaptable tool for a wide array of large language models.
Empirical evaluations on benchmark datasets such as GLUE, RACE, SQuAD1, and SQuAD2 confirm that models sparsified using SparseOptimizer achieve performance comparable to their dense counterparts.
arXiv Detail & Related papers (2023-06-27T17:50:26Z) - Learning Performance-Improving Code Edits [107.21538852090208]
We introduce a framework for adapting large language models (LLMs) to high-level program optimization.
First, we curate a dataset of over 77,000 performance-improving edits made by human programmers, drawn from competitive C++ programming submission pairs.
For prompting, we propose retrieval-based few-shot prompting and chain-of-thought; for finetuning, we use performance-conditioned generation and synthetic data augmentation based on self-play.
arXiv Detail & Related papers (2023-02-15T18:59:21Z) - Learning to Superoptimize Real-world Programs [79.4140991035247]
We propose a framework to learn to superoptimize real-world programs by using neural sequence-to-sequence models.
We introduce the Big Assembly benchmark, a dataset consisting of over 25K real-world functions mined from open-source projects in x86-64 assembly.
arXiv Detail & Related papers (2021-09-28T05:33:21Z) - Enabling Retargetable Optimizing Compilers for Quantum Accelerators via
a Multi-Level Intermediate Representation [78.8942067357231]
We present a multi-level quantum-classical intermediate representation (IR) that enables an optimizing, retargetable, ahead-of-time compiler.
We support the entire gate-based OpenQASM 3 language and provide custom extensions for common quantum programming patterns and improved syntax.
Our work results in compile times that are 1000x faster than standard Pythonic approaches, and 5-10x faster than comparative standalone quantum language compilers.
arXiv Detail & Related papers (2021-09-01T17:29:47Z) - A Reinforcement Learning Environment for Polyhedral Optimizations [68.8204255655161]
We propose a shape-agnostic formulation for the space of legal transformations in the polyhedral model as a Markov Decision Process (MDP).
Instead of using transformations, the formulation is based on an abstract space of possible schedules.
Our generic MDP formulation enables using reinforcement learning to learn optimization policies over a wide range of loops.
arXiv Detail & Related papers (2021-04-28T12:41:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.