Related papers: PerfCoder: Large Language Models for Interpretable Code Performance Optimization

URL: http://arxiv.org/abs/2512.14018v1
Date: Tue, 16 Dec 2025 02:30:04 GMT
Title: PerfCoder: Large Language Models for Interpretable Code Performance Optimization
Authors: Jiuding Yang, Shengyao Lu, Hongxuan Liu, Shayan Shirahmad Gale Bagi, Zahra Fazel, Tomasz Czajkowski, Di Niu
Abstract summary: PerfCoder is a family of large language models (LLMs) designed to generate performance-enhanced code from source code. PerfCoder is fine-tuned on a curated collection of real-world optimization trajectories with human-readable annotations. PerfCoder surpasses all existing models in both runtime speedup and effective optimization rate.
Score: 15.79612555952707
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language models (LLMs) have achieved remarkable progress in automatic code generation, yet their ability to produce high-performance code remains limited--a critical requirement in real-world software systems. We argue that current LLMs struggle not only due to data scarcity but, more importantly, because they lack supervision that guides interpretable and effective performance improvements. In this work, we introduce PerfCoder, a family of LLMs specifically designed to generate performance-enhanced code from source code via interpretable, customized optimizations. PerfCoder is fine-tuned on a curated collection of real-world optimization trajectories with human-readable annotations, and preference-aligned by reinforcement fine-tuning using runtime measurements, enabling it to propose input-specific improvement strategies and apply them directly without relying on iterative refinement. On the PIE code performance benchmark, PerfCoder surpasses all existing models in both runtime speedup and effective optimization rate, demonstrating that performance optimization cannot be achieved by scale alone but requires optimization strategy awareness. In addition, PerfCoder can generate interpretable feedback about the source code, which, when provided as input to a larger LLM in a planner-and-optimizer cooperative workflow, can further improve outcomes. Specifically, we elevate the performance of 32B models and GPT-5 to new levels on code optimization, substantially surpassing their original performance.

Related papers

FasterPy: An LLM-based Code Execution Efficiency Optimization Framework [11.766544835516974]
Code often suffers from performance bugs. Traditional rule-based methods rely on manually designing and maintaining rules for specific performance bugs. We propose FasterPy, a framework that adapts Large Language Models to optimize the execution efficiency of Python code.
arXiv Detail & Related papers (2025-12-28T07:43:08Z)
Afterburner: Reinforcement Learning Facilitates Self-Improving Code Efficiency Optimization [46.33639431414019]
Large Language Models generate functionally correct solutions but often fall short in code efficiency. We introduce a novel test-time iterative optimization framework to address this.
arXiv Detail & Related papers (2025-05-29T12:14:29Z)
PerfCodeGen: Improving Performance of LLM Generated Code with Execution Feedback [78.89596149768458]
Large Language Models (LLMs) are widely adopted for assisting in software development tasks. We propose PerfCodeGen, a training-free framework that enhances the performance of LLM-generated code.
arXiv Detail & Related papers (2024-11-18T06:22:38Z)
CodeDPO: Aligning Code Models with Self Generated and Verified Source Code [52.70310361822519]
We propose CodeDPO, a framework that integrates preference learning into code generation to improve two key code preference factors: code correctness and efficiency. CodeDPO employs a novel dataset construction method, utilizing a self-generation-and-validation mechanism that simultaneously generates and evaluates code and test cases.
arXiv Detail & Related papers (2024-10-08T01:36:15Z)
Measuring Code Efficiency Optimization Capabilities with ACEOB [7.4056083791645495]
We conduct an in-depth analysis of "code patterns" in the model training dataset, meticulously exploring human-written code. We introduce the Automatic Code Efficiency Optimization Benchmark (ACEOB), which consists of 95,359 pairs of efficient-inefficient code. To our knowledge, ACEOB is the first dataset specifically targeting Python code efficiency optimization.
arXiv Detail & Related papers (2024-08-23T10:10:37Z)
A Problem-Oriented Perspective and Anchor Verification for Code Optimization [43.28045750932116]
Large language models (LLMs) have shown remarkable capabilities in solving various programming tasks. This paper investigates the capabilities of LLMs in optimizing code for minimal execution time.
arXiv Detail & Related papers (2024-06-17T16:10:10Z)
LLM as a Complementary Optimizer to Gradient Descent: A Case Study in Prompt Tuning [69.95292905263393]
We show that gradient-based optimization and high-level LLM-based optimization are complementary to each other and can effectively collaborate in a combined optimization framework.
arXiv Detail & Related papers (2024-05-30T06:24:14Z)
Exploring Data-Efficient Adaptation of Large Language Models for Code Generation [64.5583894165813]
We propose a novel adaptation approach named DEED, which stands for Data-Efficient adaptation with Error-Driven learning for code generation. Experimental results show that, compared to other mainstream fine-tuning approaches, DEED achieves superior performance with few training data.
arXiv Detail & Related papers (2024-02-29T16:09:02Z)
Learning Performance-Improving Code Edits [107.21538852090208]
We introduce a framework for adapting large language models (LLMs) to high-level program optimization. First, we curate a dataset of performance-improving edits made by human programmers, comprising over 77,000 pairs of competitive C++ programming submissions. For prompting, we propose retrieval-based few-shot prompting and chain-of-thought; for finetuning, we propose performance-conditioned generation and synthetic data augmentation based on self-play.
arXiv Detail & Related papers (2023-02-15T18:59:21Z)

This list is automatically generated from the titles and abstracts of the papers in this site.