FasterPy: An LLM-based Code Execution Efficiency Optimization Framework
- URL: http://arxiv.org/abs/2512.22827v1
- Date: Sun, 28 Dec 2025 07:43:08 GMT
- Title: FasterPy: An LLM-based Code Execution Efficiency Optimization Framework
- Authors: Yue Wu, Minghao Han, Ruiyin Li, Peng Liang, Amjed Tahir, Zengyang Li, Qiong Feng, Mojtaba Shahin,
- Abstract summary: Code often suffers from performance bugs. Traditional rule-based methods rely on manually designing and maintaining rules for specific performance bugs. We propose FasterPy, a framework that adapts Large Language Models to optimize the execution efficiency of Python code.
- Score: 11.766544835516974
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Code often suffers from performance bugs. These bugs necessitate the research and practice of code optimization. Traditional rule-based methods rely on manually designing and maintaining rules for specific performance bugs (e.g., redundant loops, repeated computations), making them labor-intensive and limited in applicability. In recent years, machine learning and deep learning-based methods have emerged as promising alternatives by learning optimization heuristics from annotated code corpora and performance measurements. However, these approaches usually depend on specific program representations and meticulously crafted training datasets, making them costly to develop and difficult to scale. With the rapid rise of Large Language Models (LLMs), their remarkable capabilities in code generation have opened new avenues for automated code optimization. In this work, we propose FasterPy, a low-cost and efficient framework that adapts LLMs to optimize the execution efficiency of Python code. FasterPy combines Retrieval-Augmented Generation (RAG), supported by a knowledge base constructed from existing performance-improving code pairs and corresponding performance measurements, with Low-Rank Adaptation (LoRA) to enhance code optimization performance. Our experimental results on the Performance Improving Code Edits (PIE) benchmark demonstrate that our method outperforms existing models on multiple metrics. The FasterPy tool and the experimental results are available at https://github.com/WuYue22/fasterpy.
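To make the retrieval step concrete, here is a minimal sketch of how a RAG-style prompt for code optimization could be assembled from a knowledge base of performance-improving code pairs. All names (`KNOWLEDGE_BASE`, `build_prompt`) and the similarity measure are illustrative assumptions, not FasterPy's actual implementation.

```python
# Hypothetical sketch of the RAG step: retrieve a similar slow/fast pair
# from a knowledge base of performance-improving edits and build a few-shot
# prompt. Entry names and the similarity measure are invented for illustration.
import difflib

# Each entry: (slow_code, fast_code, measured_speedup)
KNOWLEDGE_BASE = [
    ("result = []\nfor x in xs:\n    result.append(x * x)",
     "result = [x * x for x in xs]",
     1.4),
]

def retrieve(query_code: str, k: int = 1):
    """Rank knowledge-base entries by textual similarity to the query."""
    scored = sorted(
        KNOWLEDGE_BASE,
        key=lambda e: difflib.SequenceMatcher(None, e[0], query_code).ratio(),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query_code: str) -> str:
    """Assemble a few-shot prompt from retrieved performance-improving edits."""
    parts = ["Optimize the Python code for execution speed.\n"]
    for slow, fast, speedup in retrieve(query_code):
        parts.append(f"# Slow version:\n{slow}\n# Fast version ({speedup}x):\n{fast}\n")
    parts.append(f"# Slow version:\n{query_code}\n# Fast version:")
    return "\n".join(parts)

print(build_prompt("out = []\nfor v in vals:\n    out.append(v + 1)"))
```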
Related papers
- PerfCoder: Large Language Models for Interpretable Code Performance Optimization [15.79612555952707]
PerfCoder is a family of large language models (LLMs) designed to generate performance-enhanced code from source code. PerfCoder is fine-tuned on a curated collection of real-world optimization trajectories with human-readable annotations. PerfCoder surpasses all existing models in both runtime speedup and effective optimization rate.
arXiv Detail & Related papers (2025-12-16T02:30:04Z) - LLM4EFFI: Leveraging Large Language Models to Enhance Code Efficiency and Correctness [38.399282089600284]
Large Language Models (LLMs) have demonstrated impressive performance in code generation. LLM4EFFI (Large Language Model for Code Efficiency) is a novel framework that enables LLMs to generate code that balances both efficiency and correctness.
arXiv Detail & Related papers (2025-02-17T07:01:18Z) - PerfCodeGen: Improving Performance of LLM Generated Code with Execution Feedback [78.89596149768458]
Large Language Models (LLMs) are widely adopted for assisting in software development tasks. We propose PerfCodeGen, a training-free framework that enhances the performance of LLM-generated code.
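A hedged sketch of the execution-feedback idea behind such training-free refinement: generate code, measure its runtime, and feed the measurement back into the prompt for another round. `llm_generate` is a stand-in for any model call and `solve` is an assumed entry point; this is not PerfCodeGen's actual interface.

```python
# Illustrative execution-feedback loop: time each candidate and ask the
# model to improve on the measured result. Conceptual sketch only.
import time

def llm_generate(prompt: str) -> str:
    raise NotImplementedError("plug in your model call here")

def measure_runtime(code: str, test_input) -> float:
    """Execute a candidate and time its assumed entry point `solve`."""
    namespace = {}
    exec(code, namespace)  # assumes the candidate defines solve()
    start = time.perf_counter()
    namespace["solve"](test_input)
    return time.perf_counter() - start

def refine(task: str, test_input, rounds: int = 3) -> str:
    """Keep the fastest candidate seen across feedback rounds."""
    best = llm_generate(task)
    best_time = measure_runtime(best, test_input)
    for _ in range(rounds):
        feedback = (f"{task}\n# Previous solution took {best_time:.4f}s; "
                    f"make it faster:\n{best}")
        code = llm_generate(feedback)
        t = measure_runtime(code, test_input)
        if t < best_time:
            best, best_time = code, t
    return best
```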
arXiv Detail & Related papers (2024-11-18T06:22:38Z) - CodeDPO: Aligning Code Models with Self Generated and Verified Source Code [52.70310361822519]
We propose CodeDPO, a framework that integrates preference learning into code generation to improve two key code preference factors: code correctness and efficiency. CodeDPO employs a novel dataset construction method, utilizing a self-generation-and-validation mechanism that simultaneously generates and evaluates code and test cases.
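The self-generation-and-validation idea can be approximated as follows: execute each candidate against test cases, rank by correctness and then speed, and pair the best with the worst as a (chosen, rejected) example for preference learning. The `solve` entry point and the ranking rule are assumptions for illustration, not CodeDPO's code.

```python
# Rough approximation of building preference pairs by validating candidates.
import time

def run_tests(code: str, tests):
    """Return (tests_passed, total_runtime) for one candidate solution.
    Assumes each candidate defines a function named `solve`."""
    namespace = {}
    exec(code, namespace)
    passed, elapsed = 0, 0.0
    for args, expected in tests:
        start = time.perf_counter()
        try:
            ok = namespace["solve"](*args) == expected
        except Exception:
            ok = False
        elapsed += time.perf_counter() - start
        passed += ok
    return passed, elapsed

def make_preference_pair(candidates, tests):
    """Most-correct-then-fastest candidate vs. the worst one."""
    scored = [(c, *run_tests(c, tests)) for c in candidates]
    scored.sort(key=lambda t: (-t[1], t[2]))  # more passes first, then faster
    return scored[0][0], scored[-1][0]        # (chosen, rejected) for DPO

candidates = [
    "def solve(n):\n    return sum(range(n + 1))",  # correct, slower
    "def solve(n):\n    return n * (n + 1) // 2",   # correct, faster
]
print(make_preference_pair(candidates, tests=[((1000,), 500500)]))
```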
arXiv Detail & Related papers (2024-10-08T01:36:15Z) - ECCO: Can We Improve Model-Generated Code Efficiency Without Sacrificing Functional Correctness? [12.862825053595934]
ECCO is a benchmark for evaluating program efficiency via two paradigms: natural language (NL) based code generation and history-based code editing.
We find that adding execution information often helps maintain functional correctness, while natural language feedback yields larger efficiency gains.
arXiv Detail & Related papers (2024-07-19T05:47:40Z) - How Efficient is LLM-Generated Code? A Rigorous & High-Standard Benchmark [39.13045037676502]
The development of large language models (LLMs) has significantly pushed the frontiers of program synthesis. Most evaluation frameworks focus on the (functional) correctness of generated code; efficiency, an important measure of code quality, has been overlooked in existing evaluations. We develop ENAMEL, a rigorous and high-standard benchmark for evaluating the capability of LLMs in generating efficient code.
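To illustrate what an efficiency benchmark measures, here is a simple speedup ratio of a candidate against a reference solution. ENAMEL's actual eff@k metric is considerably more rigorous (e.g., carefully calibrated time limits and hard test inputs); this sketch only conveys the core idea, and all names are invented.

```python
# Minimal efficiency measurement: candidate runtime relative to a reference.
import time

def best_time(fn, args, repeats: int = 5) -> float:
    """Best-of-N wall-clock time to damp scheduling noise."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best

def avg_speedup(candidate, reference, inputs) -> float:
    """Average of reference_time / candidate_time; > 1 means faster."""
    ratios = [best_time(reference, a) / best_time(candidate, a) for a in inputs]
    return sum(ratios) / len(ratios)

slow = lambda n: sum(range(n + 1))  # linear-time triangular number
fast = lambda n: n * (n + 1) // 2   # constant-time closed form
print(avg_speedup(fast, slow, [(100_000,), (1_000_000,)]))
```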
arXiv Detail & Related papers (2024-06-10T04:19:20Z) - Exploring Data-Efficient Adaptation of Large Language Models for Code Generation [64.5583894165813]
We propose a novel adaptation approach named DEED, which stands for Data-Efficient adaptation with Error-Driven learning for code generation. Experimental results show that, compared to other mainstream fine-tuning approaches, DEED achieves superior performance with less training data.
arXiv Detail & Related papers (2024-02-29T16:09:02Z) - StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback [58.20547418182074]
We introduce StepCoder, a novel framework for code generation, consisting of two main components.
CCCS addresses the exploration challenge by breaking the long-sequence code generation task into a Curriculum of Code Completion Subtasks.
FGO optimizes the model only on executed code segments, masking unexecuted ones to provide Fine-Grained Optimization (see the sketch below).
Our method improves the ability to explore the output space and outperforms state-of-the-art approaches in corresponding benchmarks.
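The FGO idea can be pictured by tracing which lines of a candidate program actually execute and building a 0/1 mask one could apply to per-line losses. This is a conceptual demo, not StepCoder's implementation; real FGO masks token-level RL losses.

```python
# Conceptual demo: trace executed lines of a snippet to build a loss mask.
import sys

def executed_line_mask(code: str, entry: str = "solve", args=()) -> list:
    """Return a per-line 0/1 mask: 1 if the line ran, else 0."""
    lines = code.splitlines()
    hit = set()

    def tracer(frame, event, arg):
        if event == "line" and frame.f_code.co_filename == "<snippet>":
            hit.add(frame.f_lineno)
        return tracer

    compiled = compile(code, "<snippet>", "exec")
    namespace = {}
    sys.settrace(tracer)
    try:
        exec(compiled, namespace)
        namespace[entry](*args)
    finally:
        sys.settrace(None)
    return [1 if i + 1 in hit else 0 for i in range(len(lines))]

snippet = """def solve(n):
    if n > 0:
        return n * 2
    return -n
"""
# The last line never runs for n=3, so it is masked out: [1, 1, 1, 0]
print(executed_line_mask(snippet, args=(3,)))
```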
arXiv Detail & Related papers (2024-02-02T13:14:31Z) - PerfRL: A Small Language Model Framework for Efficient Code Optimization [14.18092813639534]
In this paper, we introduce PerfRL, an innovative framework designed to tackle the problem of code optimization. Our framework leverages the capabilities of small language models (SLMs) and reinforcement learning (RL). Our approach achieves similar or better results compared to state-of-the-art models using shorter training times and smaller pre-trained models.
arXiv Detail & Related papers (2023-12-09T19:50:23Z) - LLM-Assisted Code Cleaning For Training Accurate Code Generators [53.087019724256606]
We investigate data quality for code and find that making the code more structured and readable leads to improved code generation performance of the system.
We build a novel data-cleaning pipeline that uses these principles to transform existing programs.
We evaluate our approach on two challenging algorithmic code generation benchmarks and find that fine-tuning CodeLLaMa-7B improves the performance by up to 30% compared to fine-tuning on the original dataset.
arXiv Detail & Related papers (2023-11-25T02:45:50Z) - Learning Performance-Improving Code Edits [107.21538852090208]
We introduce a framework for adapting large language models (LLMs) to high-level program optimization.
First, we curate a dataset of over 77,000 competitive C++ programming submission pairs, capturing performance-improving edits made by human programmers.
For prompting, we propose retrieval-based few-shot prompting and chain-of-thought; for finetuning, we use performance-conditioned generation and synthetic data augmentation based on self-play.
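Performance-conditioned generation can be pictured as prepending a target-performance tag to the prompt so the finetuned model learns to associate the tag with highly optimized outputs. The tag format below is invented for illustration, not PIE's exact scheme.

```python
# Illustrative performance-conditioned prompt; the tag format is hypothetical.
def conditioned_prompt(slow_code: str, tier: str = "10/10") -> str:
    """During finetuning, `tier` could encode the achieved speedup decile of
    the target program; at inference time, the top tier is requested."""
    return (
        f"# target performance tier: {tier}\n"
        f"# slower version:\n{slow_code}\n"
        f"# faster version:\n"
    )

print(conditioned_prompt("total = 0\nfor i in range(n):\n    total += i"))
```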
arXiv Detail & Related papers (2023-02-15T18:59:21Z)