DeepPERF: A Deep Learning-Based Approach For Improving Software
Performance
- URL: http://arxiv.org/abs/2206.13619v1
- Date: Mon, 27 Jun 2022 20:35:52 GMT
- Title: DeepPERF: A Deep Learning-Based Approach For Improving Software
Performance
- Authors: Spandan Garg, Roshanak Zilouchian Moghaddam, Colin B. Clement, Neel
Sundaresan, Chen Wu
- Abstract summary: We present DeepPERF, a transformer-based approach to suggest performance improvements for C# applications.
Our evaluation shows that our model can generate the same performance improvement suggestion as the developer fix in 53% of the cases.
We evaluate DeepPERF on 50 open source C# repositories on GitHub.
- Score: 8.251500418379942
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Improving software performance is an important yet challenging part of the
software development cycle. Today, the majority of performance inefficiencies
are identified and patched by performance experts. Recent advancements in deep
learning approaches and the widespread availability of open-source data
create a great opportunity to automate the identification and patching of
performance problems. In this paper, we present DeepPERF, a transformer-based
approach to suggest performance improvements for C# applications. We pretrain
DeepPERF on English and source code corpora, then finetune it on the task of
generating performance improvement patches for C# applications. Our
evaluation shows that our model can generate the same performance improvement
suggestion as the developer fix in ~53% of the cases, getting ~34% of them
verbatim in our expert-verified dataset of performance changes made by C#
developers. Additionally, we evaluate DeepPERF on 50 open source C#
repositories on GitHub using both benchmark and unit tests and find that our
model is able to suggest valid performance improvements that can improve both
CPU usage and memory allocations. So far we have submitted 19 pull requests with
28 different performance optimizations and 11 of these PRs have been approved
by the project owners.
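To make the kind of change DeepPERF proposes concrete, below is a hypothetical C# before/after pair of the sort the model might suggest, wrapped in a BenchmarkDotNet harness that measures CPU time and memory allocations together, mirroring the benchmark-based validation described above. The scenario and every name in it are invented for illustration; none of it comes from the paper's dataset.

    using System.Collections.Generic;
    using System.Text;
    using BenchmarkDotNet.Attributes;
    using BenchmarkDotNet.Running;

    // Hypothetical example: an allocation-heavy method and the kind of
    // patch a model like DeepPERF might propose, benchmarked side by side.
    [MemoryDiagnoser] // report allocated bytes alongside CPU time
    public class JoinBenchmark
    {
        private readonly List<string> _items = new List<string>();

        [GlobalSetup]
        public void Setup()
        {
            for (int i = 0; i < 1000; i++) _items.Add("item" + i);
        }

        // Before: string concatenation in a loop copies the whole
        // intermediate string on every iteration (quadratic work).
        [Benchmark(Baseline = true)]
        public string ConcatInLoop()
        {
            string result = "";
            foreach (string s in _items) result += s + ",";
            return result;
        }

        // After: StringBuilder appends into one growing buffer,
        // reducing both CPU time and GC pressure.
        [Benchmark]
        public string UseStringBuilder()
        {
            var sb = new StringBuilder();
            foreach (string s in _items) sb.Append(s).Append(',');
            return sb.ToString();
        }
    }

    public static class Program
    {
        public static void Main() => BenchmarkRunner.Run<JoinBenchmark>();
    }

Here the [MemoryDiagnoser] attribute makes BenchmarkDotNet report allocated bytes per operation next to the timings, so a single run can confirm both the CPU and the allocation improvement that a suggested patch claims.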
Related papers
- Patched MOA: optimizing inference for diverse software development tasks [1.14219428942199]
This paper introduces Patched MOA, an inference optimization technique that significantly enhances the performance of large language models (LLMs).
We evaluate three inference optimization algorithms - Best of N (sketched below), Mixture of Agents, and Monte Carlo Tree Search.
We demonstrate that Patched MOA can boost the performance of smaller models to surpass that of larger, more expensive models.
arXiv Detail & Related papers (2024-07-26T05:34:34Z)
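As a rough sketch of the simplest of those three algorithms, Best of N samples several candidate outputs and keeps the one a scorer ranks highest. The generate and score delegates below are hypothetical stand-ins for an LLM call and a task-specific quality metric, not the paper's actual interfaces.

    using System;
    using System.Linq;

    // Minimal sketch of Best of N: sample N candidates, keep the best.
    // 'generate' and 'score' are hypothetical stand-ins; neither comes
    // from the Patched MOA paper.
    public static class BestOfN
    {
        public static string Pick(
            string prompt,
            Func<string, string> generate, // one sampled completion per call
            Func<string, double> score,    // higher is better
            int n = 8)
        {
            return Enumerable.Range(0, n)
                .Select(_ => generate(prompt))
                .OrderByDescending(score)
                .First();
        }
    }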
- ECCO: Can We Improve Model-Generated Code Efficiency Without Sacrificing Functional Correctness? [12.862825053595934]
ECCO is a benchmark for evaluating program efficiency via two paradigms: natural language (NL) based code generation and history-based code editing.
We find that adding execution information often helps maintain functional correctness, while NL feedback brings larger gains in efficiency; a minimal correctness-then-timing harness is sketched below.
arXiv Detail & Related papers (2024-07-19T05:47:40Z)
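A minimal sketch of the two checks such a benchmark implies, under the assumption of a simple integer-function task: first verify the candidate matches a reference implementation on all test inputs, then compare wall-clock times. ECCO's real harness is more elaborate; everything here is illustrative.

    using System;
    using System.Diagnostics;
    using System.Linq;

    // Sketch: efficiency evaluation gated on functional correctness.
    // The delegates and inputs are hypothetical, not ECCO's harness.
    public static class EfficiencyEval
    {
        public static (bool Correct, double Speedup) Evaluate(
            Func<int, int> reference,
            Func<int, int> candidate,
            int[] testInputs,
            int reps = 1000)
        {
            // 1. Correctness: candidate must match the reference everywhere.
            bool correct = testInputs.All(x => candidate(x) == reference(x));
            if (!correct) return (false, 0.0);

            // 2. Efficiency: compare repeated-run wall-clock times.
            double tRef = Time(reference, testInputs, reps);
            double tCand = Time(candidate, testInputs, reps);
            return (true, tRef / tCand);
        }

        private static double Time(Func<int, int> f, int[] inputs, int reps)
        {
            var sw = Stopwatch.StartNew();
            for (int r = 0; r < reps; r++)
                foreach (var x in inputs) f(x);
            sw.Stop();
            return sw.Elapsed.TotalMilliseconds;
        }
    }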
- Iterative or Innovative? A Problem-Oriented Perspective for Code Optimization [81.88668100203913]
Large language models (LLMs) have demonstrated strong capabilities in solving a wide range of programming tasks.
In this paper, we explore code optimization with a focus on performance enhancement, specifically aiming to optimize code for minimal execution time.
arXiv Detail & Related papers (2024-06-17T16:10:10Z)
- LLM-Assisted Code Cleaning For Training Accurate Code Generators [53.087019724256606]
We investigate data quality for code and find that making code more structured and readable improves the code generation performance of models trained on it.
We build a novel data-cleaning pipeline that uses these principles to transform existing programs.
We evaluate our approach on two challenging algorithmic code generation benchmarks and find that fine-tuning CodeLLaMa-7B improves the performance by up to 30% compared to fine-tuning on the original dataset.
arXiv Detail & Related papers (2023-11-25T02:45:50Z)
- Federated Learning of Large Language Models with Parameter-Efficient Prompt Tuning and Adaptive Optimization [71.87335804334616]
Federated learning (FL) is a promising paradigm to enable collaborative model training with decentralized data.
The training process of Large Language Models (LLMs) generally requires updating a significant number of parameters, which is costly in a federated setting.
This paper proposes an efficient partial prompt tuning approach to improve performance and efficiency simultaneously.
arXiv Detail & Related papers (2023-10-23T16:37:59Z)
- Towards General and Efficient Online Tuning for Spark [55.30868031221838]
We present a general and efficient Spark tuning framework that can deal with the three issues simultaneously.
We have implemented this framework as an independent cloud service, and applied it to the data platform in Tencent.
arXiv Detail & Related papers (2023-09-05T02:16:45Z)
- Judging Adam: Studying the Performance of Optimization Methods on ML4SE Tasks [2.8961929092154697]
We test the performance of various optimizers on deep learning models for source code.
We find that the choice of optimizer can have a significant impact on the model quality.
We suggest that the ML4SE community should consider using RAdam instead of Adam as the default optimizer for code-related deep learning tasks.
arXiv Detail & Related papers (2023-03-06T22:49:20Z)
- Learning Performance-Improving Code Edits [107.21538852090208]
We introduce a framework for adapting large language models (LLMs) to high-level program optimization.
First, we curate a dataset of performance-improving edits made by human programmers, comprising over 77,000 competitive C++ programming submission pairs.
For prompting, we propose retrieval-based few-shot prompting and chain-of-thought; for finetuning, we use performance-conditioned generation and synthetic data augmentation based on self-play. A few-shot prompt-assembly sketch follows this entry.
arXiv Detail & Related papers (2023-02-15T18:59:21Z)
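Below is a rough sketch of the retrieval-based few-shot prompting idea: fetch the k stored slow/fast pairs most similar to the input program and prepend them to the prompt. The EditPair record, the similarity delegate, and the prompt wording are all hypothetical, not the paper's actual pipeline.

    using System;
    using System.Collections.Generic;
    using System.Linq;

    // Sketch of retrieval-based few-shot prompting for code speedups,
    // in the spirit of the PIE paper above. All names are invented.
    public record EditPair(string Slow, string Fast);

    public static class FewShotPrompt
    {
        // Build a prompt from the k edit pairs most similar to the input.
        public static string Build(
            string slowProgram,
            IEnumerable<EditPair> corpus,
            Func<string, string, double> similarity, // e.g., embedding cosine
            int k = 3)
        {
            var shots = corpus
                .OrderByDescending(p => similarity(p.Slow, slowProgram))
                .Take(k)
                .Select(p => $"# Slow:\n{p.Slow}\n# Fast:\n{p.Fast}\n");
            return string.Join("\n", shots)
                 + $"# Slow:\n{slowProgram}\n# Fast:\n";
        }
    }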
- VeLO: Training Versatile Learned Optimizers by Scaling Up [67.90237498659397]
We leverage the same scaling approach behind the success of deep learning to learn versatile optimizers.
We train an optimizer for deep learning which is itself a small neural network that ingests gradients and outputs parameter updates (a toy version is sketched below).
We open source our learned optimizers, the meta-training code, the associated train and test data, and an extensive benchmark suite with baselines at velo-code.io.
arXiv Detail & Related papers (2022-11-17T18:39:07Z)
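As a toy illustration of that idea, the sketch below replaces a hand-designed update rule with a "learned" one: the update is a function of per-parameter features (gradient and momentum) whose coefficients would come from meta-training. VeLO's actual optimizer is a small neural network over many more features; the linear form and the constants here are invented for illustration.

    // Toy "learned optimizer" step in the spirit of VeLO: instead of a
    // hand-designed rule like SGD's  w -= lr * g,  the update is produced
    // by a learned function of per-parameter features. The two constants
    // stand in for VeLO's meta-trained network; their values are invented.
    public static class LearnedOptimizerStep
    {
        private const double WGrad = 0.01;      // learned weight on the raw gradient
        private const double WMomentum = 0.009; // learned weight on the momentum

        public static void Apply(double[] w, double[] grad, double[] momentum)
        {
            for (int i = 0; i < w.Length; i++)
            {
                momentum[i] = 0.9 * momentum[i] + grad[i];
                // The "network": here just a linear combination, where VeLO
                // uses a small neural net ingesting many per-parameter features.
                double update = WGrad * grad[i] + WMomentum * momentum[i];
                w[i] -= update;
            }
        }
    }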