Related papers: ECO: An LLM-Driven Efficient Code Optimizer for Warehouse Scale Computers

URL: http://arxiv.org/abs/2503.15669v1
Date: Wed, 19 Mar 2025 19:52:35 GMT
Title: ECO: An LLM-Driven Efficient Code Optimizer for Warehouse Scale Computers
Authors: Hannah Lin, Martin Maas, Maximilian Roquemore, Arman Hasanzadeh, Fred Lewis, Yusuf Simonson, Tzu-Wei Yang, Amir Yazdanbakhsh, Deniz Altinbüken, Florin Papa, Maggie Nolan Edmonds, Aditya Patil, Don Schwarz, Satish Chandra, Chris Kennelly, Milad Hashemi, Parthasarathy Ranganathan
Abstract summary: This paper introduces ECO (Efficient Code Optimizer), a system that automatically refactors source code to improve performance at scale. Over the past year, ECO has consistently resulted in significant performance savings every quarter. On average, the savings produced per quarter are equivalent to over 500k normalized CPU cores.
Score: 13.56820317396104
License: http://creativecommons.org/licenses/by/4.0/
Abstract: With the end of Moore's Law, optimizing code for performance has become paramount for meeting ever-increasing compute demands, particularly in hyperscale data centers where even small efficiency gains translate to significant resource and energy savings. Traditionally, this process requires significant programmer effort to identify optimization opportunities, modify the code to implement the optimization, and carefully deploy and measure the optimization's impact. Despite a significant amount of work on automating program edits and promising results in small-scale settings, such performance optimizations have remained elusive in large real-world production environments, due to the scale, high degree of complexity, and reliability required. This paper introduces ECO (Efficient Code Optimizer), a system that automatically refactors source code to improve performance at scale. To achieve these performance gains, ECO searches through historical commits at scale to create a dictionary of performance anti-patterns that these commits addressed. These anti-patterns are used to search for similar patterns in a code base of billions of lines of code, pinpointing other code segments with similar potential optimization opportunities. Using a fine-tuned LLM, ECO then automatically refactors the code to generate and apply similar edits. Next, ECO verifies the transformed code, submits it for code review, and measures the impact of the optimization in production. Currently deployed on Google's hyperscale production fleet, this system has driven >25k changed lines of production code, across over 6.4k submitted commits, with a >99.5% production success rate. Over the past year, ECO has consistently resulted in significant performance savings every quarter. On average, the savings produced per quarter are equivalent to over 500k normalized CPU cores.
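The search-and-rewrite loop described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `AntiPattern` class, the single `range-for-copy` dictionary entry, and the function names are hypothetical and do not come from the paper. In particular, a regular-expression substitution stands in for the fine-tuned LLM that ECO uses to generate edits, and the verification, code review, and production measurement stages are omitted.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class AntiPattern:
    """One hypothetical entry in a dictionary of performance anti-patterns."""
    name: str
    pattern: re.Pattern
    replacement: str


# Hypothetical dictionary with a single entry: a C++ range-for loop that
# copies each element instead of binding it by const reference.
ANTI_PATTERNS = [
    AntiPattern(
        name="range-for-copy",
        pattern=re.compile(r"for \(auto (\w+) : "),
        replacement=r"for (const auto& \1 : ",
    ),
]


def find_candidates(source: str):
    """Yield (line_number, anti_pattern) for every line matching an entry."""
    for lineno, line in enumerate(source.splitlines(), start=1):
        for anti_pattern in ANTI_PATTERNS:
            if anti_pattern.pattern.search(line):
                yield lineno, anti_pattern


def propose_edit(line: str, anti_pattern: AntiPattern) -> str:
    """Stand-in for the LLM edit step: apply a fixed textual rewrite."""
    return anti_pattern.pattern.sub(anti_pattern.replacement, line)


def optimize(source: str) -> str:
    """Rewrite every candidate line and return the edited source."""
    lines = source.splitlines()
    for lineno, anti_pattern in find_candidates(source):
        lines[lineno - 1] = propose_edit(lines[lineno - 1], anti_pattern)
    return "\n".join(lines)


if __name__ == "__main__":
    cpp_source = "for (auto item : items) {\n  use(item);\n}"
    print(optimize(cpp_source))
```

A real deployment would need far more than pattern matching, since the abstract stresses verification and a >99.5% production success rate; the sketch only shows how a dictionary of anti-patterns can drive candidate discovery and edit proposal.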

Related papers

PerfCoder: Large Language Models for Interpretable Code Performance Optimization [15.79612555952707]
PerfCoder is a family of large language models (LLMs) designed to generate performance-enhanced code from source code. PerfCoder is fine-tuned on a curated collection of real-world optimization trajectories with human-readable annotations. PerfCoder surpasses all existing models in both runtime speedup and effective optimization rate.
arXiv Detail & Related papers (2025-12-16T02:30:04Z)
ECO: Enhanced Code Optimization via Performance-Aware Prompting for Code-LLMs [10.020128936428078]
ECO is a performance-aware prompting framework for code optimization. Our empirical studies highlight that ECO prompting significantly improves code-LLMs' ability to generate efficient code.
arXiv Detail & Related papers (2025-10-12T09:29:24Z)
GA4GC: Greener Agent for Greener Code via Multi-Objective Configuration Optimization [3.3200397756832047]
This paper introduces GA4GC, the first framework to systematically optimize coding agent runtime (greener agent) and code performance (greener code) trade-offs. Evaluation on the SWE-Perf benchmark demonstrates up to 135x hypervolume improvement, reducing agent runtime by 37.7% while improving correctness.
arXiv Detail & Related papers (2025-10-05T10:34:30Z)
Afterburner: Reinforcement Learning Facilitates Self-Improving Code Efficiency Optimization [46.33639431414019]
Large Language Models generate functionally correct solutions but often fall short in code efficiency. We introduce a novel test-time iterative optimization framework to address this.
arXiv Detail & Related papers (2025-05-29T12:14:29Z)
ARCS: Agentic Retrieval-Augmented Code Synthesis with Iterative Refinement [1.8749305679160366]
ARCS integrates Retrieval-Augmented Generation with Chain-of-Thought reasoning. An agent-based RAG mechanism retrieves relevant code snippets. Real-time execution feedback drives the synthesis of candidate solutions.
arXiv Detail & Related papers (2025-04-29T05:15:52Z)
Can We Make Code Green? Understanding Trade-Offs in LLMs vs. Human Code Optimizations [45.243401722182554]
Large language models (LLMs) claim to assist developers in optimizing code for performance and energy efficiency. This work focuses on software written in Matlab, which is widely used in both academia and industry for scientific and engineering applications. We analyze energy-focused optimization on 400 scripts across 100 top GitHub repositories.
arXiv Detail & Related papers (2025-03-26T00:27:29Z)
Reward-Guided Speculative Decoding for Efficient LLM Reasoning [80.55186052123196]
We introduce Reward-Guided Speculative Decoding (RSD), a novel framework aimed at improving the efficiency of inference in large language models (LLMs). RSD incorporates a controlled bias to prioritize high-reward outputs, in contrast to existing speculative decoding methods that enforce strict unbiasedness. RSD delivers significant efficiency gains against decoding with the target model only, while achieving significantly better accuracy than parallel decoding methods on average.
arXiv Detail & Related papers (2025-01-31T17:19:57Z)
Optimizing Code Runtime Performance through Context-Aware Retrieval-Augmented Generation [8.574686422653345]
Auto achieves a 7.3% improvement in execution efficiency over GPT-4o across common generated executable code. This study introduces an in-context learning approach designed to bridge the gap by enabling LLMs to automatically generate optimized code.
arXiv Detail & Related papers (2025-01-28T04:00:35Z)
Thinking Before Running! Efficient Code Generation with Thorough Exploration and Optimal Refinement [47.89758553708932]
We introduce ThinkCoder, a framework that combines thorough exploration with optimal refinement. The exploration phase diversifies the solution space by searching for potential solutions, followed by a refinement phase that enhances precision. To further minimize test-time computation overhead, we introduce preference-driven optimization with Reinforced Self-Training (ReST).
arXiv Detail & Related papers (2024-12-30T07:02:15Z)
Less is More: Towards Green Code Large Language Models via Unified Structural Pruning [27.428983811427827]
We propose Flab-Pruner, an innovative unified structural pruning method that combines vocabulary, layer, and Feed-Forward Network (FFN) pruning. The results demonstrate that Flab-Pruner retains 97% of the original performance after pruning 22% of the parameters and achieves the same or even better performance after post-training.
arXiv Detail & Related papers (2024-12-20T14:13:09Z)
PerfCodeGen: Improving Performance of LLM Generated Code with Execution Feedback [78.89596149768458]
Large Language Models (LLMs) are widely adopted for assisting in software development tasks. We propose PerfCodeGen, a training-free framework that enhances the performance of LLM-generated code.
arXiv Detail & Related papers (2024-11-18T06:22:38Z)
CodeDPO: Aligning Code Models with Self Generated and Verified Source Code [52.70310361822519]
We propose CodeDPO, a framework that integrates preference learning into code generation to improve two key code preference factors: code correctness and efficiency. CodeDPO employs a novel dataset construction method, utilizing a self-generation-and-validation mechanism that simultaneously generates and evaluates code and test cases.
arXiv Detail & Related papers (2024-10-08T01:36:15Z)
Measuring Code Efficiency Optimization Capabilities with ACEOB [7.4056083791645495]
We conduct an in-depth analysis of "code patterns" in the model training dataset, meticulously exploring human-written code. We introduce the Automatic Code Efficiency Optimization Benchmark (ACEOB), which consists of 95,359 pairs of efficient-inefficient code. To our knowledge, ACEOB is the first dataset specifically targeting Python code efficiency optimization.
arXiv Detail & Related papers (2024-08-23T10:10:37Z)
Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark [166.40879020706151]
This paper proposes a shift towards BP-free, zeroth-order (ZO) optimization as a solution for reducing memory costs during fine-tuning. Unlike traditional ZO-SGD methods, our work expands the exploration to a wider array of ZO optimization techniques. Our study unveils previously overlooked optimization principles, highlighting the importance of task alignment, the role of the forward gradient method, and the balance between algorithm complexity and fine-tuning performance.
arXiv Detail & Related papers (2024-02-18T14:08:48Z)
Mercury: A Code Efficiency Benchmark for Code Large Language Models [41.51235610016959]
We present Mercury, the first code efficiency benchmark for Large Language Models for Code (Code LLMs). It comprises 1,889 Python tasks, each accompanied by adequate solutions that serve as real-world efficiency baselines. We introduce a new metric, Beyond, which computes a runtime-percentile-weighted Pass score to reflect functional correctness and code efficiency simultaneously.
arXiv Detail & Related papers (2024-02-12T17:53:22Z)
Performance Embeddings: A Similarity-based Approach to Automatic Performance Optimization [71.69092462147292]
Performance embeddings enable knowledge transfer of performance tuning between applications. We demonstrate this transfer tuning approach on case studies in deep neural networks, dense and sparse linear algebra compositions, and numerical weather prediction stencils.
arXiv Detail & Related papers (2023-03-14T15:51:35Z)
Learning Performance-Improving Code Edits [107.21538852090208]
We introduce a framework for adapting large language models (LLMs) to high-level program optimization. First, we curate a dataset of performance-improving edits made by human programmers of over 77,000 competitive C++ programming submission pairs. For prompting, we propose retrieval-based few-shot prompting and chain-of-thought, and for finetuning, these include performance-conditioned generation and synthetic data augmentation based on self-play.
arXiv Detail & Related papers (2023-02-15T18:59:21Z)

This list is automatically generated from the titles and abstracts of the papers in this site.