Related papers: ACECode: A Reinforcement Learning Framework for Aligning Code Efficiency and Correctness in Code Language Models

URL: http://arxiv.org/abs/2412.17264v1
Date: Mon, 23 Dec 2024 04:19:45 GMT
Title: ACECode: A Reinforcement Learning Framework for Aligning Code Efficiency and Correctness in Code Language Models
Authors: Chengran Yang, Hong Jin Kang, Jieke Shi, David Lo
Abstract summary: Existing approaches for optimizing code efficiency for CodeLLMs like SOAP and PIE exhibit certain limitations. We introduce ACECode, a reinforcement learning-based fine-tuning framework that aligns CodeLLMs with dual objectives of efficiency and correctness. We evaluate ACECode by fine-tuning four state-of-the-art (SOTA) CodeLLMs and comparing their code with three baselines: original, instruction-tuned, and PIE-tuned CodeLLMs.
Score: 9.4219427550154
License: http://creativecommons.org/licenses/by/4.0/
Abstract: CodeLLMs have demonstrated remarkable advancements in software engineering tasks. However, while these models can generate functionally correct code, they often produce code that is inefficient in terms of runtime. This inefficiency is particularly problematic in resource-constrained environments, impacting software performance and sustainability. Existing approaches for optimizing code efficiency for CodeLLMs like SOAP and PIE exhibit certain limitations. SOAP requires a compatible execution environment and predefined test cases for iterative code modification, while PIE focuses on instruction tuning, improving efficiency but compromising correctness. These shortcomings highlight the need for a fine-tuning framework that optimizes both efficiency and correctness without relying on predefined test cases or specific execution environments. To bridge this gap, we introduce ACECode, a reinforcement learning-based fine-tuning framework that aligns CodeLLMs with dual objectives of efficiency and correctness. ACECode combines three key steps: (1) generating code with an actor CodeLLM, (2) calculating a training-free reward signal derived from code execution feedback for each generated code, and (3) optimizing the CodeLLM via the Proximal Policy Optimization (PPO) algorithm. This reward signal enables joint assessment of efficiency and correctness without manual labeling. We evaluate ACECode by fine-tuning four state-of-the-art (SOTA) CodeLLMs and comparing their code with three baselines: original, instruction-tuned, and PIE-tuned CodeLLMs. Extensive experimental results suggest that ACECode significantly improves the efficiency and correctness of generated code against all baselines for all CodeLLMs. Specifically, CodeLLMs fine-tuned with ACECode improve pass@1 by 1.84% to 14.51% and reduce runtime in 65% to 72% of cases compared to original CodeLLMs.
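
Step (2) of the abstract, a scalar reward computed from execution feedback that jointly reflects correctness and efficiency, can be illustrated with a minimal Python sketch. The abstract does not give ACECode's actual reward formula, so everything here is an assumption made for illustration: the ExecutionFeedback fields, the reward function name, the penalty and bonus values, and the use of a reference runtime as the efficiency baseline. How the checks are obtained is also left open, since the abstract does not say.

```python
from dataclasses import dataclass


@dataclass
class ExecutionFeedback:
    """Outcome of running one generated solution (hypothetical structure)."""
    ran: bool                   # did the code execute without crashing?
    passed: int                 # number of checks passed
    total: int                  # number of checks executed
    runtime_s: float            # measured runtime of the generated code
    reference_runtime_s: float  # runtime of a baseline solution (assumption)


def reward(fb: ExecutionFeedback) -> float:
    """Map execution feedback to a scalar in [-1, 1] (illustrative values only)."""
    if not fb.ran:
        # Code that cannot run gets the strongest penalty.
        return -1.0
    if fb.total == 0 or fb.passed < fb.total:
        # Incorrect code: penalty shrinks as more checks pass, but stays negative.
        frac = fb.passed / fb.total if fb.total else 0.0
        return -0.5 + 0.5 * frac
    # Correct code: base reward plus a bonus for being faster than the baseline.
    speedup = fb.reference_runtime_s / max(fb.runtime_s, 1e-9)
    bonus = 0.5 * min(max(speedup - 1.0, 0.0), 1.0)
    return 0.5 + bonus


if __name__ == "__main__":
    fast_and_correct = ExecutionFeedback(True, 3, 3, runtime_s=1.0, reference_runtime_s=2.0)
    print(reward(fast_and_correct))
```

In this sketch, efficiency only earns a bonus once every check passes, so a fast but wrong solution can never outscore a slow but correct one. That ordering is one plausible way to keep the two objectives aligned; a PPO trainer would consume the returned scalar as the per-sample reward.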

Related papers

Afterburner: Reinforcement Learning Facilitates Self-Improving Code Efficiency Optimization [46.33639431414019]
Large Language Models generate functionally correct solutions but often fall short in code efficiency. We introduce a novel test-time iterative optimization framework to address this.
arXiv Detail & Related papers (2025-05-29T12:14:29Z)
Enhancing High-Quality Code Generation in Large Language Models with Comparative Prefix-Tuning [19.53507218261719]
Large Language Models (LLMs) have been widely adopted in commercial code completion engines. LLMs may generate code with quality issues that violate coding standards. We propose a novel comparative prefix-tuning method for controllable high-quality code generation.
arXiv Detail & Related papers (2025-03-12T03:15:46Z)
SolBench: A Dataset and Benchmark for Evaluating Functional Correctness in Solidity Code Completion and Repair [51.0686873716938]
We introduce SolBench, a benchmark for evaluating the functional correctness of Solidity smart contracts generated by code completion models. We propose a Retrieval-Augmented Code Repair framework to verify functional correctness of smart contracts. Results show that code repair and retrieval techniques effectively enhance the correctness of smart contract completion while reducing computational costs.
arXiv Detail & Related papers (2025-03-03T01:55:20Z)
LLM4EFFI: Leveraging Large Language Models to Enhance Code Efficiency and Correctness [38.399282089600284]
Large Language Models (LLMs) have demonstrated impressive performance in code generation. LLM4EFFI (Large Language Model for Code Efficiency) is a novel framework that enables LLMs to generate code that balances both efficiency and correctness.
arXiv Detail & Related papers (2025-02-17T07:01:18Z)
Thinking Before Running! Efficient Code Generation with Thorough Exploration and Optimal Refinement [47.89758553708932]
We introduce ThinkCoder, a framework that combines thorough exploration with optimal refinement. The exploration phase diversifies the solution space by searching for potential solutions, followed by a refinement phase that enhances precision. To further minimize test-time computation overhead, we introduce preference-driven optimization with Reinforced Self-Training (ReST).
arXiv Detail & Related papers (2024-12-30T07:02:15Z)
DSTC: Direct Preference Learning with Only Self-Generated Tests and Code to Improve Code LMs [56.4979142807426]
We introduce Direct Preference Learning with Only Self-Generated Tests and Code (DSTC). DSTC uses only self-generated code snippets and tests to construct reliable preference pairs.
arXiv Detail & Related papers (2024-11-20T02:03:16Z)
PerfCodeGen: Improving Performance of LLM Generated Code with Execution Feedback [78.89596149768458]
Large Language Models (LLMs) are widely adopted for assisting in software development tasks. We propose PerfCodeGen, a training-free framework that enhances the performance of LLM-generated code.
arXiv Detail & Related papers (2024-11-18T06:22:38Z)
Effi-Code: Unleashing Code Efficiency in Language Models [17.355845751737423]
Effi-Code is an approach to enhancing code generation in large language models. Effi-Code offers a scalable and generalizable approach to improving code generation in AI systems.
arXiv Detail & Related papers (2024-10-14T07:05:51Z)
CodeDPO: Aligning Code Models with Self Generated and Verified Source Code [52.70310361822519]
We propose CodeDPO, a framework that integrates preference learning into code generation to improve two key code preference factors: code correctness and efficiency. CodeDPO employs a novel dataset construction method, utilizing a self-generation-and-validation mechanism that simultaneously generates and evaluates code and test cases.
arXiv Detail & Related papers (2024-10-08T01:36:15Z)
Measuring Code Efficiency Optimization Capabilities with ACEOB [7.4056083791645495]
We conduct an in-depth analysis of "code patterns" in the model training dataset, meticulously exploring human-written code. We introduce the Automatic Code Efficiency Optimization Benchmark (ACEOB), which consists of 95,359 pairs of efficient-inefficient code. To our knowledge, ACEOB is the first dataset specifically targeting Python code efficiency optimization.
arXiv Detail & Related papers (2024-08-23T10:10:37Z)
Iterative or Innovative? A Problem-Oriented Perspective for Code Optimization [81.88668100203913]
Large language models (LLMs) have demonstrated strong capabilities in solving a wide range of programming tasks. In this paper, we explore code optimization with a focus on performance enhancement, specifically aiming to optimize code for minimal execution time.
arXiv Detail & Related papers (2024-06-17T16:10:10Z)
StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback [58.20547418182074]
We introduce StepCoder, a novel framework for code generation, consisting of two main components. CCCS addresses the exploration challenge by breaking the long-sequence code generation task into a Curriculum of Code Completion Subtasks. FGO optimizes the model by masking the unexecuted code segments to provide Fine-Grained Optimization. Our method improves the ability to explore the output space and outperforms state-of-the-art approaches in corresponding benchmarks.
arXiv Detail & Related papers (2024-02-02T13:14:31Z)
Learning Performance-Improving Code Edits [107.21538852090208]
We introduce a framework for adapting large language models (LLMs) to high-level program optimization. First, we curate a dataset of over 77,000 pairs of competitive C++ programming submissions in which human programmers made performance-improving edits. For prompting, we propose retrieval-based few-shot prompting and chain-of-thought; for finetuning, we propose performance-conditioned generation and synthetic data augmentation based on self-play.
arXiv Detail & Related papers (2023-02-15T18:59:21Z)
Execution-based Code Generation using Deep Reinforcement Learning [8.085533911328577]
PPOCoder is a new framework for code generation that combines pre-trained PL models with Proximal Policy Optimization. PPOCoder seamlessly integrates external code-specific knowledge into the model optimization process. PPOCoder is a task-agnostic and model-agnostic framework that can be used across different code generation tasks and PLs.
arXiv Detail & Related papers (2023-01-31T18:02:26Z)

This list is automatically generated from the titles and abstracts of the papers on this site.