CodeLutra: Boosting LLM Code Generation via Preference-Guided Refinement
- URL: http://arxiv.org/abs/2411.05199v2
- Date: Thu, 19 Dec 2024 18:46:21 GMT
- Title: CodeLutra: Boosting LLM Code Generation via Preference-Guided Refinement
- Authors: Leitian Tao, Xiang Chen, Tong Yu, Tung Mai, Ryan Rossi, Yixuan Li, Saayan Mitra,
- Abstract summary: Large Language Models (LLMs) have revolutionized code generation but require significant resources and often over-generalize.
We introduce CodeLutra, a framework that leverages both correct and incorrect code attempts.
By learning from both successes and mistakes, CodeLutra provides a scalable and efficient path to high-quality code generation.
- Score: 32.46078765471136
- License:
- Abstract: Large Language Models (LLMs) have revolutionized code generation but require significant resources and often over-generalize, limiting their task-specific efficiency. Fine-tuning smaller, open-source LLMs provides a cost-effective alternative. However, standard supervised approaches rely only on correct examples, missing valuable insights from failures. We introduce CodeLutra, a framework that leverages both correct and incorrect code attempts. Instead of using only correct solutions, CodeLutra applies iterative preference-based refinement, comparing successful and failed outputs to better approximate desired results. This approach narrows the performance gap with state-of-the-art larger models without requiring massive datasets or auxiliary models. For instance, on a challenging data science coding task, using only 500 samples improved Llama-3-8B's accuracy from 28.2% to 48.6%, approaching GPT-4's level. By learning from both successes and mistakes, CodeLutra provides a scalable and efficient path to high-quality code generation, making smaller open-source models more competitive with leading closed-source alternatives.
Related papers
- Does Few-Shot Learning Help LLM Performance in Code Synthesis? [40.35198206199065]
This work focuses on the few-shot examples present in most code generation prompts.
Our work offers 2 approaches for selecting few-shot examples, a model-free method, CODEEXEMPLAR-FREE, and a model-based method, CODEEXEMPLAR-BASED.
Both methods significantly improve CodeLlama's coding ability across the popular HumanEval+ coding benchmark.
arXiv Detail & Related papers (2024-12-03T23:19:40Z) - OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models [70.72097493954067]
Large language models (LLMs) for code have become indispensable in various domains, including code generation, reasoning tasks and agent systems.
While open-access code LLMs are increasingly approaching the performance levels of proprietary models, high-quality code LLMs remain limited.
We introduce OpenCoder, a top-tier code LLM that not only achieves performance comparable to leading models but also serves as an "open cookbook" for the research community.
arXiv Detail & Related papers (2024-11-07T17:47:25Z) - Effi-Code: Unleashing Code Efficiency in Language Models [17.355845751737423]
Effi-Code is an approach to enhancing code generation in large language models.
Effi-Code offers a scalable and generalizable approach to improving code generation in AI systems.
arXiv Detail & Related papers (2024-10-14T07:05:51Z) - What's Wrong with Your Code Generated by Large Language Models? An Extensive Study [80.18342600996601]
Large language models (LLMs) produce code that is shorter yet more complicated as compared to canonical solutions.
We develop a taxonomy of bugs for incorrect codes that includes three categories and 12 sub-categories, and analyze the root cause for common bug types.
We propose a novel training-free iterative method that introduces self-critique, enabling LLMs to critique and correct their generated code based on bug types and compiler feedback.
arXiv Detail & Related papers (2024-07-08T17:27:17Z) - Low-Cost Language Models: Survey and Performance Evaluation on Python Code Generation [0.0]
Large Language Models (LLMs) have become a popular choice for many Natural Language Processing (NLP) tasks.
LLMs' substantial computational and memory requirements often make them inaccessible to users with limited resources.
This paper focuses on very low-cost models which offer a more accessible alternative to resource-intensive LLMs.
arXiv Detail & Related papers (2024-04-17T08:16:48Z) - SEED: Customize Large Language Models with Sample-Efficient Adaptation for Code Generation [35.88318116340547]
We propose a novel adaptation approach named SEED, which stands for Sample-Efficient adaptation with Error-Driven learning for code generation.
We show that SEED achieves superior performance with few training samples, showing an average relative improvement of 54.7% in Pass@1 on multiple code generation benchmarks.
arXiv Detail & Related papers (2024-02-29T16:09:02Z) - StepCoder: Improve Code Generation with Reinforcement Learning from
Compiler Feedback [58.20547418182074]
We introduce StepCoder, a novel framework for code generation, consisting of two main components.
CCCS addresses the exploration challenge by breaking the long sequences code generation task into a Curriculum of Code Completion Subtasks.
FGO only optimize the model by masking the unexecuted code segments to provide Fine-Grained Optimization.
Our method improves the ability to explore the output space and outperforms state-of-the-art approaches in corresponding benchmarks.
arXiv Detail & Related papers (2024-02-02T13:14:31Z) - LLM-Assisted Code Cleaning For Training Accurate Code Generators [53.087019724256606]
We investigate data quality for code and find that making the code more structured and readable leads to improved code generation performance of the system.
We build a novel data-cleaning pipeline that uses these principles to transform existing programs.
We evaluate our approach on two challenging algorithmic code generation benchmarks and find that fine-tuning CodeLLaMa-7B improves the performance by up to 30% compared to fine-tuning on the original dataset.
arXiv Detail & Related papers (2023-11-25T02:45:50Z) - Large Language Model-Aware In-Context Learning for Code Generation [75.68709482932903]
Large language models (LLMs) have shown impressive in-context learning (ICL) ability in code generation.
We propose a novel learning-based selection approach named LAIL (LLM-Aware In-context Learning) for code generation.
arXiv Detail & Related papers (2023-10-15T06:12:58Z) - Learning to Optimize: A Primer and A Benchmark [94.29436694770953]
Learning to optimize (L2O) is an emerging approach that leverages machine learning to develop optimization methods.
This article is poised to be the first comprehensive survey and benchmark of L2O for continuous optimization.
arXiv Detail & Related papers (2021-03-23T20:46:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.