A Pair Programming Framework for Code Generation via Multi-Plan Exploration and Feedback-Driven Refinement
- URL: http://arxiv.org/abs/2409.05001v1
- Date: Sun, 8 Sep 2024 07:22:19 GMT
- Title: A Pair Programming Framework for Code Generation via Multi-Plan Exploration and Feedback-Driven Refinement
- Authors: Huan Zhang, Wei Cheng, Yuhan Wu, Wei Hu,
- Abstract summary: PairCoder is a novel framework for large language models (LLMs) to generate code.
It incorporates two collaborative agents, namely a Navigator agent for high-level planning and a Driver agent for specific implementation.
The Driver follows the guidance of Navigator to undertake initial code generation, code testing, and refinement.
- Score: 24.25119206488625
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) have achieved impressive performance on code generation. Although prior studies enhanced LLMs with prompting techniques and code refinement, they still struggle with complex programming problems due to rigid solution plans. In this paper, we draw on pair programming practices to propose PairCoder, a novel LLM-based framework for code generation. PairCoder incorporates two collaborative LLM agents, namely a Navigator agent for high-level planning and a Driver agent for specific implementation. The Navigator is responsible for proposing promising solution plans, selecting the current optimal plan, and directing the next iteration round based on execution feedback. The Driver follows the guidance of Navigator to undertake initial code generation, code testing, and refinement. This interleaved and iterative workflow involves multi-plan exploration and feedback-based refinement, which mimics the collaboration of pair programmers. We evaluate PairCoder with both open-source and closed-source LLMs on various code generation benchmarks. Extensive experimental results demonstrate the superior accuracy of PairCoder, achieving relative pass@1 improvements of 12.00%-162.43% compared to prompting LLMs directly.
Related papers
- OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models [70.72097493954067]
Large language models (LLMs) for code have become indispensable in various domains, including code generation, reasoning tasks and agent systems.
While open-access code LLMs are increasingly approaching the performance levels of proprietary models, high-quality code LLMs remain limited.
We introduce OpenCoder, a top-tier code LLM that not only achieves performance comparable to leading models but also serves as an "open cookbook" for the research community.
arXiv Detail & Related papers (2024-11-07T17:47:25Z) - EPiC: Cost-effective Search-based Prompt Engineering of LLMs for Code Generation [8.009881267479189]
Large Language Models (LLMs) have seen increasing use in various software development tasks, especially in code generation.
We propose an alternative approach named Evolutionary Prompt Engineering for Code (EPiC) to evolve the original prompts toward better ones that produce high-quality code.
Our evaluation against state-of-the-art (SOTA) LLM-based code generation models shows that EPiC outperforms all the baselines in terms of cost-effectiveness.
arXiv Detail & Related papers (2024-08-20T21:15:36Z) - AlchemistCoder: Harmonizing and Eliciting Code Capability by Hindsight Tuning on Multi-source Data [64.69872638349922]
We present AlchemistCoder, a series of Code LLMs with enhanced code generation and generalization capabilities fine-tuned on multi-source data.
We propose incorporating the data construction process into the fine-tuning data as code comprehension tasks, including instruction evolution, data filtering, and code review.
arXiv Detail & Related papers (2024-05-29T16:57:33Z) - DolphCoder: Echo-Locating Code Large Language Models with Diverse and
Multi-Objective Instruction Tuning [36.78560777629329]
We introduce a diverse instruction model (DolphCoder) with self-evaluating for code generation.
It learns diverse instruction targets and combines a code evaluation objective to enhance its code generation ability.
Our model achieves superior performance on the HumanEval and MBPP benchmarks.
arXiv Detail & Related papers (2024-02-14T12:34:58Z) - StepCoder: Improve Code Generation with Reinforcement Learning from
Compiler Feedback [58.20547418182074]
We introduce StepCoder, a novel framework for code generation, consisting of two main components.
CCCS addresses the exploration challenge by breaking the long sequences code generation task into a Curriculum of Code Completion Subtasks.
FGO only optimize the model by masking the unexecuted code segments to provide Fine-Grained Optimization.
Our method improves the ability to explore the output space and outperforms state-of-the-art approaches in corresponding benchmarks.
arXiv Detail & Related papers (2024-02-02T13:14:31Z) - If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code
Empowers Large Language Models to Serve as Intelligent Agents [81.60906807941188]
Large language models (LLMs) are trained on a combination of natural language and formal language (code)
Code translates high-level goals into executable steps, featuring standard syntax, logical consistency, abstraction, and modularity.
arXiv Detail & Related papers (2024-01-01T16:51:20Z) - A Review of Repository Level Prompting for LLMs [0.0]
Large Language Models (LLMs) have led to notable successes, such as achieving a 94.6% solve rate on the HumanEval benchmark.
There is an increasing commercial push for repository-level inline code completion tools, such as GitHub Copilot and Tab Nine.
This paper delves into the transition from individual coding problems to repository-scale solutions.
arXiv Detail & Related papers (2023-12-15T00:34:52Z) - CodeChain: Towards Modular Code Generation Through Chain of Self-revisions with Representative Sub-modules [51.82044734879657]
We propose CodeChain, a novel framework for inference that elicits modularized code generation through a chain of self-revisions.
We find that CodeChain can significantly boost both modularity as well as correctness of the generated solutions, achieving relative pass@1 improvements of 35% on APPS and 76% on CodeContests.
arXiv Detail & Related papers (2023-10-13T10:17:48Z) - CodeT5+: Open Code Large Language Models for Code Understanding and
Generation [72.1638273937025]
Large language models (LLMs) pretrained on vast source code have achieved prominent progress in code intelligence.
CodeT5+ is a family of encoder-decoder LLMs for code in which component modules can be flexibly combined to suit a wide range of downstream code tasks.
We extensively evaluate CodeT5+ on over 20 code-related benchmarks in different settings, including zero-shot, finetuning, and instruction-tuning.
arXiv Detail & Related papers (2023-05-13T14:23:07Z) - MultiCoder: Multi-Programming-Lingual Pre-Training for Low-Resource Code
Completion [21.100570496144694]
We propose the MultiCoder to enhance the low-resource code completion via MultiPL pre-training and MultiPL Mixture-of-Experts layers.
We also propose a novel PL-level MoE routing strategy (PL-MoE) for improving the code completion on all PLs.
arXiv Detail & Related papers (2022-12-19T17:50:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.