Does Few-Shot Learning Help LLM Performance in Code Synthesis?
- URL: http://arxiv.org/abs/2412.02906v1
- Date: Tue, 03 Dec 2024 23:19:40 GMT
- Title: Does Few-Shot Learning Help LLM Performance in Code Synthesis?
- Authors: Derek Xu, Tong Xie, Botao Xia, Haoyu Li, Yunsheng Bai, Yizhou Sun, Wei Wang
- Abstract summary: This work focuses on the few-shot examples present in most code generation prompts.
Our work offers two approaches for selecting few-shot examples: a model-free method, CODEEXEMPLAR-FREE, and a model-based method, CODEEXEMPLAR-BASED.
Both methods significantly improve CodeLlama's coding ability on the popular HumanEval+ benchmark.
- Score: 40.35198206199065
- Abstract: Large language models (LLMs) have made significant strides in code generation through improved model design, training, and chain-of-thought prompting. However, prompt-level optimizations remain an important yet under-explored aspect of LLMs for coding. This work focuses on the few-shot examples present in most code generation prompts, offering a systematic study of whether few-shot examples improve LLMs' coding capabilities, which few-shot examples have the largest impact, and how to select impactful examples. Our work offers two approaches for selecting few-shot examples: a model-free method, CODEEXEMPLAR-FREE, and a model-based method, CODEEXEMPLAR-BASED. The two methods trade off improved performance against reliance on training data and interpretability. Both methods significantly improve CodeLlama's coding ability on the popular HumanEval+ benchmark. In summary, our work provides valuable insights into how to pick few-shot examples in code generation prompts to improve LLM code generation capabilities.
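To make the setup concrete, here is a minimal sketch of selecting few-shot examples and assembling a code-generation prompt. The token-overlap heuristic and every name below (`select_few_shot`, `build_prompt`, the example pool) are illustrative assumptions, not the paper's CODEEXEMPLAR-FREE metric, which the abstract does not specify.

```python
# Illustrative sketch: pick the few-shot examples most similar to the query
# problem and prepend them to the prompt. The similarity measure is a generic
# stand-in, NOT the paper's selection criterion.

def token_overlap(a: str, b: str) -> float:
    """Jaccard similarity over whitespace tokens."""
    ta, tb = set(a.split()), set(b.split())
    return len(ta & tb) / max(len(ta | tb), 1)

def select_few_shot(query: str, pool: list[dict], k: int = 3) -> list[dict]:
    """Pick the k pool examples whose problem text best matches the query."""
    return sorted(pool, key=lambda ex: token_overlap(query, ex["problem"]),
                  reverse=True)[:k]

def build_prompt(query: str, shots: list[dict]) -> str:
    """Prepend selected (problem, solution) pairs to the target problem."""
    parts = [f"### Problem:\n{ex['problem']}\n### Solution:\n{ex['solution']}"
             for ex in shots]
    parts.append(f"### Problem:\n{query}\n### Solution:\n")
    return "\n\n".join(parts)

pool = [
    {"problem": "Return the sum of a list of ints.",
     "solution": "def total(xs):\n    return sum(xs)"},
    {"problem": "Reverse a string.",
     "solution": "def rev(s):\n    return s[::-1]"},
]
query = "Return the product of a list of ints."
print(build_prompt(query, select_few_shot(query, pool, k=1)))
```

A model-based selector in the spirit of CODEEXEMPLAR-BASED would replace the hand-written heuristic with a learned scorer, trading data-independence and interpretability for performance.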
Related papers
- Leveraging Metamemory Mechanisms for Enhanced Data-Free Code Generation in LLMs [44.80420740455364]
M2WF is a framework for improving large language models' one-time code generation.
Unlike prior methods, it minimizes dependency on curated data and adapts to various coding scenarios.
The code and framework will be publicly available on GitHub and HuggingFace.
arXiv Detail & Related papers (2025-01-14T07:16:43Z)
- Selective Shot Learning for Code Explanation [4.773934813915903]
State-of-the-art approaches for Selective Shot Learning (SSL) include token-based and embedding-based methods.
We present a comparative study and propose a novel SSL method (SSL_ner) that utilizes entity information for few-shot example selection.
We show the effectiveness of SSL_ner over state-of-the-art methods across two datasets.
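As a rough illustration of entity-driven shot selection (not SSL_ner's actual pipeline; the regex "entity extractor" below is a naive stand-in for a proper NER model):

```python
# Toy sketch: rank candidate examples by how many "entities" they share with
# the query. A real system would use a trained NER model; here a crude
# capitalized-token regex stands in, purely for illustration.
import re

def entities(text: str) -> set[str]:
    """Very crude stand-in for NER: capitalized / CamelCase identifiers."""
    return set(re.findall(r"\b[A-Z][A-Za-z0-9_]*\b", text))

def select_by_entities(query: str, pool: list[str], k: int = 2) -> list[str]:
    """Rank candidate examples by entity overlap with the query."""
    q = entities(query)
    return sorted(pool, key=lambda ex: len(q & entities(ex)), reverse=True)[:k]

pool = ["Parse a JSON file with json.load",
        "Sort a DataFrame by column",
        "Read CSV rows with DictReader"]
print(select_by_entities("Load JSON config into a dict", pool, k=1))
```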
arXiv Detail & Related papers (2024-12-17T12:26:14Z)
- OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models [70.72097493954067]
Large language models (LLMs) for code have become indispensable in various domains, including code generation, reasoning tasks and agent systems.
While open-access code LLMs are increasingly approaching the performance levels of proprietary models, high-quality code LLMs remain limited.
We introduce OpenCoder, a top-tier code LLM that not only achieves performance comparable to leading models but also serves as an "open cookbook" for the research community.
arXiv Detail & Related papers (2024-11-07T17:47:25Z)
- EPiC: Cost-effective Search-based Prompt Engineering of LLMs for Code Generation [8.009881267479189]
Large Language Models (LLMs) have seen increasing use in various software development tasks, especially in code generation.
We propose an alternative approach named Evolutionary Prompt Engineering for Code (EPiC) to evolve the original prompts toward better ones that produce high-quality code.
Our evaluation against state-of-the-art (SOTA) LLM-based code generation models shows that EPiC outperforms all the baselines in terms of cost-effectiveness.
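A minimal sketch of the kind of search loop an evolutionary prompt-engineering approach implies; the mutation operators and the fitness function are placeholders, since EPiC's actual operators and its test-based scoring are not given in the abstract:

```python
# Toy evolutionary loop over prompt variants. In practice the fitness would
# generate code from each prompt and score it (e.g., pass rate on unit tests);
# here a dummy lexical-diversity score keeps the sketch self-contained.
import random

MUTATIONS = [
    lambda p: p + " Think step by step.",
    lambda p: p + " Include edge-case handling.",
    lambda p: p.replace("Write", "Implement"),
]

def fitness(prompt: str) -> float:
    """Placeholder fitness: ratio of unique tokens (NOT EPiC's metric)."""
    toks = prompt.split()
    return len(set(toks)) / len(toks)

def evolve(seed: str, generations: int = 5, pop: int = 4) -> str:
    """Mutate, score, and keep the best prompts each generation."""
    population = [seed]
    for _ in range(generations):
        children = [random.choice(MUTATIONS)(random.choice(population))
                    for _ in range(pop)]
        population = sorted(population + children, key=fitness,
                            reverse=True)[:pop]
    return population[0]

print(evolve("Write a function that merges two sorted lists."))
```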
arXiv Detail & Related papers (2024-08-20T21:15:36Z)
- Case2Code: Scalable Synthetic Data for Code Generation [105.89741089673575]
Large Language Models (LLMs) have shown outstanding breakthroughs in code generation.
Recent work improves code LLMs by training on synthetic data generated by some powerful LLMs.
We propose a Case2Code task by exploiting the expressiveness and correctness of programs.
arXiv Detail & Related papers (2024-07-17T11:35:00Z)
- AlchemistCoder: Harmonizing and Eliciting Code Capability by Hindsight Tuning on Multi-source Data [64.69872638349922]
We present AlchemistCoder, a series of Code LLMs with enhanced code generation and generalization capabilities fine-tuned on multi-source data.
We propose incorporating the data construction process into the fine-tuning data as code comprehension tasks, including instruction evolution, data filtering, and code review.
arXiv Detail & Related papers (2024-05-29T16:57:33Z)
- SEED: Customize Large Language Models with Sample-Efficient Adaptation for Code Generation [35.88318116340547]
We propose a novel adaptation approach named SEED, which stands for Sample-Efficient adaptation with Error-Driven learning for code generation.
We show that SEED achieves superior performance with few training samples, showing an average relative improvement of 54.7% in Pass@1 on multiple code generation benchmarks.
arXiv Detail & Related papers (2024-02-29T16:09:02Z)
- DolphCoder: Echo-Locating Code Large Language Models with Diverse and Multi-Objective Instruction Tuning [36.78560777629329]
We introduce a diverse instruction model (DolphCoder) with self-evaluation for code generation.
It learns diverse instruction targets and combines a code evaluation objective to enhance its code generation ability.
Our model achieves superior performance on the HumanEval and MBPP benchmarks.
arXiv Detail & Related papers (2024-02-14T12:34:58Z)
- StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback [58.20547418182074]
We introduce StepCoder, a novel framework for code generation, consisting of two main components.
CCCS (Curriculum of Code Completion Subtasks) addresses the exploration challenge by breaking long-sequence code generation into a curriculum of code completion subtasks.
FGO (Fine-Grained Optimization) optimizes the model only on code that was actually executed, masking out unexecuted segments.
Our method improves the model's ability to explore the output space and outperforms state-of-the-art approaches on the corresponding benchmarks.
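For intuition, a conceptual sketch of an FGO-style masked loss: only tokens marked as executed contribute to the objective. The tensor shapes and the way the executed-token mask is obtained are assumptions for illustration, not StepCoder's implementation.

```python
# Sketch: per-token language-modeling loss with unexecuted code masked out,
# so only executed segments drive the gradient.
import torch
import torch.nn.functional as F

def masked_lm_loss(logits: torch.Tensor,    # (seq, vocab) model outputs
                   targets: torch.Tensor,   # (seq,) target token ids
                   executed: torch.Tensor   # (seq,) bool: token was executed
                   ) -> torch.Tensor:
    per_token = F.cross_entropy(logits, targets, reduction="none")  # (seq,)
    masked = per_token * executed.float()   # zero loss on unexecuted tokens
    return masked.sum() / executed.float().sum().clamp(min=1.0)

logits = torch.randn(6, 100)
targets = torch.randint(0, 100, (6,))
executed = torch.tensor([1, 1, 1, 0, 0, 1], dtype=torch.bool)
print(masked_lm_loss(logits, targets, executed))
```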
arXiv Detail & Related papers (2024-02-02T13:14:31Z)
- Learning to Retrieve In-Context Examples for Large Language Models [69.9707552694766]
Large language models (LLMs) have demonstrated their ability to learn in-context.
The effectiveness of in-context learning is heavily reliant on the quality of the selected examples.
We propose a novel framework to iteratively train dense retrievers that can identify high-quality in-context examples.
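For intuition, a toy version of embedding-based example retrieval: embed the query and the candidates, then take the nearest neighbours. The hashing "encoder" below is a stand-in for the trained dense retriever, and the paper's iterative training with LLM feedback is not shown.

```python
# Toy nearest-neighbour retrieval of in-context examples. A real dense
# retriever uses a learned text encoder; feature hashing keeps this
# sketch dependency-light and self-contained.
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Toy feature hashing; a real system would use a learned encoder."""
    v = np.zeros(dim)
    for tok in text.lower().split():
        v[hash(tok) % dim] += 1.0
    n = np.linalg.norm(v)
    return v / n if n else v

def top_k(query: str, candidates: list[str], k: int = 2) -> list[str]:
    """Return the k candidates with highest cosine similarity to the query."""
    q = embed(query)
    scores = [(float(q @ embed(c)), c) for c in candidates]
    return [c for _, c in sorted(scores, reverse=True)[:k]]

print(top_k("reverse a linked list",
            ["reverse a string", "sum a list", "invert a binary tree"], k=2))
```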
arXiv Detail & Related papers (2023-07-14T05:23:08Z)