What Builds Effective In-Context Examples for Code Generation?
- URL: http://arxiv.org/abs/2508.06414v1
- Date: Fri, 08 Aug 2025 15:58:11 GMT
- Title: What Builds Effective In-Context Examples for Code Generation?
- Authors: Dongze Li, Songqiang Chen, Jialun Cao, Shing-Chi Cheung
- Abstract summary: In-Context Learning (ICL) has emerged as a promising solution to enhance the code generation capabilities of Large Language Models (LLMs). This paper systematically investigates the impact of various code features on ICL with code examples through controlled ablation studies.
- Score: 8.536350550103057
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In-Context Learning (ICL) has emerged as a promising solution to enhance the code generation capabilities of Large Language Models (LLMs) by incorporating code examples in the prompt so that LLMs can learn from demonstrations. However, despite the substantial effectiveness of example-based ICL, it remains unclear which specific features (e.g., identifier naming styles, code formatting, solution insight) of the provided code examples contribute most to ICL's effectiveness. This paper systematically investigates the impact of various code features on ICL with code examples through controlled ablation studies. Our findings reveal that appropriate naming of variables and functions is crucial for effective code generation; eliminating it leads to performance decreases of up to 30 percentage points. We further demonstrate that LLMs prioritize semantically meaningful identifier names over formatting conventions, with language-specific preferences regarding identifier verbosity. Additionally, our investigation into ICL's potential for enhancing reflection and inference capabilities reveals that current LLMs struggle to extract generalizable problem-solving insights from similar code solutions, despite being able to use direct information effectively. These findings are expected to provide valuable insights for optimizing ICL systems in code generation applications and highlight fundamental challenges in reflection-based learning for code generation tasks.
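To make the ablation concrete, the sketch below illustrates one way an in-context code example could be stripped of meaningful identifier names before being placed in a prompt, mirroring the manipulation described in the abstract. It is an illustration only, not the authors' tooling: the prompt template, the example task, and the anonymizer are all assumptions.

```python
# Minimal sketch of identifier ablation for an ICL code example (assumed setup,
# not the paper's implementation).
import ast
import builtins


class IdentifierAnonymizer(ast.NodeTransformer):
    """Replace user-defined identifiers with opaque names such as f0 or v1."""

    def __init__(self):
        self.mapping = {}

    def _rename(self, name, prefix):
        if name in dir(builtins):  # keep built-ins such as sum, range
            return name
        if name not in self.mapping:
            self.mapping[name] = f"{prefix}{len(self.mapping)}"
        return self.mapping[name]

    def visit_FunctionDef(self, node):
        node.name = self._rename(node.name, "f")
        self.generic_visit(node)
        return node

    def visit_arg(self, node):
        node.arg = self._rename(node.arg, "v")
        return node

    def visit_Name(self, node):
        node.id = self._rename(node.id, "v")
        return node


def anonymize(code):
    """Return the code with user-defined identifiers replaced by opaque names."""
    tree = IdentifierAnonymizer().visit(ast.parse(code))
    return ast.unparse(tree)  # requires Python 3.9+


EXAMPLE = """
def count_vowels(text):
    vowels = "aeiou"
    return sum(1 for ch in text if ch in vowels)
"""

# Hypothetical prompt template; the paper's exact template is not reproduced.
prompt = (
    "Here is an example solution:\n"
    + anonymize(EXAMPLE)
    + "\n\nNow solve the new task:\n"
    "Write a function that counts the consonants in a string.\n"
)
print(prompt)
```

Comparing model outputs under the original and anonymized examples is the kind of controlled contrast the ablation study relies on.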
Related papers
- Counting Hypothesis: Potential Mechanism of In-Context Learning [0.4583541422554718]
In-Context Learning (ICL) indicates that large language models (LLMs) pretrained on a massive amount of data can learn specific tasks from the examples in input prompts. We propose the 'counting hypothesis' of ICL, which suggests that LLMs' encoding strategy may underlie ICL.
arXiv Detail & Related papers (2026-02-02T05:57:33Z)
- Protocode: Prototype-Driven Interpretability for Code Generation in LLMs [5.8296917468117835]
Large Language Models (LLMs) have been widely adopted for various tasks such as text summarization, question answering, speech-to-text translation, and more. Our work focuses on automatically sampling In-Context Learning (ICL) demonstrations, which can improve model performance and enhance the interpretability of the generated code.
arXiv Detail & Related papers (2025-09-27T00:32:45Z)
- Do Code Semantics Help? A Comprehensive Study on Execution Trace-Based Information for Code Large Language Models [24.14163275602762]
We focus on investigating the usefulness of trace-based semantic information in boosting supervised fine-tuning (SFT) and post-phase inference of Code LLMs. Surprisingly, the experimental results disagree with previous works and demonstrate that semantic information has limited usefulness for SFT and test-time scaling of Code LLMs.
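As a rough illustration of what "trace-based semantic information" could look like in practice, the sketch below collects the line numbers executed by a small solution and attaches them to a hypothetical SFT sample. The sample layout, its field names, and the tracing approach are assumptions, not the paper's pipeline.

```python
# Assumed setup: attach an execution trace to a supervised fine-tuning sample.
import sys


def trace_lines(func, *args):
    """Run `func` and collect the line numbers it executes."""
    lines = []

    def tracer(frame, event, arg):
        if event == "line" and frame.f_code is func.__code__:
            lines.append(frame.f_lineno)
        return tracer

    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(None)
    return result, lines


def solution(n):
    total = 0
    for i in range(n):
        total += i
    return total


result, executed = trace_lines(solution, 4)

# Hypothetical SFT sample layout: instruction, code, and a textual trace.
sample = {
    "instruction": "Sum the integers below n.",
    "code": "def solution(n): ...",
    "trace": f"executed lines {executed}, returned {result}",
}
print(sample)
```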
arXiv Detail & Related papers (2025-09-15T08:38:01Z)
- MAPLE: Many-Shot Adaptive Pseudo-Labeling for In-Context Learning [53.02571749383208]
In-Context Learning (ICL) empowers Large Language Models (LLMs) to tackle diverse tasks by incorporating multiple input-output examples. Many-Shot Adaptive Pseudo-LabEling (MAPLE) is a novel influence-based many-shot ICL framework that utilizes pseudo-labeled samples to compensate for the lack of label information.
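For orientation only, here is a minimal sketch of the general pseudo-labeling idea behind many-shot ICL: unlabeled inputs are labeled by the model itself and then reused as extra demonstrations. MAPLE's influence-based sample selection is not reproduced, and `generate` stands in for an arbitrary LLM completion call.

```python
# Generic many-shot pseudo-labeling sketch (assumed, simplified; not MAPLE itself).
from typing import Callable, List, Tuple


def build_many_shot_prompt(
    labeled: List[Tuple[str, str]],
    unlabeled: List[str],
    query: str,
    generate: Callable[[str], str],
) -> str:
    """Assemble a many-shot prompt, pseudo-labeling unlabeled inputs first."""
    demos = list(labeled)
    for x in unlabeled:
        # Label the input with the model itself, conditioned on current demos.
        few_shot = "\n".join(f"Input: {a}\nOutput: {b}" for a, b in demos)
        pseudo = generate(f"{few_shot}\nInput: {x}\nOutput:")
        demos.append((x, pseudo.strip()))
    shots = "\n".join(f"Input: {a}\nOutput: {b}" for a, b in demos)
    return f"{shots}\nInput: {query}\nOutput:"
```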
arXiv Detail & Related papers (2025-05-22T04:54:27Z)
- SURGE: On the Potential of Large Language Models as General-Purpose Surrogate Code Executors [5.247363735860479]
Large language models (LLMs) have demonstrated remarkable capabilities in code-related tasks. Given LLMs' ability to understand and process diverse programs, they present a promising direction for building general-purpose surrogate models. We introduce SURGE, a benchmark with 1,160 problems covering 8 key aspects. Through empirical analysis of 21 open-source and proprietary LLMs, we examine scaling laws, data efficiency, and predictive accuracy.
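Assuming that "surrogate code execution" here means predicting a program's output without running it, the sketch below compares an LLM's predicted stdout against the real interpreter's output. The prompt wording and the `generate` callable are placeholders; SURGE's actual problem formats are not reproduced.

```python
# Assumed surrogate-execution evaluation: predicted stdout vs. actual stdout.
import subprocess
import sys


def actual_output(program: str, stdin: str = "") -> str:
    """Run the program with the real interpreter and capture stdout."""
    proc = subprocess.run(
        [sys.executable, "-c", program],
        input=stdin, capture_output=True, text=True, timeout=10,
    )
    return proc.stdout.strip()


def surrogate_output(program: str, stdin: str, generate) -> str:
    """Ask the LLM (placeholder `generate`) to predict the program's stdout."""
    prompt = (
        "Predict the exact standard output of this Python program.\n"
        f"Program:\n{program}\nStdin:\n{stdin}\nOutput:"
    )
    return generate(prompt).strip()


def surrogate_accuracy(cases, generate) -> float:
    """Fraction of (program, stdin) cases where the prediction matches reality."""
    hits = sum(
        surrogate_output(p, s, generate) == actual_output(p, s)
        for p, s in cases
    )
    return hits / len(cases)
```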
arXiv Detail & Related papers (2025-02-16T15:38:19Z)
- The First Prompt Counts the Most! An Evaluation of Large Language Models on Iterative Example-Based Code Generation [33.77058239791512]
This paper presents the first comprehensive study on example-based code generation using Large Language Models (LLMs). We adopt an iterative evaluation framework and formalize the objective of example-based code generation as two sequential sub-objectives. We assess six state-of-the-art LLMs using a new benchmark of 172 diverse target functionalities.
arXiv Detail & Related papers (2024-11-11T08:05:37Z)
- Crystal: Illuminating LLM Abilities on Language and Code [58.5467653736537]
We propose a pretraining strategy to enhance the integration of natural language and coding capabilities.
The resulting model, Crystal, demonstrates remarkable capabilities in both domains.
arXiv Detail & Related papers (2024-11-06T10:28:46Z)
- What's Wrong with Your Code Generated by Large Language Models? An Extensive Study [80.18342600996601]
Large language models (LLMs) produce code that is shorter yet more complicated than canonical solutions.
We develop a taxonomy of bugs in incorrect code that includes three categories and 12 sub-categories, and analyze the root causes of common bug types.
We propose a novel training-free iterative method that introduces self-critique, enabling LLMs to critique and correct their generated code based on bug types and compiler feedback.
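The sketch below gives one schematic form such a critique loop could take, with Python's own compiler and interpreter standing in for "compiler feedback". It is not the authors' implementation; `generate` is a placeholder LLM call and the prompts are illustrative.

```python
# Schematic self-critique loop driven by compiler/runtime feedback (assumed form).
import traceback


def run_checks(code: str) -> str:
    """Return '' if the code compiles and runs; otherwise return the error text."""
    try:
        compile(code, "<candidate>", "exec")  # syntax-level feedback
        exec(code, {})  # runtime-level feedback; unsafe outside a sandbox
        return ""
    except Exception:
        return traceback.format_exc(limit=1)


def self_critique_loop(task: str, generate, max_rounds: int = 3) -> str:
    """Generate code, then repeatedly critique and repair it using the feedback."""
    code = generate(f"Write Python code for this task:\n{task}")
    for _ in range(max_rounds):
        feedback = run_checks(code)
        if not feedback:
            break
        code = generate(
            f"Task: {task}\nYour previous code:\n{code}\n"
            f"It failed with:\n{feedback}\n"
            "Name the likely bug type, then return the corrected code."
        )
    return code
```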
arXiv Detail & Related papers (2024-07-08T17:27:17Z)
- StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback [58.20547418182074]
We introduce StepCoder, a novel framework for code generation, consisting of two main components.
CCCS addresses the exploration challenge by breaking the long-sequence code generation task into a Curriculum of Code Completion Subtasks.
FGO optimizes the model only on executed code, masking the unexecuted code segments to provide Fine-Grained Optimization.
Our method improves the ability to explore the output space and outperforms state-of-the-art approaches in corresponding benchmarks.
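As a rough illustration of the masking idea attributed to FGO above, the toy sketch below zeroes out the loss contribution of tokens whose source lines never executed. The token-to-line mapping and the per-token losses are made-up stand-ins, not StepCoder's actual RL objective.

```python
# Toy sketch of loss masking over unexecuted code segments (assumed, simplified).
def fgo_mask(token_lines, executed_lines):
    """1.0 for tokens on executed source lines, 0.0 for unexecuted ones."""
    return [1.0 if ln in executed_lines else 0.0 for ln in token_lines]


def masked_loss(per_token_loss, token_lines, executed_lines):
    """Average the per-token loss over executed tokens only."""
    mask = fgo_mask(token_lines, executed_lines)
    kept = [loss * m for loss, m in zip(per_token_loss, mask)]
    return sum(kept) / max(sum(mask), 1.0)


# Example: four tokens drawn from source lines 1-4, of which only
# lines 1 and 2 were reached during execution.
print(masked_loss([0.5, 0.2, 0.9, 0.4], [1, 2, 3, 4], {1, 2}))  # -> 0.35
```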
arXiv Detail & Related papers (2024-02-02T13:14:31Z)
- If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents [81.60906807941188]
Large language models (LLMs) are trained on a combination of natural language and formal language (code).
Code translates high-level goals into executable steps, featuring standard syntax, logical consistency, abstraction, and modularity.
arXiv Detail & Related papers (2024-01-01T16:51:20Z)
- What Makes Good In-context Demonstrations for Code Intelligence Tasks with LLMs? [60.668318972782295]
Large language models have shown the ability of in-context learning (ICL).
ICL employs task instructions and a few examples as demonstrations, which are then fed to the language model to make predictions.
It is important to systematically investigate how to construct a good demonstration for code-related tasks.
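As one generic illustration of demonstration construction (not necessarily this paper's recommendation), the sketch below retrieves the examples most lexically similar to the query and assembles them into a prompt; the pool format and the similarity measure are assumptions.

```python
# Assumed similarity-based demonstration selection for a code-generation prompt.
def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity between two task descriptions."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / max(len(sa | sb), 1)


def build_demo_prompt(pool, query, instruction, k=3):
    """Pick the k examples most similar to `query` and assemble a prompt."""
    ranked = sorted(pool, key=lambda ex: jaccard(ex["task"], query), reverse=True)
    demos = "\n\n".join(
        f"Task: {ex['task']}\nCode:\n{ex['code']}" for ex in ranked[:k]
    )
    return f"{instruction}\n\n{demos}\n\nTask: {query}\nCode:"
```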
arXiv Detail & Related papers (2023-04-15T15:13:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.