Related papers: ReCode: Updating Code API Knowledge with Reinforcement Learning

URL: http://arxiv.org/abs/2506.20495v2
Date: Thu, 17 Jul 2025 05:31:07 GMT
Title: ReCode: Updating Code API Knowledge with Reinforcement Learning
Authors: Haoze Wu, Yunzhi Yao, Wenhao Yu, Huajun Chen, Ningyu Zhang
Abstract summary: Large Language Models (LLMs) exhibit remarkable code generation capabilities but falter when adapting to frequent updates in external library APIs. We propose ReCode, a novel framework that mimics human programmer adaptation to API changes. Our experiments demonstrate that ReCode substantially boosts LLMs' code generation performance in dynamic API scenarios.
Score: 45.077641074621816
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large Language Models (LLMs) exhibit remarkable code generation capabilities but falter when adapting to frequent updates in external library APIs. This critical limitation, stemming from reliance on outdated API knowledge from their training data, even with access to current documentation, impedes reliable code generation in dynamic environments. To tackle this issue, we propose ReCode (rule-based Reinforcement learning for Code Update), a novel framework that mimics human programmer adaptation to API changes. Specifically, we construct a dataset of approximately 2,000 data entries to train the LLMs to perform version migration based on updated information. Then, we introduce a modified string similarity metric for code evaluation as the reward for reinforcement learning. Our experiments demonstrate that ReCode substantially boosts LLMs' code generation performance in dynamic API scenarios, especially on the unseen CodeUpdateArena task. Crucially, compared to supervised fine-tuning, ReCode has less impact on LLMs' general code generation abilities. We apply ReCode on various LLMs and reinforcement learning algorithms (GRPO and DAPO), all achieving consistent improvements. Notably, after training, Qwen2.5-Coder-7B outperforms the 32B-parameter code instruction-tuned model and the reasoning model with the same architecture. Code is available at https://github.com/zjunlp/ReCode.
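Illustration (not from the paper): the abstract mentions a string-similarity reward for reinforcement learning but does not specify it. The sketch below shows only the general idea, assuming a plain difflib ratio over whitespace-normalized code. The function name similarity_reward and the lib.old_call / lib.new_call identifiers are hypothetical, and ReCode's actual modified metric may differ.

```python
import difflib


def similarity_reward(generated: str, reference: str) -> float:
    """Score generated code against a reference migration, in [0, 1].

    Assumption: a plain difflib ratio stands in for the paper's
    modified string similarity metric, which the abstract does not define.
    """
    # Collapse whitespace so formatting differences are not penalized.
    gen = " ".join(generated.split())
    ref = " ".join(reference.split())
    return difflib.SequenceMatcher(None, gen, ref).ratio()


# Hypothetical API rename: lib.old_call was replaced by lib.new_call.
reference = "y = lib.new_call(x)"
migrated = "y  =  lib.new_call(x)\n"   # correct call, different spacing
outdated = "y = lib.old_call(x)"       # still uses the old API

reward_migrated = similarity_reward(migrated, reference)
reward_outdated = similarity_reward(outdated, reference)
```

Under this sketch, a completion that uses the updated call receives a higher reward than one that keeps the outdated call, which is the kind of signal a rule-based reward would give an algorithm such as GRPO or DAPO.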

Related papers

A Comprehensive Framework for Evaluating API-oriented Code Generation in Large Language Models [14.665460257371164]
Large language models (LLMs) like GitHub Copilot and ChatGPT have emerged as powerful tools for code generation. We propose AutoAPIEval, a framework designed to evaluate the capabilities of LLMs in API-oriented code generation.
arXiv Detail & Related papers (2024-09-23T17:22:09Z)
CodeUpdateArena: Benchmarking Knowledge Editing on API Updates [77.81663273436375]
We present CodeUpdateArena, a benchmark for knowledge editing in the code domain. An instance in our benchmark consists of a synthetic API function update paired with a program synthesis example. Our benchmark covers updates of various types to 54 functions from seven diverse Python packages.
arXiv Detail & Related papers (2024-07-08T17:55:04Z)
Exploring the Capabilities of LLMs for Code Change Related Tasks [14.261870410238643]
Large language models (LLMs) have shown their effectiveness in code-related tasks. LLMs focus on general code syntax and semantics rather than the differences between two code versions. We conduct an empirical study using LLMs with more than 1B parameters on three code-change-related tasks.
arXiv Detail & Related papers (2024-07-03T05:49:18Z)
VersiCode: Towards Version-controllable Code Generation [58.82709231906735]
Large Language Models (LLMs) have made tremendous strides in code generation, but existing research fails to account for the dynamic nature of software development. We propose two novel tasks aimed at bridging this gap: version-specific code completion (VSCC) and version-aware code migration (VACM). We conduct an extensive evaluation on VersiCode, which reveals that version-controllable code generation is indeed a significant challenge.
arXiv Detail & Related papers (2024-06-11T16:15:06Z)
StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback [58.20547418182074]
We introduce StepCoder, a novel framework for code generation, consisting of two main components. CCCS addresses the exploration challenge by breaking the long-sequence code generation task into a Curriculum of Code Completion Subtasks. FGO optimizes the model only on executed code, masking the unexecuted code segments to provide Fine-Grained Optimization. Our method improves the ability to explore the output space and outperforms state-of-the-art approaches in corresponding benchmarks.
arXiv Detail & Related papers (2024-02-02T13:14:31Z)
InstructCoder: Instruction Tuning Large Language Models for Code Editing [26.160498475809266]
We explore the use of Large Language Models (LLMs) to edit code based on user instructions. InstructCoder is the first instruction-tuning dataset designed to adapt LLMs for general-purpose code editing. Our findings reveal that open-source LLMs fine-tuned on InstructCoder can significantly enhance the accuracy of code edits.
arXiv Detail & Related papers (2023-10-31T10:15:35Z)
CodeT5+: Open Code Large Language Models for Code Understanding and Generation [72.1638273937025]
Large language models (LLMs) pretrained on vast source code have achieved prominent progress in code intelligence. CodeT5+ is a family of encoder-decoder LLMs for code in which component modules can be flexibly combined to suit a wide range of downstream code tasks. We extensively evaluate CodeT5+ on over 20 code-related benchmarks in different settings, including zero-shot, finetuning, and instruction-tuning.
arXiv Detail & Related papers (2023-05-13T14:23:07Z)
ReACC: A Retrieval-Augmented Code Completion Framework [53.49707123661763]
We propose a retrieval-augmented code completion framework, leveraging both lexical copying and referring to code with similar semantics by retrieval. We evaluate our approach on the code completion task in the Python and Java programming languages, achieving state-of-the-art performance on the CodeXGLUE benchmark.
arXiv Detail & Related papers (2022-03-15T08:25:08Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.