Fugu-MT Paper Translation (Abstract): ReCode: Updating Code API Knowledge with Reinforcement Learning

Paper overview: ReCode: Updating Code API Knowledge with Reinforcement Learning

arxiv url: http://arxiv.org/abs/2506.20495v1
Date: Wed, 25 Jun 2025 14:41:13 GMT
Status: Translation complete
Last updated in system: 2025-06-26 21:00:42.794583
Title: ReCode: Updating Code API Knowledge with Reinforcement Learning
Title (reference translation): ReCode: Updating Code API Knowledge with Reinforcement Learning
Authors: Haoze Wu, Yunzhi Yao, Wenhao Yu, Huajun Chen, Ningyu Zhang
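Abstract summary: Large language models (LLMs) exhibit remarkable code generation capabilities but falter when adapting to frequent updates in external library APIs. ReCode is a novel framework that mimics how human programmers adapt to API changes. The experiments show that ReCode substantially improves LLMs' code generation performance in dynamic API scenarios.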
Reference score (independently computed attention score): 45.077641074621816
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large Language Models (LLMs) exhibit remarkable code generation capabilities but falter when adapting to frequent updates in external library APIs. This critical limitation, stemming from reliance on outdated API knowledge from their training data, even with access to current documentation, impedes reliable code generation in dynamic environments. To tackle this issue, we propose ReCode (rule-based Reinforcement learning for Code Update), a novel framework that mimics human programmer adaptation to API changes. Specifically, we construct a dataset of approximately 2,000 data entries to train the LLMs to perform version migration based on updated information. Then, we introduce a modified string similarity metric for code evaluation as the reward for reinforcement learning. Our experiments demonstrate that ReCode substantially boosts LLMs' code generation performance in dynamic API scenarios, especially on the unseen CodeUpdateArena task. Crucially, compared to supervised fine-tuning, ReCode has less impact on LLMs' general code generation abilities. We apply ReCode to various LLMs and reinforcement learning algorithms (GRPO and DAPO), all achieving consistent improvements. Notably, after training, Qwen2.5-Coder-7B outperforms the 32B-parameter code instruction-tuned model and the reasoning model with the same architecture. Code is available at https://github.com/zjunlp/ReCode.
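Abstract (reference translation): Large language models (LLMs) show excellent code generation ability but struggle to adapt to frequent updates in external library APIs. This critical limitation arises from their reliance on outdated API knowledge in the training data and, even when current documentation is available, it hinders reliable code generation in dynamic environments. To address this problem, the authors propose ReCode (rule-based Reinforcement learning for Code Update), a novel framework that mimics how human programmers adapt to API changes. Specifically, they build a dataset of about 2,000 entries and train LLMs to perform version migration based on update information. They then introduce a modified string similarity metric for code evaluation as the reward for reinforcement learning. The experiments show that ReCode substantially improves LLMs' code generation performance in dynamic API scenarios, particularly on the unseen CodeUpdateArena task. Importantly, compared with supervised fine-tuning, ReCode has less impact on the general code generation ability of LLMs. ReCode is applied to various LLMs and reinforcement learning algorithms (GRPO and DAPO), with consistent improvements in every case. After training, Qwen2.5-Coder-7B outperforms both the 32B-parameter code instruction-tuned model and the reasoning model with the same architecture. Code is available at https://github.com/zjunlp/ReCode.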
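
Illustrative sketch (not the authors' implementation): the abstract describes using a modified string similarity metric as the reinforcement learning reward, but does not specify the metric. The Python sketch below substitutes the standard library's difflib.SequenceMatcher as a stand-in and then normalizes rewards within a group of sampled completions, in the style commonly associated with GRPO. The function names, the whitespace normalization, the use of population standard deviation, and the lib.old_call / lib.new_call snippets are all hypothetical choices made for illustration.

```python
from difflib import SequenceMatcher
from statistics import mean, pstdev


def similarity_reward(generated: str, reference: str) -> float:
    """Return a reward in [0, 1] based on character-level similarity.

    Assumption: difflib's ratio stands in for the paper's modified
    string similarity metric, which the abstract does not define.
    """
    # Collapse runs of whitespace so formatting differences are not penalized.
    gen = " ".join(generated.split())
    ref = " ".join(reference.split())
    return SequenceMatcher(None, gen, ref, autojunk=False).ratio()


def group_relative_advantages(rewards: list[float]) -> list[float]:
    """Normalize rewards within one group of samples for the same prompt.

    Assumption: advantage = (reward - group mean) / group std, using the
    population standard deviation; a zero-variance group yields zeros.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    if sigma == 0.0:
        return [0.0 for _ in rewards]
    return [(r - mu) / sigma for r in rewards]


# Hypothetical migration target and three sampled completions.
reference = "result = lib.new_call(x)"
samples = [
    "result = lib.new_call(x)",    # uses the updated API
    "result = lib.old_call(x)",    # still uses the outdated API
    "result  =  lib.old_call(x)",  # outdated API, different spacing
]

rewards = [similarity_reward(s, reference) for s in samples]
advantages = group_relative_advantages(rewards)
```

Under these assumptions, the completion that matches the migrated code receives the highest reward and therefore the only positive advantage in its group, while the two outdated completions are scored identically because whitespace is normalized before comparison.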
