Fugu-MT Paper Translation (Abstract): Improving LLM Code Generation via Requirement-Aware Curriculum Reinforcement Learning

Paper abstract: Improving LLM Code Generation via Requirement-Aware Curriculum Reinforcement Learning

arxiv url: http://arxiv.org/abs/2605.00433v1
Date: Fri, 01 May 2026 06:10:03 GMT
Status: Translation complete
Last updated in system: 2026-05-04 17:43:28.866375
Title: Improving LLM Code Generation via Requirement-Aware Curriculum Reinforcement Learning
Title (reference translation): Improving LLM code generation through requirement-aware curriculum reinforcement learning
Authors: Shouyu Yin, Zhao Tian, Junjie Chen, Shikai Guo
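Abstract summary: This paper proposes RECRL, a requirement-aware curriculum reinforcement learning framework for enhancing code generation based on large language models (LLMs). The paper reports that RECRL achieves an average Pass@1 improvement of 1.23%-5.62% over all state-of-the-art baselines.
Reference score (independently computed attention score): 9.407248347872931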
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Code generation, which aims to automatically generate source code from given programming requirements, has the potential to substantially improve software development efficiency. With the rapid advancement of large language models (LLMs), LLM-based code generation has attracted widespread attention from both academia and industry. However, as programming requirements become increasingly complex, existing LLMs still exhibit notable performance limitations. To address this challenge, recent studies have proposed training-based curriculum reinforcement learning (CRL) strategies to improve LLM code generation performance. Despite their effectiveness, existing CRL approaches suffer from several limitations, including misaligned requirement difficulty perception, the absence of requirement difficulty optimization, and suboptimal curriculum sampling strategies. In CRL-based code generation, programming requirements serve as the sole input to the model, making their quality and difficulty critical to training effectiveness. Motivated by insights from software requirements engineering, we propose RECRL, a novel requirement-aware curriculum reinforcement learning framework for enhancing LLM-based code generation. RECRL automatically perceives model-specific requirement difficulty, optimizes challenging requirements to improve training data utilization, and employs an adaptive curriculum sampling strategy to construct training batches with smoothly varying difficulty. Extensive experiments on five state-of-the-art LLMs across five widely used code generation benchmarks, compared against five state-of-the-art baselines, demonstrate the significant effectiveness of RECRL. For example, RECRL achieves an average Pass@1 improvement of 1.23%-5.62% over all state-of-the-art baselines.
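Abstract (reference translation): Code generation, which aims to automatically generate source code from given programming requirements, has the potential to substantially improve software development efficiency. With the rapid advancement of large language models (LLMs), LLM-based code generation has attracted wide attention from both academia and industry. However, as programming requirements grow increasingly complex, existing LLMs still show notable performance limitations. To address this challenge, recent studies have proposed training-based curriculum reinforcement learning (CRL) strategies to improve LLM code generation performance. Despite their effectiveness, existing CRL approaches have several limitations, including misaligned perception of requirement difficulty, the absence of requirement difficulty optimization, and suboptimal curriculum sampling strategies. In CRL-based code generation, programming requirements are the sole input to the model, so their quality and difficulty are critical to training effectiveness. RECRL, which draws on insights from software requirements engineering, is a novel requirement-aware curriculum reinforcement learning framework for enhancing LLM-based code generation. RECRL automatically perceives model-specific requirement difficulty, optimizes challenging requirements to improve training data utilization, and uses an adaptive curriculum sampling strategy to construct training batches whose difficulty varies smoothly. Extensive experiments on five state-of-the-art LLMs across five widely used code generation benchmarks, compared against five state-of-the-art baselines, demonstrate the effectiveness of RECRL. For example, RECRL achieves an average Pass@1 improvement of 1.23%-5.62% over all state-of-the-art baselines.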
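
Note: The abstract reports results in Pass@1 and describes perceiving model-specific requirement difficulty, then building batches with smoothly varying difficulty. The sketch below is only an illustration of those two ideas and is not RECRL's actual procedure, which the abstract does not specify. It assumes that difficulty can be approximated as one minus the model's empirical Pass@1 on a requirement, and it replaces the paper's adaptive sampling with a plain easy-to-hard ordering. The `pass_at_k` function uses the widely used unbiased pass@k estimator; the names `difficulty` and `curriculum_batches` are hypothetical.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate: 1 - C(n - c, k) / C(n, k).

    n: number of generated samples, c: samples that pass all tests.
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Every size-k subset must contain at least one passing sample.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def difficulty(n: int, c: int) -> float:
    """Assumed proxy (not from the paper): 1 - Pass@1 for this model."""
    return 1.0 - pass_at_k(n, c, 1)


def curriculum_batches(results: dict[str, tuple[int, int]],
                       batch_size: int) -> list[list[str]]:
    """Group requirement ids into batches ordered from easy to hard.

    results maps a requirement id to (n_samples, n_correct).
    Ties are broken by id so the ordering is deterministic.
    """
    ordered = sorted(results, key=lambda rid: (difficulty(*results[rid]), rid))
    return [ordered[i:i + batch_size]
            for i in range(0, len(ordered), batch_size)]
```

For k = 1 the estimator reduces to c / n, so a requirement solved in 3 of 10 samples has an estimated Pass@1 of 0.3 and, under the assumed proxy, a difficulty of 0.7. Because difficulty is measured from the model's own samples, the same requirement can rank differently for different models.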

Related papers