Fugu-MT Paper Translation (Abstract): CODEBLOCK: Learning to Supervise Code at the Right Granularity

Paper summary: CODEBLOCK: Learning to Supervise Code at the Right Granularity

arxiv url: http://arxiv.org/abs/2606.18286v1
Date: Wed, 10 Jun 2026 04:46:04 GMT
Status: Translation complete
Last updated in system: 2026-06-18 17:16:50.791791
Title: CODEBLOCK: Learning to Supervise Code at the Right Granularity
Title (reference translation): CODEBLOCK: Learning to Supervise Code at the Right Granularity
Authors: Zhijie Deng, Ling Li, Jinlong Pang, Kaiqin Hu, Qi Xuan, Zhaowei Zhu, Jiaheng Wei
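Abstract summary: The authors propose a structure-aware sparse supervision framework that selects structure-complete code evidence rather than isolated tokens. Experiments show that CodeBlock achieves stronger average pass@1 than full-token SFT and competitive selection baselines.
Reference score (attention score computed independently by this site): 32.949996770189834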
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Supervised fine-tuning of code LLMs typically applies uniform cross-entropy loss to all response tokens, implicitly assuming that every token provides equally useful learning signal. Recent token-level selection methods challenge this assumption in natural-language SFT by supervising only high-value tokens. However, directly transferring token-level masking to code can break syntactically and semantically coherent program units, because code depends on structural completeness and definition-use relations. We therefore propose CodeBlock, a structure-aware sparse supervision framework that selects structure-complete code evidence rather than isolated tokens. CodeBlock first selects high-quality instruction-response pairs, then partitions code responses into syntactically coherent coding items, estimates their utility by aggregating generalized cross-entropy over core logic tokens, and reranks them with data-flow reach and bridge signals to prioritize blocks that propagate or connect important program dependencies. During training, the full response remains available as context, while loss is applied only to selected code items and informative natural-language tokens. Experiments on six code-generation benchmarks show that CodeBlock achieves stronger average pass@1 than full-token SFT and competitive selection baselines, while using only 1.9% of supervised response tokens.
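Abstract (reference translation): Supervised fine-tuning of code LLMs usually applies a uniform cross-entropy loss to every response token, implicitly assuming that all tokens provide an equally useful learning signal. Recent token-level selection methods challenge this assumption in natural-language SFT by supervising only high-value tokens. However, transferring token-level masking directly to code can break syntactically and semantically coherent program units, because code depends on structural completeness and definition-use relations. The authors therefore propose CodeBlock, a structure-aware sparse supervision framework that selects structure-complete code evidence rather than isolated tokens. CodeBlock first selects high-quality instruction-response pairs, then partitions code responses into syntactically coherent coding items, estimates their utility by aggregating generalized cross-entropy over core logic tokens, and reranks them with data-flow reach and bridge signals to prioritize blocks that propagate or connect important program dependencies. During training, the full response remains available as context, while loss is applied only to the selected code items and informative natural-language tokens. In experiments on six code-generation benchmarks, CodeBlock achieves stronger average pass@1 than full-token SFT and competitive selection baselines while using only 1.9% of the supervised response tokens.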
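
The block-level selection and loss masking described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes Python responses, treats each top-level statement as one "coding item", scores a block by the plain mean of hypothetical per-line losses (the paper aggregates generalized cross-entropy over core logic tokens), and omits the data-flow reranking step entirely. All function names and the sample losses are made up for illustration.

```python
import ast

# Hypothetical code response and per-line loss values (illustrative only).
SOURCE = """import math

def area(r):
    return math.pi * r * r

print(area(2.0))
"""
LINE_LOSSES = [0.25, 0.0, 1.0, 2.0, 0.0, 0.5]


def split_into_blocks(source):
    """Return (start_line, end_line) spans, 1-indexed and inclusive,
    one per top-level statement, so every block is syntactically whole."""
    tree = ast.parse(source)
    return [(node.lineno, node.end_lineno) for node in tree.body]


def block_utility(line_losses, span):
    """Assumed utility: mean loss over the lines covered by the block."""
    start, end = span
    values = line_losses[start - 1:end]
    return sum(values) / len(values)


def select_blocks(source, line_losses, keep):
    """Keep the `keep` highest-utility blocks, returned in source order."""
    spans = split_into_blocks(source)
    ranked = sorted(spans, key=lambda s: block_utility(line_losses, s), reverse=True)
    return sorted(ranked[:keep])


def loss_mask(num_lines, selected):
    """Build a 0/1 mask: 1 for lines inside a selected block, else 0."""
    mask = [0] * num_lines
    for start, end in selected:
        for i in range(start - 1, end):
            mask[i] = 1
    return mask


def masked_mean_loss(line_losses, mask):
    """Average the loss over supervised lines only; the unselected lines
    still exist as context but contribute nothing to the loss."""
    count = sum(mask)
    if count == 0:
        return 0.0
    return sum(l * m for l, m in zip(line_losses, mask)) / count


if __name__ == "__main__":
    selected = select_blocks(SOURCE, LINE_LOSSES, keep=1)
    mask = loss_mask(len(SOURCE.splitlines()), selected)
    print(selected, mask, masked_mean_loss(LINE_LOSSES, mask))
```

Selecting whole statements rather than individual lines or tokens is what keeps a function header and its body supervised together. A real pipeline would work at the token level and would need a parser for each target language.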


Related papers