Fugu-MT Paper Translation (Abstract): To Diff or Not to Diff? Structure-Aware and Adaptive Output Formats for Efficient LLM-based Code Editing

Paper abstract: To Diff or Not to Diff? Structure-Aware and Adaptive Output Formats for Efficient LLM-based Code Editing

arxiv url: http://arxiv.org/abs/2604.27296v1
Date: Thu, 30 Apr 2026 01:14:13 GMT
Status: Translation complete
Last updated in system: 2026-05-01 16:31:53.860157
Title: To Diff or Not to Diff? Structure-Aware and Adaptive Output Formats for Efficient LLM-based Code Editing
Authors: Wei Cheng, Yongchang Cao, Chen Shen, Binhua Li, Jue Chen, Yongbin Li, Wei Hu
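Abstract summary: BlockDiff and FuncDiff are structure-aware diff formats that represent changes as block-level rewrites of syntactically coherent units. AdaEdit is a general adaptive edit strategy that trains LLMs to dynamically choose the most token-efficient format between a given diff format and full code.
Reference score (independently computed attention metric): 54.399335188667045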
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large Language Models (LLMs) are increasingly used for code editing, yet the prevalent full-code generation paradigm suffers from severe efficiency bottlenecks, posing challenges for interactive coding assistants that demand low latency and low cost. Despite the predominant focus on scaling model capabilities, the edit format itself has been largely overlooked in model training. In this paper, we begin with a systematic study of conventional diff formats and reveal that fragile offsets and fragmented hunks make generation highly unnatural for LLMs. To address this, we introduce BlockDiff and FuncDiff, two structure-aware diff formats that represent changes as block-level rewrites of syntactically coherent units such as control structures and functions. Furthermore, we propose AdaEdit, a general adaptive edit strategy that trains LLMs to dynamically choose the most token-efficient format between a given diff format and full code. Extensive experiments demonstrate that AdaEdit paired with structure-aware diff formats consistently matches the accuracy of full-code generation, while reducing both latency and cost by over 30% on long-code editing tasks.
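The two ideas in the abstract, rewriting a whole syntactic unit instead of emitting offset-based hunks, and picking whichever output is cheaper in tokens, can be illustrated with a small Python sketch. Everything concrete here is an assumption made for illustration: the `<<<REWRITE name>>>` marker syntax, the helper names, and the whitespace-based token count are hypothetical and are not taken from the paper, which does not specify its format syntax in the abstract. The paper trains the LLM itself to make the choice; this sketch only computes the same criterion offline.

```python
import ast
from typing import Dict, Optional, Tuple


def top_level_functions(source: str) -> Dict[str, str]:
    """Map each top-level function name to its source text (decorators excluded)."""
    tree = ast.parse(source)
    return {
        node.name: ast.get_source_segment(source, node)
        for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }


def function_level_edit(old: str, new: str) -> Optional[str]:
    """Express the change as whole-function rewrites, or None if that is not possible.

    Hypothetical format: each changed function is emitted in full between
    '<<<REWRITE name>>>' and '<<<END>>>' markers (an illustrative syntax only).
    """
    old_funcs, new_funcs = top_level_functions(old), top_level_functions(new)
    if old_funcs.keys() != new_funcs.keys():
        return None  # functions added, removed, or renamed: not handled here
    changed = [n for n in old_funcs if old_funcs[n] != new_funcs[n]]

    # Check that applying the rewrites to the old code reproduces the new code.
    patched = old
    for name in changed:
        patched = patched.replace(old_funcs[name], new_funcs[name], 1)
    if patched != new:
        return None  # something outside the rewritten functions changed

    return "\n".join(
        f"<<<REWRITE {name}>>>\n{new_funcs[name]}\n<<<END>>>" for name in changed
    )


def approx_tokens(text: str) -> int:
    """Crude token proxy: whitespace-separated pieces (a real tokenizer would differ)."""
    return len(text.split())


def choose_format(old: str, new: str) -> Tuple[str, str]:
    """Return ('diff', edit) when the rewrite is shorter, else ('full', new)."""
    edit = function_level_edit(old, new)
    if edit is not None and approx_tokens(edit) < approx_tokens(new):
        return "diff", edit
    return "full", new


OLD = '''def area(w, h):
    return w + h


def describe(w, h):
    a = area(w, h)
    label = "big" if a > 100 else "small"
    return f"{label} rectangle of area {a}"
'''
NEW = OLD.replace("w + h", "w * h")

fmt, payload = choose_format(OLD, NEW)
print(fmt)
print(payload)
```

In this example only `area` changes, so the single-function rewrite is shorter than the full file and the sketch selects the diff-style payload. For a file that consists of one short function, the marker overhead makes the full code the cheaper output, which is the kind of case the adaptive strategy is meant to cover.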

Related papers