Fugu-MT Paper Translation (Abstract): Framework-Aware Code Generation with API Knowledge Graph-Constructed Data: A Study on HarmonyOS

Paper overview: Framework-Aware Code Generation with API Knowledge Graph-Constructed Data: A Study on HarmonyOS

arxiv url: http://arxiv.org/abs/2512.00380v1
Date: Sat, 29 Nov 2025 08:13:54 GMT
Status: Translation complete
Last updated in system: 2025-12-02 19:46:34.207114
Title: Framework-Aware Code Generation with API Knowledge Graph-Constructed Data: A Study on HarmonyOS
Title (reference translation): Framework-Aware Code Generation Using API Knowledge Graph-Constructed Data: A Study of HarmonyOS
Authors: Mingwei Liu, Zheng Pei, Yanlin Wang, Zihao Wang, Zikang Li, Enci Lin, Xin Peng, Zibin Zheng
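Abstract summary: APIKG4SYN is a framework designed to exploit API knowledge graphs for constructing API-oriented question-code pairs. Using APIKG4SYN, the authors built the first benchmark for HarmonyOS code generation.
Reference score (independently computed attention score): 52.483888557864326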
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In the context of software frameworks with limited resources (such as HarmonyOS), large language models (LLMs) often exhibit poor code generation performance because they lack sufficient exposure to such environments during pre-training. Although LLMs can usually maintain correct logical structures across programming languages, they frequently struggle when dealing with framework-specific APIs or syntax, resulting in errors. This indicates that while pre-training equips LLMs with general algorithmic capabilities, they remain unfamiliar with the distinctive syntax and API usage of underrepresented frameworks. As a result, even advanced commercial models like GPT-4o cannot reliably generate correct code without prior adaptation. To address this issue, we propose APIKG4SYN, a framework designed to exploit API knowledge graphs for the construction of API-oriented question-code pairs, specifically tailored for low-resource frameworks without requiring executable code. APIKG4SYN integrates both single-API and multi-API knowledge, where the latter is derived through uncertainty estimation (UE)-driven Monte Carlo Tree Search (MCTS), enabling the creation of a diverse and informative dataset for fine-tuning LLMs. Using HarmonyOS as a case study, we build the first benchmark for HarmonyOS code generation. Experimental results show that fine-tuning Qwen with APIKG4SYN raises pass@1 accuracy to 25.00%, compared with 17.59% for the baseline GPT model. These results confirm that API-oriented data significantly enhance LLM performance in low-resource software development scenarios.
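Abstract (reference translation): In the context of software frameworks with limited resources, such as HarmonyOS, large language models (LLMs) often generate poor code because they were not sufficiently exposed to such environments during pre-training. LLMs can usually keep the logical structure correct across programming languages, but they frequently struggle with framework-specific APIs and syntax, which leads to errors. This suggests that pre-training gives LLMs general algorithmic ability while leaving them unfamiliar with the distinctive syntax and API usage of underrepresented frameworks. As a result, even advanced commercial models such as GPT-4o cannot reliably generate correct code without prior adaptation. To address this, we propose APIKG4SYN, a framework designed to exploit API knowledge graphs to construct API-oriented question-code pairs. APIKG4SYN integrates single-API and multi-API knowledge, with the latter derived through Monte Carlo Tree Search (MCTS) driven by uncertainty estimation (UE). Using HarmonyOS as a case study, we build the first benchmark for HarmonyOS code generation. Experiments show that fine-tuning Qwen with APIKG4SYN raises pass@1 accuracy to 25.00%, compared with 17.59% for the baseline GPT model. These results confirm that API-oriented data significantly improves LLM performance in low-resource software development scenarios.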
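
Note on the pass@1 metric: the abstract reports pass@1 scores but does not say how many samples were drawn per task or how the scores were aggregated. The sketch below shows one common way such a score could be computed, using the unbiased pass@k estimator popularized by the Codex evaluation (Chen et al., 2021). It is an assumption about the protocol, not the authors' evaluation code. The function names `pass_at_k` and `benchmark_pass_at_k` and the toy data are hypothetical and do not come from the paper.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for a single task.

    n: number of samples generated for the task
    c: number of those samples that passed the tests
    k: number of attempts the metric allows
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # If fewer than k samples failed, any k-subset contains a passing sample.
    if n - c < k:
        return 1.0
    # 1 - P(all k drawn samples are failures)
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average the per-task estimates over a benchmark of (n, c) pairs."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Hypothetical toy data (not from the paper): 4 tasks, 1 sample each, 1 passing.
toy_results = [(1, 1), (1, 0), (1, 0), (1, 0)]
score = benchmark_pass_at_k(toy_results, k=1)
print(f"pass@1 = {score:.2%}")  # prints "pass@1 = 25.00%"
```

With one sample per task, pass@1 reduces to the fraction of tasks whose single generated solution passes. With more samples per task, the estimator averages over all size-k subsets instead of relying on one draw.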

Related papers