Fugu-MT Paper Translation (Abstract): No Resource, No Benchmarks, No Problem? Evaluating and Improving LLMs for Code Generation in No-Resource Languages

Paper abstract: No Resource, No Benchmarks, No Problem? Evaluating and Improving LLMs for Code Generation in No-Resource Languages

arxiv url: http://arxiv.org/abs/2606.16827v1
Date: Mon, 15 Jun 2026 15:08:55 GMT
Status: Translation complete
System update date: 2026-06-16 16:21:34.658773
Title: No Resource, No Benchmarks, No Problem? Evaluating and Improving LLMs for Code Generation in No-Resource Languages
Authors: Alessandro Giagnorio, Alberto Martin-Lopez, Gabriele Bavota
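Abstract summary: Large Language Models (LLMs) have advanced the automation of software engineering tasks. Most research in this area has focused on high-resource languages, such as Python or Java, which benefit from abundant training data. No-resource languages, for which LLMs have seen virtually no training data, remain largely unstudied.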
Reference score (independently computed attention score): 50.536461876371526
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large Language Models (LLMs) have significantly advanced the automation of software engineering tasks. One prominent example is code generation, where an LLM produces code in a specified programming language based on a natural language description. Most research in this area has focused on high-resource languages, such as Python or Java, which benefit from abundant training data. A smaller body of work has explored low-resource languages, which are underrepresented in training corpora. In contrast, no-resource languages, for which LLMs have seen virtually no training data, remain largely unstudied. These languages often emerge in industry, where organizations develop proprietary or domain-specific languages unsupported by commercial tools like GitHub Copilot. As a result, companies need to deploy their own in-house code recommenders. To investigate possible solutions in this context, we build and release three code generation benchmarks for no-resource languages, based on two recently proposed programming languages for which very little training data is available. Using these benchmarks, we experiment with several solutions to teach LLMs about no-resource languages, including prompt-based techniques as well as pre-training and fine-tuning that exploit the little data available. While further pre-training gives the largest performance gains for no-resource languages, applying it directly to instruction-tuned models harms their ability to follow instructions. To address this, we start from a base model, further pre-train it on the target language, and then inject instruction-following capabilities via weight diff transfer from an instruction model. Such an approach significantly improves code generation capabilities in no-resource settings, allowing companies to cheaply deploy a specialized instruct model without dealing with the computational cost of instruction fine-tuning.
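The weight diff transfer step mentioned in the abstract can be illustrated with a toy sketch. The abstract does not give the exact formula, so this assumes the common "task vector" reading: the difference between an instruction-tuned checkpoint and its base checkpoint is added to the base model that was further pre-trained on the target language. The function name `transfer_weight_diff`, the `scale` parameter, and the flat-list weight representation are hypothetical choices for illustration, not the authors' implementation.

```python
from typing import Dict, List

# Hypothetical toy representation: parameter name -> flattened values.
Weights = Dict[str, List[float]]


def transfer_weight_diff(
    base: Weights,
    instruct: Weights,
    adapted: Weights,
    scale: float = 1.0,
) -> Weights:
    """Return adapted + scale * (instruct - base), parameter by parameter.

    Assumption: this additive form is one plausible reading of
    "weight diff transfer"; the paper may differ in its details.
    """
    if not (base.keys() == instruct.keys() == adapted.keys()):
        raise ValueError("all three checkpoints must share the same parameter names")

    merged: Weights = {}
    for name in base:
        b, i, a = base[name], instruct[name], adapted[name]
        if not (len(b) == len(i) == len(a)):
            raise ValueError(f"size mismatch for parameter {name!r}")
        # Add the instruction-tuning delta to the language-adapted weights.
        merged[name] = [a_k + scale * (i_k - b_k) for b_k, i_k, a_k in zip(b, i, a)]
    return merged


# Toy checkpoints (values chosen to be exactly representable as floats).
base = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
instruct = {"layer.weight": [1.5, 2.0], "layer.bias": [0.25]}
adapted = {"layer.weight": [1.0, 3.0], "layer.bias": [-0.5]}

merged = transfer_weight_diff(base, instruct, adapted)
print(merged)
```

Under this assumption the merge is a single pass over the parameters with no gradient computation, which is consistent with the abstract's point that the specialized instruct model is obtained without the cost of instruction fine-tuning.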

Related papers