Fugu-MT Paper Translation (Abstract): MultiPL-MoE: Multi-Programming-Lingual Extension of Large Language Models through Hybrid Mixture-of-Experts

Paper abstract: MultiPL-MoE: Multi-Programming-Lingual Extension of Large Language Models through Hybrid Mixture-of-Experts

arxiv url: http://arxiv.org/abs/2508.19268v2
Date: Mon, 08 Sep 2025 08:30:07 GMT
Status: Translation complete
Last updated in system: 2025-09-09 14:07:03.320436
Title: MultiPL-MoE: Multi-Programming-Lingual Extension of Large Language Models through Hybrid Mixture-of-Experts
Authors: Qing Wang, Xue Han, Jiahui Wang, Lehao Xing, Qian Hu, Lianlian Zhang, Chao Deng, Junlan Feng
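Abstract summary: MultiPL-MoE is a hybrid of token-level and segment-level mixtures of experts. The segment-level MoE incorporates two innovative designs to better capture the syntactic structure and contextual patterns of programming languages.
Reference score (independently computed attention score): 56.106778414865126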
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Despite LLMs' excellent code creation capabilities, multilingual code generation remains extremely challenging. To address this, we intend to improve the multi-programming-lingual (MultiPL) performance of base LLMs while retaining performance on the most popular programming languages, under restricted computational resources. We consider MultiPL to be a special case of multiple natural languages and propose a MultiPL extension of LLMs utilizing a hybrid mixture of experts (MoE), called MultiPL-MoE. Specifically, MultiPL-MoE combines two paired MoEs to optimize expert selection at both the token and segment levels. The token-level MoE is a standard upcycling MoE structure with a shared expert and a novel gate weight normalization approach that aids in the final fusion with the segment-level MoE. The segment-level MoE incorporates two innovative designs to better capture the syntactic structure and contextual patterns of programming languages: first, it uses a sliding window to partition the input token sequence into multiple segments; second, it adopts an expert-choice routing strategy that allows experts to select the top-k segments. Experimental results demonstrate the effectiveness of MultiPL-MoE.
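Illustrative sketch (not from the paper): the two segment-level steps named in the abstract, sliding-window partitioning followed by expert-choice top-k selection, can be pictured roughly as below. The function names, the `window`/`stride` parameters, and the hand-written affinity scores are assumptions made for illustration only; the abstract does not say how segments are scored or what window sizes are used, and a real model would obtain the scores from a learned gating network.

```python
from typing import List, Sequence


def sliding_window_segments(tokens: Sequence[int], window: int, stride: int) -> List[List[int]]:
    """Partition a token sequence into fixed-size, possibly overlapping segments.

    Assumption for this sketch: trailing tokens that do not fill a whole
    window are dropped.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    if not tokens:
        return []
    if len(tokens) <= window:
        return [list(tokens)]
    return [
        list(tokens[start:start + window])
        for start in range(0, len(tokens) - window + 1, stride)
    ]


def expert_choice_route(scores: List[List[float]], k: int) -> List[List[int]]:
    """Expert-choice routing: each expert (row) picks its k highest-scoring segments.

    scores[e][s] is a hypothetical affinity of expert e for segment s.
    Returns, per expert, the chosen segment indices in ascending order.
    """
    routed = []
    for expert_scores in scores:
        ranked = sorted(
            range(len(expert_scores)),
            key=lambda s: expert_scores[s],
            reverse=True,
        )
        routed.append(sorted(ranked[:k]))
    return routed


# Toy usage: 10 tokens, window of 4, stride of 2 gives 4 segments.
tokens = list(range(10))
segments = sliding_window_segments(tokens, window=4, stride=2)

# Made-up affinities for 2 experts over the 4 segments.
scores = [
    [0.1, 0.9, 0.3, 0.5],
    [0.8, 0.2, 0.7, 0.1],
]
assignment = expert_choice_route(scores, k=2)
print(segments)
print(assignment)
```

In this arrangement the experts choose segments, rather than each segment choosing experts, so every expert processes exactly k segments, while a given segment may be picked by several experts or by none.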

Related papers