Fugu-MT Paper Translation (Abstract): Extracting Small Translation Specialists from LLMs by Aggressively Pruning Experts

Paper overview: Extracting Small Translation Specialists from LLMs by Aggressively Pruning Experts

arxiv url: http://arxiv.org/abs/2605.28042v1
Date: Wed, 27 May 2026 06:46:42 GMT
Status: Translation complete
Last updated in system: 2026-05-28 17:38:55.817755
Title: Extracting Small Translation Specialists from LLMs by Aggressively Pruning Experts
Title (machine translation, for reference): Extraction of Small Translation Specialists from LLMs by Aggressively Pruning Experts
Authors: Liu O. Martin, Lucas Bandarkar, Nanyun Peng
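Abstract summary: Modern large language models (LLMs) achieve state-of-the-art machine translation performance, but they are broad generalists trained for many tasks and capabilities unrelated to translation. This paper presents a method for aggressively pruning experts from modern mixture-of-experts LLMs while incurring negligible degradation in translation quality.
Reference score (attention metric computed independently by this site): 41.27464926788608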
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Modern large language models (LLMs) achieve state-of-the-art machine translation performance, but they do so as broad generalists largely trained for many tasks and capabilities unrelated to translation. Thus, they are heavily overparameterized for this task, resulting in excessive memory and compute requirements. In this paper, we present a method for aggressively pruning experts from modern mixture-of-experts LLMs while incurring negligible degradation in translation quality. Our approach exploits expert specialization and the separability of multilingual capabilities in LLMs to identify experts irrelevant to translation. And because of the modular nature of MoEs, these can be easily pruned without any training. Without retraining, we are able to prune half of all experts with negligible degradation and 70% with only minor losses. With a very short SFT, we prune 75% of experts while recovering baseline performance, and in some settings remove nearly 90% while maintaining reasonable translation quality. Overall, our results show that translation requires only a fraction of the LLM, enabling substantial compression of the MoE blocks that contain over 90% of parameters.
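Abstract (machine translation, for reference): Modern large language models (LLMs) achieve state-of-the-art machine translation performance, but they are broad generalists trained for many tasks and capabilities unrelated to translation. As a result, they are heavily overparameterized for this task, which leads to excessive memory and compute requirements. This paper proposes a method for aggressively pruning experts from modern mixture-of-experts LLMs while incurring negligible degradation in translation quality. The method exploits expert specialization and the separability of multilingual capabilities in LLMs to identify experts that are irrelevant to translation. Because MoEs are modular, these experts can be pruned easily without any training. Without retraining, half of all experts can be pruned with negligible degradation, and 70% with only minor losses. With a very short SFT, 75% of experts can be pruned while recovering baseline performance, and in some settings nearly 90% can be removed while maintaining reasonable translation quality. Overall, the results indicate that translation requires only a fraction of the LLM, enabling substantial compression of the MoE blocks, which contain over 90% of parameters.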
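
The pruning idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract does not state how translation-irrelevant experts are identified, so the sketch assumes a simple stand-in criterion (how often each expert is routed to on translation data). The function names, the toy counts, and the `keep_fraction` parameter are all hypothetical. The second function works through the arithmetic implied by the abstract's figures, treating the MoE share of parameters as exactly 90% (the abstract says "over 90%") and assuming all experts are the same size.

```python
import math
from typing import List


def select_experts_to_keep(
    routing_counts: List[List[int]], keep_fraction: float
) -> List[List[int]]:
    """Pick, per MoE layer, the experts to keep after pruning.

    ASSUMPTION: experts are ranked by how often the router selected them
    on translation data. This is an illustrative criterion only; the
    abstract does not specify the paper's actual selection rule.
    """
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    kept_per_layer = []
    for counts in routing_counts:
        # Always keep at least one expert per layer.
        n_keep = max(1, math.ceil(len(counts) * keep_fraction))
        # Rank by activation count, descending; ties go to the lower index.
        ranked = sorted(range(len(counts)), key=lambda i: (-counts[i], i))
        kept_per_layer.append(sorted(ranked[:n_keep]))
    return kept_per_layer


def remaining_parameter_fraction(moe_share: float, pruned_fraction: float) -> float:
    """Fraction of total parameters left after pruning experts.

    ASSUMPTION: all experts have equal size, so pruning a fraction of the
    experts removes the same fraction of the MoE-block parameters.
    """
    return 1.0 - moe_share * pruned_fraction


# Hypothetical routing counts for two layers with eight experts each.
counts = [
    [120, 3, 0, 45, 7, 0, 88, 1],
    [2, 60, 0, 0, 95, 4, 1, 30],
]
kept = select_experts_to_keep(counts, keep_fraction=0.25)
print(kept)  # [[0, 6], [1, 4]]

# Pruning 75% of experts when MoE blocks hold 90% of parameters
# leaves roughly a third of the model.
print(round(remaining_parameter_fraction(0.9, 0.75), 3))  # 0.325
```

Under these assumptions, keeping 25% of experts removes about two thirds of all parameters. In a real model the router weights for pruned experts would also have to be dropped or masked, which this sketch does not cover.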

Related papers