Fugu-MT Paper Translation (Abstract): Culture-Aware Machine Translation in Large Language Models: Benchmarking and Investigation

Paper overview: Culture-Aware Machine Translation in Large Language Models: Benchmarking and Investigation

arxiv url: http://arxiv.org/abs/2604.24361v1
Date: Mon, 27 Apr 2026 11:53:50 GMT
Status: Translation complete
Last updated in system: 2026-04-28 17:12:07.965128
Title: Culture-Aware Machine Translation in Large Language Models: Benchmarking and Investigation
Title (reference translation): Culture-Aware Machine Translation in Large Language Models: Benchmark and Examination
Authors: Zekun Yuan, Yangfan Ye, Xiaocheng Feng, Baohang Li, Qichen Hong, Yunfei Lu, Dandan Tu, Bing Qin
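Abstract summary: Large language models (LLMs) achieve strong performance in general machine translation, but their ability in culture-aware scenarios remains poorly understood. We therefore introduce CanMT, a Culture-Aware Novel-Driven Parallel Dataset for Machine Translation, together with a theoretically grounded, multi-dimensional evaluation framework for assessing cultural translation quality.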
Reference score (independently computed attention measure): 36.27108860941823
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language models (LLMs) have achieved strong performance in general machine translation, yet their ability in culture-aware scenarios remains poorly understood. To bridge this gap, we introduce CanMT, a Culture-Aware Novel-Driven Parallel Dataset for Machine Translation, together with a theoretically grounded, multi-dimensional evaluation framework for assessing cultural translation quality. Leveraging CanMT, we systematically evaluate a wide range of LLMs and translation systems under different translation strategy constraints. Our findings reveal substantial performance disparities across models and demonstrate that translation strategies exert a systematic influence on model behavior. Further analysis shows that translation difficulty varies across types of culture-specific items, and that a persistent gap remains between models' recognition of culture-specific knowledge and their ability to correctly operationalize it in translation outputs. In addition, incorporating reference translations is shown to substantially improve evaluation reliability in LLM-as-a-judge, underscoring their essential role in assessing culture-aware translation quality. The corpus and code are available at CanMT.
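Abstract (reference translation): Large language models (LLMs) achieve strong performance in general machine translation, but their ability in culture-aware scenarios is still poorly understood. To fill this gap, we introduce CanMT (Culture-Aware Novel-Driven Parallel Dataset for Machine Translation) along with a theoretically grounded, multi-dimensional evaluation framework for assessing the quality of cultural translation. Using CanMT, we systematically evaluate a broad range of LLMs and translation systems under different translation strategy constraints. The study reveals marked performance disparities between models and shows that translation strategies have a systematic effect on model behavior. Further analysis indicates that translation difficulty differs by type of culture-specific item, and that a persistent gap remains between a model's recognition of culture-specific knowledge and its ability to apply that knowledge correctly in translation output. In addition, supplying reference translations substantially improves the reliability of LLM-as-a-judge evaluation, which shows that references play an essential role in assessing culture-aware translation quality. The corpus and code are available at CanMT.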

Related papers