Fugu-MT Paper Translation (Abstract): "Be My Cheese?": Assessing Cultural Nuance in Multilingual LLM Translations

Paper abstract: "Be My Cheese?": Assessing Cultural Nuance in Multilingual LLM Translations

arxiv url: http://arxiv.org/abs/2509.21577v1
Date: Thu, 25 Sep 2025 20:55:36 GMT
Status: Translation complete
Last updated in system: 2025-09-29 20:57:54.009771
Title: "Be My Cheese?": Assessing Cultural Nuance in Multilingual LLM Translations
Title (reference translation): "Be My Cheese?": Assessing Cultural Nuance in Multilingual LLM Translations
Authors: Madison Van Doren, Cory Holland
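Abstract summary: This pilot study examines the localisation capabilities of state-of-the-art multilingual AI models when translating figurative language. It focuses on cultural appropriateness and overall localisation quality, which are critical factors for real-world applications such as marketing and e-commerce.
Reference score (independently computed attention score): 0.0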
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This pilot study explores the localisation capabilities of state-of-the-art multilingual AI models when translating figurative language, such as idioms and puns, from English into a diverse range of global languages. It expands on existing LLM translation research and industry benchmarks, which emphasise grammatical accuracy and token-level correctness, by focusing on cultural appropriateness and overall localisation quality - critical factors for real-world applications like marketing and e-commerce. To investigate these challenges, this project evaluated a sample of 87 LLM-generated translations of e-commerce marketing emails across 24 regional dialects of 20 languages. Human reviewers fluent in each target language provided quantitative ratings and qualitative feedback on faithfulness to the original's tone, meaning, and intended audience. Findings suggest that, while leading models generally produce grammatically correct translations, culturally nuanced language remains a clear area for improvement, often requiring substantial human refinement. Notably, even high-resource global languages, despite topping industry benchmark leaderboards, frequently mistranslated figurative expressions and wordplay. This work challenges the assumption that data volume is the most reliable predictor of machine translation quality and introduces cultural appropriateness as a key determinant of multilingual LLM performance - an area currently underexplored in existing academic and industry benchmarks. As a proof of concept, this pilot highlights limitations of current multilingual AI systems for real-world localisation use cases. Results of this pilot support the opportunity for expanded research at greater scale to deliver generalisable insights and inform deployment of reliable machine translation workflows in culturally diverse contexts.
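Abstract (reference translation): This pilot study examines the localisation capabilities of state-of-the-art multilingual AI models when translating figurative language, such as idioms and puns, from English into a diverse range of global languages. By focusing on cultural appropriateness and overall localisation quality, it extends existing LLM translation research and industry benchmarks, which emphasise grammatical accuracy and token-level correctness. To investigate these challenges, the project evaluated a sample of 87 LLM-generated translations of e-commerce marketing emails across 24 regional dialects of 20 languages. Human reviewers fluent in each target language provided quantitative ratings and qualitative feedback on faithfulness to the original's tone, meaning, and intended audience. The findings suggest that leading models generally produce grammatically correct translations, but culturally nuanced language remains a clear area for improvement and often requires substantial human refinement. Notably, even high-resource global languages, despite topping industry benchmark leaderboards, frequently mistranslated figurative expressions and wordplay. This work challenges the assumption that data volume is the most reliable predictor of machine translation quality, and it introduces cultural appropriateness as a key determinant of multilingual LLM performance. As a proof of concept, the pilot highlights the limitations of current multilingual AI systems for real-world localisation use cases. Its results support the opportunity for expanded research at greater scale to deliver generalisable insights and inform the deployment of reliable machine translation workflows in culturally diverse contexts.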


Related papers