Fugu-MT Paper Translation (Summary): Testing the Limits of Machine Translation from One Book

Paper summary: Testing the Limits of Machine Translation from One Book

arxiv url: http://arxiv.org/abs/2508.06665v1
Date: Fri, 08 Aug 2025 19:27:44 GMT
Status: Translation complete
System last updated: 2025-08-12 21:23:28.499106
Title: Testing the Limits of Machine Translation from One Book
Authors: Jonathan Shaw, Dillon Mee, Timothy Khouw, Zackary Leech, Daniel Wilson
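Abstract summary: Current state-of-the-art models demonstrate the capacity to leverage in-context learning to translate into previously unseen languages. The paper focuses on Kanuri, a language with minimal digital resources despite its substantial speaker population.
Reference score (attention score computed independently by the site): 0.0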
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Current state-of-the-art models demonstrate the capacity to leverage in-context learning to translate into previously unseen language contexts. Tanzer et al. [2024] utilize language materials (e.g., a grammar) to improve translation quality for Kalamang using large language models (LLMs). We focus on Kanuri, a language that, despite having a substantial speaker population, has minimal digital resources. We design two datasets for evaluation: one focused on health and humanitarian terms, and another containing generalized terminology, investigating how domain-specific tasks impact LLM translation quality. By providing different combinations of language resources (grammar, dictionary, and parallel sentences), we measure LLM translation effectiveness, comparing results to native speaker translations and human linguist performance. We evaluate using both automatic metrics and native speaker assessments of fluency and accuracy. Results demonstrate that parallel sentences remain the most effective data source, outperforming other methods in human evaluations and automatic metrics. While incorporating grammar improves over zero-shot translation, it fails as an effective standalone data source. Human evaluations reveal that LLMs achieve accuracy (meaning) more effectively than fluency (grammaticality). These findings suggest LLM translation evaluation benefits from multidimensional assessment beyond simple accuracy metrics, and that grammar alone, without parallel sentences, does not provide sufficient context for effective domain-specific translation.


Related papers