Fugu-MT Paper Translation (Summary): Measuring the Effect of Disfluency in Multilingual Knowledge Probing Benchmarks

Paper summary: Measuring the Effect of Disfluency in Multilingual Knowledge Probing Benchmarks

arxiv url: http://arxiv.org/abs/2510.15115v1
Date: Thu, 16 Oct 2025 20:16:56 GMT
Status: Translation complete
Last updated in system: 2025-10-20 20:17:34.379922
Title: Measuring the Effect of Disfluency in Multilingual Knowledge Probing Benchmarks
Title (reference translation): Measuring the Effect of Disfluency in Multilingual Knowledge Probing Benchmarks
Authors: Kirill Semenov, Rico Sennrich
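Abstract summary: We compare knowledge retrieval scores between the initial (templated) MLAMA dataset and its sentence-level translations made by Google Translate and ChatGPT. We observe a significant increase in knowledge retrieval scores and provide a qualitative analysis of possible reasons behind it. We also analyze 5 more languages from different families and see similar patterns.
Reference score (attention score computed independently by this site): 27.561894897347376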
License: http://creativecommons.org/licenses/by/4.0/
Abstract: For multilingual factual knowledge assessment of LLMs, benchmarks such as MLAMA use template translations that do not take into account the grammatical and semantic information of the named entities inserted in the sentence. This leads to numerous instances of ungrammaticality or wrong wording of the final prompts, which complicates the interpretation of scores, especially for languages that have a rich morphological inventory. In this work, we sample 4 Slavic languages from the MLAMA dataset and compare the knowledge retrieval scores between the initial (templated) MLAMA dataset and its sentence-level translations made by Google Translate and ChatGPT. We observe a significant increase in knowledge retrieval scores, and provide a qualitative analysis for possible reasons behind it. We also make an additional analysis of 5 more languages from different families and see similar patterns. Therefore, we encourage the community to control the grammaticality of highly multilingual datasets for higher and more interpretable results, which is well approximated by whole sentence translation with neural MT or LLM systems. The dataset and all related code is published at the Github repository: https://github.com/ZurichNLP/Fluent-mLAMA.
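Abstract (reference translation): For multilingual factual knowledge assessment of LLMs, benchmarks such as MLAMA use template translations that do not take into account the grammatical and semantic information of the named entities inserted into the sentence. This produces many ungrammatical or wrongly worded final prompts, which complicates the interpretation of scores, especially for languages with a rich morphological inventory. In this work, we sample 4 Slavic languages from the MLAMA dataset and compare knowledge retrieval scores between the initial (templated) MLAMA dataset and its sentence-level translations made by Google Translate and ChatGPT. We observe a significant increase in knowledge retrieval scores and provide a qualitative analysis of possible reasons behind it. We also analyze 5 more languages from different families and see similar patterns. We therefore encourage the community to control the grammaticality of highly multilingual datasets in order to obtain higher and more interpretable results. The dataset and all related code are published in the GitHub repository.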
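The templating problem described in the abstract can be illustrated with a minimal sketch. The Russian template, the entity pair, and the helper `fill_template` below are hypothetical and are not taken from the MLAMA data or the authors' code; they only show how inserting entity names verbatim into a fixed template can yield an ungrammatical prompt in a morphologically rich language.

```python
def fill_template(template: str, subject: str, obj: str) -> str:
    """Insert entity strings into a template verbatim, with no inflection."""
    return template.replace("[X]", subject).replace("[Y]", obj)


# Hypothetical "born in" template; the verb is fixed in the masculine form.
TEMPLATE = "[X] родился в [Y]."

# Verbatim insertion: masculine verb with a feminine subject, and the
# city name left in the nominative instead of the prepositional case.
templated = fill_template(TEMPLATE, "Мария Кюри", "Варшава")

# Hand-written fluent counterpart: feminine verb, prepositional case.
fluent = "Мария Кюри родилась в Варшаве."

print(templated)
print(fluent)
```

A whole-sentence translation step, as the abstract proposes, would replace the templated string with a fluent one before the model is probed.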

Related papers

Ready to Translate, Not to Represent? Bias and Performance Gaps in Multilingual LLMs Across Language Families and Domains [6.357124887141297]
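Large language models (LLMs) have redefined machine translation (MT). LLMs often show uneven performance across language families and specialized domains. We introduce Translation Tangles, a unified framework and dataset for evaluating the translation quality and fairness of open-source LLMs.
Paper reference translation (metadata) (2025-10-09T07:28:30Z)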
Testing the Limits of Machine Translation from One Book [0.0]
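Current state-of-the-art models have shown the ability to leverage in-context learning to translate into previously unseen language contexts. We focus on Kanuri, a language with minimal digital resources despite its large number of speakers.
Paper reference translation (metadata) (2025-08-08T19:27:44Z)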
LexMatcher: Dictionary-centric Data Collection for LLM-based Machine Translation [67.24113079928668]
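We propose LexMatcher, a data curation method driven by the coverage of senses found in bilingual dictionaries. Our method outperforms established baselines on the WMT2022 test sets.
Paper reference translation (metadata) (2024-06-03T15:30:36Z)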
The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants [80.4837840962273]
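We introduce Belebele, a dataset spanning 122 language variants. The dataset enables the evaluation of text models in high-, medium-, and low-resource languages.
Paper reference translation (metadata) (2023-08-31T17:43:08Z)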
Leveraging Language Identification to Enhance Code-Mixed Text Classification [0.7340017786387767]
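Existing deep learning models do not exploit the implicit language information in code-mixed text. This study aims to improve the performance of BERT models on low-resource code-mixed Hindi-English datasets.
Paper reference translation (metadata) (2023-06-08T06:43:10Z)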
Adapters for Enhanced Modeling of Multilingual Knowledge and Text [54.02078328453149]
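Language models have been extended to multilingual language models (MLLMs). Knowledge graphs contain facts in an explicit triple format, which requires careful curation and is available only in a few high-resource languages. We propose enhancing MLLMs with knowledge from multilingual knowledge graphs (MLKGs) to tackle language and knowledge graph tasks across many languages.
Paper reference translation (metadata) (2022-10-24T21:33:42Z)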
Does Transliteration Help Multilingual Language Modeling? [0.0]
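We empirically measure the effect of transliteration on multilingual language models. We focus on the Indic languages, which have the highest script diversity in the world. We find that transliteration benefits low-resource languages without negatively affecting comparatively high-resource languages.
Paper reference translation (metadata) (2022-01-29T05:48:42Z)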
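Mixed Attention Transformer for Leveraging Word-Level Knowledge to Neural Cross-Lingual Information Retrieval [15.902630454568811]
We propose the Mixed Attention Transformer (MAT), which incorporates external word-level knowledge such as dictionaries and translation tables. By encoding translation knowledge into an attention matrix, a model with MAT can focus on mutually translated words in the input sequence.
Paper reference translation (metadata) (2021-09-07T00:33:14Z)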

The related papers list is generated automatically from the titles and abstracts of papers on this site.
