Fugu-MT paper translation (abstract): What do the metrics mean? A critical analysis of the use of Automated Evaluation Metrics in Interpreting

Paper overview: What do the metrics mean? A critical analysis of the use of Automated Evaluation Metrics in Interpreting

arxiv url: http://arxiv.org/abs/2601.05864v1
Date: Fri, 09 Jan 2026 15:39:28 GMT
Status: Translation complete
Last updated in system: 2026-01-12 17:41:50.011217
Title: What do the metrics mean? A critical analysis of the use of Automated Evaluation Metrics in Interpreting
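Title (reference translation): What do the metrics mean? A critical analysis of the use of automated evaluation metrics in interpreting
Authors: Jonathan Downie, Joss Moorkens
Abstract summary: There is now a high demand for ways to quickly and efficiently measure the quality of the interpreting delivered. This article examines recently proposed quality measurement methods and discusses their suitability for evaluating the quality of authentic interpreting practice.
Reference score (attention score computed independently by this site): 0.0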
License: http://creativecommons.org/licenses/by/4.0/
Abstract: With the growth of interpreting technologies, from remote interpreting and Computer-Aided Interpreting to automated speech translation and interpreting avatars, there is now a high demand for ways to quickly and efficiently measure the quality of any interpreting delivered. A range of approaches to fulfil the need for quick and efficient quality measurement have been proposed, each involving some measure of automation. This article examines these recently-proposed quality measurement methods and will discuss their suitability for measuring the quality of authentic interpreting practice, whether delivered by humans or machines, concluding that automatic metrics as currently proposed cannot take into account the communicative context and thus are not viable measures of the quality of any interpreting provision when used on their own. Across all attempts to measure or even categorise quality in Interpreting Studies, the contexts in which interpreting takes place have become fundamental to the final analysis.
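Abstract (reference translation): With the development of interpreting technologies, from remote interpreting and computer-aided interpreting to automated speech translation and interpreting avatars, demand is growing for ways to quickly and efficiently measure the quality of the interpreting delivered. Various approaches have been proposed to meet the need for quick and efficient quality measurement. This article examines these recently proposed quality measurement methods and discusses their suitability for evaluating the quality of authentic interpreting practice, whether delivered by humans or machines. Across all attempts to measure or categorise quality in Interpreting Studies, the contexts in which interpreting takes place have become fundamental to the final analysis.

Related papers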

Do LLMs Understand Your Translations? Evaluating Paragraph-level MT with Question Answering [68.3400058037817]
This paper introduces TREQA (Translation Evaluation via Question-Answering). We show that TREQA outperforms state-of-the-art neural and LLM-based metrics at ranking alternative paragraph-level translations.
Paper reference translation (metadata) (2025-04-10T09:24:54Z)
Contextual Metric Meta-Evaluation by Measuring Local Metric Accuracy [52.261323452286554]
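We propose a contextual meta-evaluation method that compares the local metric accuracy of evaluation metrics. Across translation, speech recognition, and ranking tasks, we show that local metric accuracy varies in both absolute value and relative effectiveness.
Paper reference translation (metadata) (2025-03-25T16:42:25Z)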
A Measure of the System Dependence of Automated Metrics [9.594167080604207]
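We argue that it is equally important to ensure that metrics treat all systems fairly and consistently. This paper proposes a method for evaluating this aspect.
Paper reference translation (metadata) (2024-12-04T09:21:46Z)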
Convergences and Divergences between Automatic Assessment and Human Evaluation: Insights from Comparing ChatGPT-Generated Translation and Neural Machine Translation [1.6982207802596105]
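This study examines the convergences and divergences between automated metrics and human evaluation. Automatic assessment uses four automated metrics, while human evaluation incorporates DQF-MQM error types and six rubrics. The results show that human judgment is essential for evaluating the performance of advanced translation tools.
Paper reference translation (metadata) (2024-01-10T14:20:33Z)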
The Glass Ceiling of Automatic Evaluation in Natural Language Generation [60.59732704936083]
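We take a step back and analyze recent progress by comparing the body of existing automatic metrics with human metrics. Automatic metrics, old and new, are much more similar to each other than to humans.
Paper reference translation (metadata) (2022-08-31T01:13:46Z)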
Measuring Uncertainty in Translation Quality Evaluation (TQE) [62.997667081978825]
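This work carries out a motivated study of how to estimate confidence intervals accurately, depending on the sample size of the translated text. We apply Bernoulli Statistical Distribution Modelling (BSDM) and Monte Carlo Sampling Analysis (MCSA).
Paper reference translation (metadata) (2021-11-15T12:09:08Z)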
Translation Quality Assessment: A Brief Survey on Manual and Automatic Methods [9.210509295803243]
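We present a high-level, concise survey of translation quality assessment (TQA) methods, covering both manual judgement criteria and automated evaluation metrics. We hope this work will be an asset to both translation model researchers and quality assessment researchers.
Paper reference translation (metadata) (2021-05-05T18:28:10Z)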
GO FIGURE: A Meta Evaluation of Factuality in Summarization [131.1087461486504]
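We introduce GO FIGURE, a meta-evaluation framework for evaluating factuality metrics. Our benchmark analysis of ten factuality metrics reveals that the framework provides a robust and efficient evaluation. It also reveals that QA metrics generally improve over standard metrics that measure factuality across domains, but that their performance depends heavily on how the questions are generated.
Paper reference translation (metadata) (2020-10-24T08:30:20Z)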
Tangled up in BLEU: Reevaluating the Evaluation of Automatic Machine Translation Evaluation Metrics [64.88815792555451]
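We show that methods for judging metrics are highly sensitive to the translations used for assessment. We develop a method for thresholding performance improvement under an automatic metric against human judgements.
Paper reference translation (metadata) (2020-06-11T09:12:53Z)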

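The related papers list is generated automatically from the titles and abstracts of papers on this site.

This page shows information on the specified paper.
The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any results arising from its use (including all information and translations).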