Fugu-MT paper translation (abstract): AI evaluation may bias perceptions: The importance of context in interpreting academic writing

Paper overview: AI evaluation may bias perceptions: The importance of context in interpreting academic writing

arxiv url: http://arxiv.org/abs/2605.26662v1
Date: Tue, 26 May 2026 07:47:58 GMT
Status: Translation complete
Last updated in system: 2026-05-27 17:51:41.738877
Title: AI evaluation may bias perceptions: The importance of context in interpreting academic writing
Authors: Shang Wu, Randol Yao
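Abstract summary: This paper examines how estimates of AI use in scientific writing can be biased when evaluation methods ignore contextual differences across countries and fields. A pooled benchmark may confound pre-existing stylistic variation with AI-generated text, producing substantial distortions across country-field groups even in pre-LLM publications. Applying these methods to publications in 2025 reveals that the pooled benchmark systematically overestimates AI use in certain countries and fields while underestimating it in others.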
Reference score (independently computed attention score): 2.5412649391082502
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This paper examines how estimates of AI use in scientific writing can be biased when evaluation methods ignore contextual differences across countries and fields. Using large-scale data on journal publications from Dimensions, we construct AI-likeness benchmarks based on differences between human-written and LLM-rephrased abstracts. We show that a pooled benchmark may confound pre-existing stylistic variation with AI-generated text, producing substantial distortions across country-field groups even in pre-LLM publications. In contrast, country-field-specific benchmarks attenuate such distortions and provide a more credible baseline for comparison. Applying these methods to publications in 2025 reveals that the pooled benchmark systematically overestimates AI use in certain countries and fields while underestimating it in others. These findings highlight the importance of context-aware measurement for accurate and equitable evaluation of AI use in science.
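
The contrast between a pooled benchmark and country-field-specific benchmarks can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the paper: the single numeric style feature per abstract, the invented values, the group names `group_a` and `group_b`, and the helper functions `benchmark` and `ai_likeness`, which simply place a text on a scale from the human baseline (0) to the LLM-rephrased baseline (1). The paper's actual benchmark construction may differ.

```python
from statistics import mean

# Hypothetical toy data: one stylistic feature value per abstract.
# All numbers are invented for illustration; none come from the paper.
human = {  # pre-LLM, human-written abstracts
    "group_a": [0.10, 0.12, 0.11, 0.09],
    "group_b": [0.20, 0.22, 0.21, 0.19],
}
rephrased = {  # the same abstracts after LLM rephrasing (assumed values)
    "group_a": [0.30, 0.32, 0.31, 0.29],
    "group_b": [0.40, 0.42, 0.41, 0.39],
}


def benchmark(human_scores, llm_scores):
    """Return (human baseline, LLM baseline) as mean feature values."""
    return mean(human_scores), mean(llm_scores)


def ai_likeness(score, bench):
    """Position of `score` between the human (0) and LLM (1) baselines."""
    human_base, llm_base = bench
    return (score - human_base) / (llm_base - human_base)


# Pooled benchmark: one baseline pair computed over all groups together.
pooled = benchmark(
    [s for scores in human.values() for s in scores],
    [s for scores in rephrased.values() for s in scores],
)

# Group-specific benchmarks: one baseline pair per group.
specific = {g: benchmark(human[g], rephrased[g]) for g in human}


def group_estimates(bench_for):
    """Mean AI-likeness of each group's human-written texts."""
    return {
        g: mean(ai_likeness(s, bench_for(g)) for s in human[g])
        for g in human
    }


pooled_est = group_estimates(lambda g: pooled)
specific_est = group_estimates(lambda g: specific[g])

print("pooled:  ", pooled_est)
print("specific:", specific_est)
```

In this toy data every scored text is human-written, so an undistorted estimate would be about 0 for both groups. The group-specific benchmarks give roughly 0, while the pooled benchmark gives roughly -0.25 for `group_a` and +0.25 for `group_b`, purely because the two groups differ in baseline style.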

Related papers