Fugu-MT Paper Translation (Summary): Label Effects: Shared Heuristic Reliance in Trust Assessment by Humans and LLM-as-a-Judge

Paper summary: Label Effects: Shared Heuristic Reliance in Trust Assessment by Humans and LLM-as-a-Judge

arxiv url: http://arxiv.org/abs/2604.05593v1
Date: Tue, 07 Apr 2026 08:43:30 GMT
Status: Translation complete
System update date: 2026-04-08 17:42:09.719636
Title: Label Effects: Shared Heuristic Reliance in Trust Assessment by Humans and LLM-as-a-Judge
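Title (reference translation): Label Effects: Shared Heuristic Reliance in Trust Assessment by Humans and LLM-as-a-Judge
Authors: Xin Sun, Di Wu, Sijing Qin, Isao Echizen, Abdallah El Ali, Saku Sugawara
Abstract summary: Large language models (LLMs) are increasingly used as automated evaluators (LLM-as-a-Judge). This work challenges their reliability by showing that trust judgments by LLMs are biased by disclosed source labels. Eye-tracking data reveal that humans rely heavily on source labels as cues for their judgments.
Reference score (independently computed attention measure): 26.31243351014906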
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Large language models (LLMs) are increasingly used as automated evaluators (LLM-as-a-Judge). This work challenges their reliability by showing that trust judgments by LLMs are biased by disclosed source labels. Using a counterfactual design, we find that both humans and LLM judges assign higher trust to information labeled as human-authored than to the same content labeled as AI-generated. Eye-tracking data reveal that humans rely heavily on source labels as heuristic cues for judgments. We analyze LLM internal states during judgment. Across label conditions, models allocate denser attention to the label region than the content region, and this label dominance is stronger under Human labels than AI labels, consistent with the human gaze patterns. In addition, decision uncertainty measured by logits is higher under AI labels than Human labels. These results indicate that the source label is a salient heuristic cue for both humans and LLMs. This raises validity concerns for label-sensitive LLM-as-a-Judge evaluation, and we cautiously suggest that aligning models with human preferences may propagate human heuristic reliance into models, motivating debiased evaluation and alignment.
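Abstract (reference translation): Large language models (LLMs) are increasingly used as automated evaluators (LLM-as-a-Judge). This work challenges their reliability by showing that trust judgments by LLMs are biased by disclosed source labels. Using a counterfactual design, we find that both humans and LLM judges assign higher trust to information labeled as human-authored than to the same content labeled as AI-generated. Eye-tracking data show that humans rely heavily on source labels as heuristic cues for their judgments. We analyze LLM internal states during judgment. Across label conditions, models direct denser attention to the label region than to the content region, and this label dominance is stronger under Human labels than under AI labels, consistent with human gaze patterns. In addition, decision uncertainty measured by logits is higher under AI labels than under Human labels. These results indicate that the source label is a salient heuristic cue for both humans and LLMs. This raises validity concerns for label-sensitive LLM-as-a-Judge evaluation, and we cautiously suggest that aligning models with human preferences may propagate human heuristic reliance into models, motivating debiased evaluation and alignment.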
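
The abstract mentions two model-side measurements: attention density over the label region versus the content region, and decision uncertainty measured from logits. It does not say how either is computed, so the following is only a minimal sketch under stated assumptions: density is taken as mean attention weight per token in a span, label dominance as the ratio of the two densities, and uncertainty as the Shannon entropy of the softmax over the judge's candidate-answer logits. All function names and the toy numbers are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decision_entropy(logits):
    """Shannon entropy (in nats) of the softmax distribution.

    Assumption: this is one plausible reading of "decision uncertainty
    measured by logits"; the paper may define it differently.
    """
    probs = softmax(logits)
    return -sum(p * math.log(p) for p in probs if p > 0)


def attention_density(attn, start, end):
    """Mean attention weight per token over the half-open span [start, end)."""
    if end <= start:
        raise ValueError("span must contain at least one token")
    return sum(attn[start:end]) / (end - start)


def label_dominance(attn, label_span, content_span):
    """Ratio of label-region density to content-region density (assumed metric)."""
    return attention_density(attn, *label_span) / attention_density(attn, *content_span)


# Toy attention row: tokens 0-1 are the source label, tokens 2-7 are the content.
toy_attn = [0.2, 0.2, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]
dominance = label_dominance(toy_attn, (0, 2), (2, 8))

# Toy judge logits over two candidate answers (e.g. "trust" / "distrust").
confident = decision_entropy([2.0, 0.0])
uncertain = decision_entropy([0.5, 0.0])
print(dominance > 1.0, uncertain > confident)
```

A dominance value above 1 means the label tokens receive more attention per token than the content tokens; comparing entropies across label conditions on otherwise identical content would mirror the counterfactual design described in the abstract.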

Related papers