Fugu-MT Paper Translation (Summary): Outraged AI: Large language models prioritise emotion over cost in fairness enforcement

Paper summary: Outraged AI: Large language models prioritise emotion over cost in fairness enforcement

arxiv url: http://arxiv.org/abs/2510.17880v1
Date: Fri, 17 Oct 2025 08:41:36 GMT
Status: Translation complete
Last updated in system: 2025-10-25 03:08:12.252119
Title: Outraged AI: Large language models prioritise emotion over cost in fairness enforcement
Title (reference translation): Outraged AI: Large language models prioritise emotion over cost in fairness enforcement
Authors: Hao Liu, Yiqing Dai, Haotian Tan, Yu Lei, Yujia Zhou, Zhen Wu,
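Abstract summary: We show that large language models (LLMs) use emotion to guide punishment. Unfairness elicited stronger negative emotion, which led to more punishment. Future models should integrate emotion with context-sensitive reasoning to achieve human-like emotional intelligence.
Reference score (site-computed attention score): 13.51400164704227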
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Emotions guide human decisions, but whether large language models (LLMs) use emotion similarly remains unknown. We tested this using altruistic third-party punishment, where an observer incurs a personal cost to enforce fairness, a hallmark of human morality and often driven by negative emotion. In a large-scale comparison of 4,068 LLM agents with 1,159 adults across 796,100 decisions, LLMs used emotion to guide punishment, sometimes even more strongly than humans did: Unfairness elicited stronger negative emotion that led to more punishment; punishing unfairness produced more positive emotion than accepting; and critically, prompting self-reports of emotion causally increased punishment. However, mechanisms diverged: LLMs prioritized emotion over cost, enforcing norms in an almost all-or-none manner with reduced cost sensitivity, whereas humans balanced fairness and cost. Notably, reasoning models (o3-mini, DeepSeek-R1) were more cost-sensitive and closer to human behavior than foundation models (GPT-3.5, DeepSeek-V3), yet remained heavily emotion-driven. These findings provide the first causal evidence of emotion-guided moral decisions in LLMs and reveal deficits in cost calibration and nuanced fairness judgements, reminiscent of early-stage human responses. We propose that LLMs progress along a trajectory paralleling human development; future models should integrate emotion with context-sensitive reasoning to achieve human-like emotional intelligence.
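Abstract (reference translation): Emotions guide human decisions, but whether large language models (LLMs) use emotion in the same way is unknown. We tested this using altruistic third-party punishment, in which an observer bears a personal cost to enforce fairness, a hallmark of human morality that is often driven by negative emotion. In a large-scale comparison of 4,068 LLM agents with 1,159 adults across 796,100 decisions, LLMs used emotion to guide punishment. However, LLMs prioritized emotion over cost and showed reduced cost sensitivity, whereas humans balanced fairness and cost. Notably, reasoning models (o3-mini, DeepSeek-R1) were more cost-sensitive and closer to human behavior than foundation models (GPT-3.5, DeepSeek-V3), yet remained heavily emotion-driven. These findings provide the first causal evidence of emotion-guided moral decisions in LLMs and reveal deficits in cost calibration and nuanced fairness judgements, reminiscent of early-stage human responses. Future models should integrate emotion with context-sensitive reasoning to achieve human-like emotional intelligence.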

Related papers

Emotionally Charged, Logically Blurred: AI-driven Emotional Framing Impairs Human Fallacy Detection [25.196971926947906]
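This paper presents the first computational study of how emotional framing interacts with fallacies and persuasiveness. We use large language models (LLMs) to systematically vary emotional appeals in fallacious arguments. Our work has implications for AI-driven emotional manipulation in the context of fallacious arguments.
Paper reference translation (metadata) (2025-10-09T14:57:37Z)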
Large Language Models are Highly Aligned with Human Ratings of Emotional Stimuli [0.62914438169038]
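Emotions strongly influence human behavior and cognition in both commonplace and high-stress tasks. Discussion of large language models should be informed by an understanding of how they evaluate emotionally loaded stimuli and situations. How well model behavior aligns with human behavior in these cases can inform the effectiveness of LLMs for particular roles and interactions.
Paper reference translation (metadata) (2025-08-19T19:22:00Z)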
UDDETTS: Unifying Discrete and Dimensional Emotions for Controllable Emotional Text-to-Speech [61.989360995528905]
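We propose UDDETTS, a universal framework that unifies discrete and dimensional emotions for controllable emotional TTS. The model introduces an interpretable Arousal-Dominance-Valence (ADV) space for dimensional emotion description and supports emotion control driven by either discrete emotion labels or nonlinearly quantified ADV values. Experiments show that UDDETTS achieves linear emotion control along the three dimensions and exhibits strong end-to-end emotional speech synthesis capability.
Paper reference translation (metadata) (2025-05-15T12:57:19Z)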
Human-like Affective Cognition in Foundation Models [28.631313772625578]
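We propose an evaluation framework for testing affective cognition in foundation models. We generate 1,280 diverse scenarios exploring the relationships among appraisals, emotions, expressions, and outcomes. Our results show that foundation models tend to agree with human intuitions.
Paper reference translation (metadata) (2024-09-18T06:42:13Z)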
GPT-4 Emulates Average-Human Emotional Cognition from a Third-Person Perspective [1.642094639107215]
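We first carefully construct emotion-evoking stimuli designed to find patterns of brain neural activity. The results suggest that GPT-4 is especially accurate. GPT-4's interpretations align more closely with human judgments about the emotions of others than with self-assessments.
Paper reference translation (metadata) (2024-08-11T01:22:09Z)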
The Good, The Bad, and Why: Unveiling Emotions in Generative AI [73.94035652867618]
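We show that EmotionPrompt can improve the performance of AI models, while EmotionAttack can hinder it. According to EmotionDecode, AI models can comprehend emotional stimuli through a mechanism resembling that of dopamine in the human brain.
Paper reference translation (metadata) (2023-12-18T11:19:45Z)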
Language Models (Mostly) Do Not Consider Emotion Triggers When Predicting Emotion [87.18073195745914]
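We examine how human emotions correlate with features deemed salient in emotion prediction. Using EmoTrigger, we evaluate the ability of large language models to identify emotion triggers. Our analysis finds that emotion triggers are largely not salient features for emotion prediction models, and that there is a complex interplay between various features and the task of emotion detection.
Paper reference translation (metadata) (2023-11-16T06:20:13Z)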
Socratis: Are large multimodal models emotionally aware? [63.912414283486555]
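Existing emotion prediction benchmarks do not account for the diversity of emotions that images and text can evoke in people for various reasons. We propose Socratis, a societal reactions benchmark, in which each image-caption (IC) pair is annotated with multiple emotions and the reasons for feeling them. We benchmark the ability of state-of-the-art multimodal large language models to generate the reasons for feeling a particular emotion given an IC pair.
Paper reference translation (metadata) (2023-08-31T13:59:35Z)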

The related papers list is generated automatically from the titles and abstracts of papers on this site.

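This page shows information on the specified paper.
The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any consequences arising from its use.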