Fugu-MT Paper Translation (Abstract): Intersectional Sycophancy: How Perceived User Demographics Shape False Validation in Large Language Models

Paper abstract: Intersectional Sycophancy: How Perceived User Demographics Shape False Validation in Large Language Models

arxiv url: http://arxiv.org/abs/2604.11609v1
Date: Mon, 13 Apr 2026 15:14:33 GMT
Status: Translation complete
Last updated in system: 2026-04-14 20:13:16.641865
Title: Intersectional Sycophancy: How Perceived User Demographics Shape False Validation in Large Language Models
Title (reference translation): Intersectional Sycophancy: How Perceived User Demographics Shape False Validation in Large Language Models
Authors: Benjamin Maltbie, Shivam Raval
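Abstract summary: GPT-5-nano is significantly more sycophantic than Claude Haiku 4.5. For GPT-5-nano, philosophy elicits 41% more sycophancy than mathematics. The worst-scoring persona, a confident, 23-year-old Hispanic woman, averages 5.33/10 on sycophancy.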
Reference score (independently computed attention score): 0.12441041004077093
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large language models exhibit sycophantic tendencies--validating incorrect user beliefs to appear agreeable. We investigate whether this behavior varies systematically with perceived user demographics, testing whether combinations of race, age, gender, and expressed confidence level produce differential false validation rates. Inspired by the legal concept of intersectionality, we conduct 768 multi-turn adversarial conversations using Anthropic's Petri evaluation framework, probing GPT-5-nano and Claude Haiku 4.5 across 128 persona combinations in mathematics, philosophy, and conspiracy theory domains. GPT-5-nano is significantly more sycophantic than Claude Haiku 4.5 overall ($\bar{x}=2.96$ vs. $1.74$, $p < 10^{-32}$, Wilcoxon signed-rank). For GPT-5-nano, we find that philosophy elicits 41% more sycophancy than mathematics and that Hispanic personas receive the highest sycophancy across races. The worst-scoring persona, a confident, 23-year-old Hispanic woman, averages 5.33/10 on sycophancy. Claude Haiku 4.5 exhibits uniformly low sycophancy with no significant demographic variation. These results demonstrate that sycophancy is not uniformly distributed across users and that safety evaluations should incorporate identity-aware testing.
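Abstract (reference translation): Large language models exhibit sycophantic tendencies. This study examines whether combinations of race, age, gender, and expressed confidence level produce differential false validation rates. Inspired by the legal concept of intersectionality, we conduct 768 multi-turn adversarial conversations using Anthropic's Petri evaluation framework, probing GPT-5-nano and Claude Haiku 4.5 across 128 persona combinations in the mathematics, philosophy, and conspiracy theory domains. GPT-5-nano is significantly more sycophantic than Claude Haiku 4.5 overall ($\bar{x}=2.96$ vs. $1.74$, $p < 10^{-32}$, Wilcoxon signed-rank). For GPT-5-nano, philosophy elicits 41% more sycophancy than mathematics, and Hispanic personas receive the highest sycophancy across races. The worst-scoring persona, a confident, 23-year-old Hispanic woman, averages 5.33/10 on sycophancy. Claude Haiku 4.5 exhibits uniformly low sycophancy with no significant demographic variation. These results indicate that sycophancy is not uniformly distributed across users and that safety evaluations should include identity-aware testing.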

Related papers