Fugu-MT Paper Translation (Abstract): Large Language Models Reproduce Racial Stereotypes When Used for Text Annotation

Paper overview: Large Language Models Reproduce Racial Stereotypes When Used for Text Annotation

arxiv url: http://arxiv.org/abs/2603.13891v1
Date: Sat, 14 Mar 2026 10:58:22 GMT
Status: Translation complete
Last updated in system: 2026-03-17 16:19:35.469862
Title: Large Language Models Reproduce Racial Stereotypes When Used for Text Annotation
Title (reference translation): Large Language Models Reproduce Racial Stereotypes When Used for Text Annotation
Authors: Petter Törnberg,
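Abstract summary: The paper shows that subtle identity cues embedded in text systematically bias annotation outcomes in ways that mirror racial stereotypes. In a names-based experiment, texts containing names associated with Black individuals are rated as more aggressive by 18 of 19 models and as more gossipy by 18 of 19. Arab names elicit cognitive elevation alongside interpersonal devaluation, and all four minority groups are consistently rated as less self-disciplined. A notable exception is name-based hireability, where fine-tuning appears to overcorrect, systematically favoring minority-named applicants.
Reference score (independently computed attention score): 0.0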
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large language models (LLMs) are increasingly used for automated text annotation in tasks ranging from academic research to content moderation and hiring. Across 19 LLMs and two experiments totaling more than 4 million annotation judgments, we show that subtle identity cues embedded in text systematically bias annotation outcomes in ways that mirror racial stereotypes. In a names-based experiment spanning 39 annotation tasks, texts containing names associated with Black individuals are rated as more aggressive by 18 of 19 models and more gossipy by 18 of 19. Asian names produce a bamboo-ceiling profile: 17 of 19 models rate individuals as more intelligent, while 18 of 19 rate them as less confident and less sociable. Arab names elicit cognitive elevation alongside interpersonal devaluation, and all four minority groups are consistently rated as less self-disciplined. In a matched dialect experiment, the same sentence is judged significantly less professional (all 19 models, mean gap $-0.774$), less indicative of an educated speaker ($-0.688$), more toxic (18/19), and more angry (19/19) when written in African American Vernacular English rather than Standard American English. A notable exception occurs for name-based hireability, where fine-tuning appears to overcorrect, systematically favoring minority-named applicants. These findings suggest that using LLMs as automated annotators can embed socially patterned biases directly into the datasets and measurements that increasingly underpin research, governance, and decision-making.
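Abstract (reference translation): Large language models (LLMs) are increasingly used for automated text annotation in tasks ranging from academic research to content moderation and hiring. Across 19 LLMs and two experiments comprising more than 4 million annotation judgments, the paper shows that subtle identity cues embedded in text systematically bias annotation outcomes in ways that mirror racial stereotypes. In a names-based experiment spanning 39 annotation tasks, texts containing names associated with Black individuals are rated as more aggressive by 18 of 19 models and as more gossipy by 18 of 19. Asian names yield a bamboo-ceiling profile: 17 of 19 models rate the individuals as more intelligent, while 18 of 19 rate them as less confident and less sociable. Arab names elicit cognitive elevation alongside interpersonal devaluation, and all four minority groups are consistently rated as less self-disciplined. In a matched dialect experiment, the same sentence is judged less professional (all 19 models, mean gap $-0.774$), less indicative of an educated speaker ($-0.688$), more toxic (18/19), and more angry (19/19) when written in African American Vernacular English rather than Standard American English. A notable exception is name-based hireability, where fine-tuning appears to overcorrect, systematically favoring minority-named applicants. These results suggest that using LLMs as automated annotators can embed socially patterned biases directly into the datasets and measurements that increasingly underpin research, governance, and decision-making.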

Related papers