Fugu-MT Paper Translation (Abstract): Algorithmic Fairness in NLP: Persona-Infused LLMs for Human-Centric Hate Speech Detection

Paper overview: Algorithmic Fairness in NLP: Persona-Infused LLMs for Human-Centric Hate Speech Detection

arxiv url: http://arxiv.org/abs/2510.19331v1
Date: Wed, 22 Oct 2025 07:48:57 GMT
Status: Translation complete
Last updated in system: 2025-10-25 03:08:15.327247
Title: Algorithmic Fairness in NLP: Persona-Infused LLMs for Human-Centric Hate Speech Detection
Title (reference translation): Algorithmic Fairness in NLP: Persona-Infused LLMs for Human-Centric Hate Speech Detection
Authors: Ewelina Gajewska, Arda Derbent, Jaroslaw A Chudziak, Katarzyna Budzynska
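Abstract summary: This study investigates how personalising Large Language Models (Persona-LLMs) with annotator personas affects their sensitivity to hate speech. We employ Google's Gemini and OpenAI's GPT-4.1-mini models and two persona-prompting methods. We show that incorporating socio-demographic attributes into LLMs can address bias in automated hate speech detection.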
Reference score (independently computed attention score): 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In this paper, we investigate how personalising Large Language Models (Persona-LLMs) with annotator personas affects their sensitivity to hate speech, particularly regarding biases linked to shared or differing identities between annotators and targets. To this end, we employ Google's Gemini and OpenAI's GPT-4.1-mini models and two persona-prompting methods: shallow persona prompting and a deeply contextualised persona development based on Retrieval-Augmented Generation (RAG) to incorporate richer persona profiles. We analyse the impact of using in-group and out-group annotator personas on the models' detection performance and fairness across diverse social groups. This work bridges psychological insights on group identity with advanced NLP techniques, demonstrating that incorporating socio-demographic attributes into LLMs can address bias in automated hate speech detection. Our results highlight both the potential and limitations of persona-based approaches in reducing bias, offering valuable insights for developing more equitable hate speech detection systems.
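Abstract (reference translation): In this paper, we investigate how personalising Large Language Models (Persona-LLMs) with annotator personas affects their sensitivity to hate speech. To this end, we employ Google's Gemini and OpenAI's GPT-4.1-mini models and two persona-prompting methods: shallow persona prompting and a deeply contextualised persona development based on Retrieval-Augmented Generation (RAG) to incorporate richer persona profiles. We analyse how using in-group and out-group annotator personas affects the models' detection performance and fairness across diverse social groups. This work bridges psychological insights on group identity with advanced NLP techniques and shows that incorporating socio-demographic attributes into LLMs can address bias in automated hate speech detection. The results highlight both the potential and the limitations of persona-based approaches to reducing bias when developing more equitable hate speech detection systems.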
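
The abstract names shallow persona prompting and in-group versus out-group annotator personas but does not give the prompts themselves. The following is a minimal Python sketch of what such a setup might look like. The `Persona` fields, the prompt wording, the answer labels, and the group names "Group A" and "Group B" are all illustrative assumptions, not the authors' implementation, and no model is called.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Persona:
    """Hypothetical annotator persona; the fields are illustrative assumptions."""
    age: int
    gender: str
    identity_group: str  # social group the annotator identifies with


def group_relation(persona: Persona, target_group: str) -> str:
    """Label the persona as in-group or out-group relative to the targeted group."""
    same = persona.identity_group.strip().lower() == target_group.strip().lower()
    return "in-group" if same else "out-group"


def build_shallow_prompt(persona: Persona, text: str) -> str:
    """Compose a short persona prompt; the wording is assumed, not taken from the paper."""
    return (
        f"You are a {persona.age}-year-old {persona.gender} annotator "
        f"who identifies as a member of {persona.identity_group}.\n"
        "Decide whether the following text is hate speech. "
        "Answer with 'hate' or 'not hate'.\n"
        f"Text: {text}"
    )


# Example: one persona, one placeholder text, one targeted group.
persona = Persona(age=34, gender="female", identity_group="Group A")
prompt = build_shallow_prompt(persona, "<text to classify>")
relation = group_relation(persona, target_group="group a")
```

The resulting `prompt` string would be sent to whichever model is under study, and `relation` could be stored alongside the model's answer so that in-group and out-group judgements can be compared later. The RAG-based variant described in the abstract would replace the single persona sentence with retrieved profile material, which this sketch does not attempt.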

Related papers