Fugu-MT Paper Translation (Abstract): "I'm fully who I am": Towards Centering Transgender and Non-Binary Voices to Measure Biases in Open Language Generation

Paper abstract: "I'm fully who I am": Towards Centering Transgender and Non-Binary Voices to Measure Biases in Open Language Generation

arxiv url: http://arxiv.org/abs/2305.09941v3
Date: Tue, 30 May 2023 05:21:34 GMT
Status: Translation complete
Last updated in system: 2023-06-01 00:29:57.038671
Title: "I'm fully who I am": Towards Centering Transgender and Non-Binary Voices to Measure Biases in Open Language Generation
Title (reference translation): "I'm fully who I am": Centering transgender and non-binary voices to measure biases in open language generation
Authors: Anaelia Ovalle, Palash Goyal, Jwala Dhamala, Zachary Jaggers, Kai-Wei Chang, Aram Galstyan, Richard Zemel, Rahul Gupta
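Abstract summary: Transgender and non-binary (TGNB) individuals disproportionately experience discrimination and exclusion from daily life. We assess how the social reality surrounding the marginalization experienced by TGNB persons contributes to and persists within Open Language Generation (OLG). We introduce the TANGO dataset, comprising template-based text curated from real-world text within a TGNB-oriented community.
Reference score (independently computed attention score): 69.25368160338043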
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Transgender and non-binary (TGNB) individuals disproportionately experience discrimination and exclusion from daily life. Given the recent popularity and adoption of language generation technologies, the potential to further marginalize this population only grows. Although a multitude of NLP fairness literature focuses on illuminating and addressing gender biases, assessing gender harms for TGNB identities requires understanding how such identities uniquely interact with societal gender norms and how they differ from gender binary-centric perspectives. Such measurement frameworks inherently require centering TGNB voices to help guide the alignment between gender-inclusive NLP and whom they are intended to serve. Towards this goal, we ground our work in the TGNB community and existing interdisciplinary literature to assess how the social reality surrounding experienced marginalization by TGNB persons contributes to and persists within Open Language Generation (OLG). By first understanding their marginalization stressors, we evaluate (1) misgendering and (2) harmful responses to gender disclosure. To do this, we introduce the TANGO dataset, comprising template-based text curated from real-world text within a TGNB-oriented community. We discover a dominance of binary gender norms within the models; LLMs least misgendered subjects in generated text when triggered by prompts whose subjects used binary pronouns. Meanwhile, misgendering was most prevalent when triggering generation with singular they and neopronouns. When prompted with gender disclosures, LLM text contained stigmatizing language and scored most toxic when triggered by TGNB gender disclosure. Our findings warrant further research on how TGNB harms manifest in LLMs and serve as a broader case study toward concretely grounding the design of gender-inclusive AI in community voices and interdisciplinary literature.
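Abstract (reference translation): Transgender and non-binary (TGNB) individuals disproportionately experience discrimination and exclusion from daily life. Given the recent spread and adoption of language generation technologies, the potential to further marginalize this population only grows. Although much of the NLP fairness literature focuses on illuminating and addressing gender biases, assessing gender harms for TGNB identities requires understanding how such identities uniquely interact with societal gender norms and how they differ from gender binary-centric perspectives. Such measurement frameworks inherently require centering TGNB voices to help guide the alignment between gender-inclusive NLP and the people it is intended to serve. Toward this goal, we ground our work in the TGNB community and existing interdisciplinary literature to assess how the social reality surrounding the marginalization experienced by TGNB persons contributes to and persists within Open Language Generation (OLG). By first understanding their marginalization stressors, we evaluate (1) misgendering and (2) harmful responses to gender disclosure. To do this, we introduce the TANGO dataset, which consists of template-based text curated from real-world text within a TGNB-oriented community. We find that binary gender norms dominate within the models: LLMs misgendered subjects in generated text least often when triggered by prompts whose subjects used binary pronouns. Meanwhile, misgendering was most prevalent when generation was triggered with singular they and neopronouns. When prompted with gender disclosures, LLM text contained stigmatizing language and scored most toxic when triggered by TGNB gender disclosure. Our findings warrant further research on how TGNB harms manifest in LLMs and serve as a broader case study toward concretely grounding the design of gender-inclusive AI in community voices and interdisciplinary literature.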
