Fugu-MT Paper Translation (Abstract): "I'm fully who I am": Towards Centering Transgender and Non-Binary Voices to Measure Biases in Open Language Generation

Paper overview: "I'm fully who I am": Towards Centering Transgender and Non-Binary Voices to Measure Biases in Open Language Generation

arxiv url: http://arxiv.org/abs/2305.09941v4
Date: Thu, 1 Jun 2023 20:42:13 GMT
Status: translation complete
Last updated in system: 2023-06-05 19:21:55.535732
Title: "I'm fully who I am": Towards Centering Transgender and Non-Binary Voices to Measure Biases in Open Language Generation
Title (reference translation): "I'm fully who I am": Centering Transgender and Non-Binary Voices to Measure Bias in Open Language Generation
Authors: Anaelia Ovalle, Palash Goyal, Jwala Dhamala, Zachary Jaggers, Kai-Wei Chang, Aram Galstyan, Richard Zemel, Rahul Gupta
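Abstract summary: Transgender and non-binary (TGNB) individuals disproportionately experience discrimination and exclusion from daily life. We assess how the social reality surrounding experienced marginalization of TGNB persons contributes to and persists within open language generation. We introduce TANGO, a dataset of template-based real-world text curated from a TGNB-oriented community.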
Reference score (independently computed attention score): 69.25368160338043
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Transgender and non-binary (TGNB) individuals disproportionately experience discrimination and exclusion from daily life. Given the recent popularity and adoption of language generation technologies, the potential to further marginalize this population only grows. Although a multitude of NLP fairness literature focuses on illuminating and addressing gender biases, assessing gender harms for TGNB identities requires understanding how such identities uniquely interact with societal gender norms and how they differ from gender binary-centric perspectives. Such measurement frameworks inherently require centering TGNB voices to help guide the alignment between gender-inclusive NLP and whom they are intended to serve. Towards this goal, we ground our work in the TGNB community and existing interdisciplinary literature to assess how the social reality surrounding experienced marginalization of TGNB persons contributes to and persists within Open Language Generation (OLG). This social knowledge serves as a guide for evaluating popular large language models (LLMs) on two key aspects: (1) misgendering and (2) harmful responses to gender disclosure. To do this, we introduce TANGO, a dataset of template-based real-world text curated from a TGNB-oriented community. We discover a dominance of binary gender norms reflected by the models; LLMs least misgendered subjects in generated text when triggered by prompts whose subjects used binary pronouns. Meanwhile, misgendering was most prevalent when triggering generation with singular they and neopronouns. When prompted with gender disclosures, TGNB disclosure generated the most stigmatizing language and scored most toxic, on average. Our findings warrant further research on how TGNB harms manifest in LLMs and serve as a broader case study toward concretely grounding the design of gender-inclusive AI in community voices and interdisciplinary literature.
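Abstract (reference translation): Transgender and non-binary (TGNB) individuals disproportionately experience discrimination and exclusion from daily life. Given the recent spread and adoption of language generation technologies, the potential to further marginalize this population only grows. Although much of the NLP fairness literature focuses on illuminating and addressing gender bias, assessing gender harms for TGNB identities requires understanding how such identities uniquely interact with societal gender norms and how they differ from gender binary-centric perspectives. Such measurement frameworks inherently require centering TGNB voices to help guide the alignment between gender-inclusive NLP and the people it is intended to serve. Towards this goal, we ground our work in the TGNB community and existing interdisciplinary literature to assess how the social reality surrounding the marginalization experienced by TGNB persons contributes to and persists within open language generation (OLG). This social knowledge serves as a guide for evaluating popular large language models (LLMs) on two key aspects: (1) misgendering and (2) harmful responses to gender disclosure. To do this, we introduce TANGO, a dataset of template-based real-world text curated from a TGNB-oriented community. We find a dominance of binary gender norms reflected by the models: LLMs misgendered subjects least in generated text when triggered by prompts whose subjects used binary pronouns. Meanwhile, misgendering was most prevalent when generation was triggered with singular they and neopronouns. When prompted with gender disclosures, TGNB disclosure generated the most stigmatizing language and scored most toxic on average. Our findings warrant further research on how TGNB harms manifest in LLMs and serve as a broader case study toward concretely grounding the design of gender-inclusive AI in community voices and interdisciplinary literature.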

Related papers