Measuring Misogyny in Natural Language Generation: Preliminary Results
from a Case Study on two Reddit Communities
- URL: http://arxiv.org/abs/2312.03330v1
- Date: Wed, 6 Dec 2023 07:38:46 GMT
- Title: Measuring Misogyny in Natural Language Generation: Preliminary Results
from a Case Study on two Reddit Communities
- Authors: Aaron J. Snoswell, Lucinda Nelson, Hao Xue, Flora D. Salim, Nicolas
Suzor and Jean Burgess
- Abstract summary: We consider the challenge of measuring misogyny in natural language generation.
We use data from two well-characterised `Incel' communities on Reddit.
- Score: 7.499634046186994
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Generic `toxicity' classifiers continue to be used for evaluating the
potential for harm in natural language generation, despite mounting evidence of
their shortcomings. We consider the challenge of measuring misogyny in natural
language generation, and argue that generic `toxicity' classifiers are
inadequate for this task. We use data from two well-characterised `Incel'
communities on Reddit that differ primarily in their degrees of misogyny to
construct a pair of training corpora which we use to fine-tune two language
models. We show that an open source `toxicity' classifier is unable to
distinguish meaningfully between generations from these models. We contrast
this with a misogyny-specific lexicon recently proposed by feminist
subject-matter experts, demonstrating that, despite the limitations of simple
lexicon-based approaches, this shows promise as a benchmark to evaluate
language models for misogyny, and that it is sensitive enough to reveal the
known differences in these Reddit communities. Our preliminary findings
highlight the limitations of a generic approach to evaluating harms, and
further emphasise the need for careful benchmark design and selection in
natural language evaluation.
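As a rough illustration of the contrast described in the abstract, the sketch below scores model generations with an off-the-shelf toxicity classifier and with a simple lexicon hit-rate. The `detoxify` package stands in for "an open source `toxicity' classifier", and the lexicon terms are tiny placeholders; neither is the specific classifier or expert lexicon used in the paper.

```python
# Minimal sketch (not the authors' pipeline): contrast a generic 'toxicity'
# score with a simple lexicon-based misogyny signal on model generations.
# Assumptions: `detoxify` is a stand-in classifier, and MISOGYNY_LEXICON is a
# placeholder, not the expert-curated lexicon referenced in the paper.
from statistics import mean
from detoxify import Detoxify

toxicity_model = Detoxify("original")  # off-the-shelf toxicity classifier

MISOGYNY_LEXICON = {"foid", "femoid", "roastie"}  # placeholder terms only


def toxicity_score(text: str) -> float:
    """Generic toxicity probability from the off-the-shelf classifier."""
    return float(toxicity_model.predict(text)["toxicity"])


def lexicon_hit_rate(text: str) -> float:
    """Fraction of tokens matching the (placeholder) misogyny lexicon."""
    tokens = [t.strip(".,!?\"'").lower() for t in text.split()]
    return mean(t in MISOGYNY_LEXICON for t in tokens) if tokens else 0.0


# Placeholder generations from the two fine-tuned language models.
generations = {
    "model_a": ["example generation from the first fine-tuned model"],
    "model_b": ["example generation from the second fine-tuned model"],
}

for name, texts in generations.items():
    print(
        name,
        "mean toxicity:", round(mean(toxicity_score(t) for t in texts), 3),
        "mean lexicon hit-rate:", round(mean(lexicon_hit_rate(t) for t in texts), 3),
    )
```

Comparing the two score distributions across the two models is the kind of check the paper performs: a useful misogyny measure should separate the corpora, whereas the generic toxicity score may not.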
Related papers
- The Lou Dataset -- Exploring the Impact of Gender-Fair Language in German Text Classification [57.06913662622832]
Gender-fair language fosters inclusion by addressing all genders or using neutral forms.
Gender-fair language substantially impacts predictions by flipping labels, reducing certainty, and altering attention patterns.
While we offer initial insights into the effect on German text classification, the findings likely apply to other languages.
arXiv Detail & Related papers (2024-09-26T15:08:17Z)
- Beyond Binary Gender: Evaluating Gender-Inclusive Machine Translation with Ambiguous Attitude Words [85.48043537327258]
Existing machine translation gender bias evaluations are primarily focused on male and female genders.
This study presents a benchmark, AmbGIMT (Gender-Inclusive Machine Translation with Ambiguous attitude words).
We propose a novel process to evaluate gender bias based on the Emotional Attitude Score (EAS), which is used to quantify ambiguous attitude words.
arXiv Detail & Related papers (2024-07-23T08:13:51Z)
- A multitask learning framework for leveraging subjectivity of annotators to identify misogyny [47.175010006458436]
We propose a multitask learning approach to enhance the performance of misogyny identification systems.
We incorporated diverse perspectives from annotators in our model design, considering gender and age across six profile groups.
This research advances content moderation and highlights the importance of embracing diverse perspectives to build effective online moderation systems.
arXiv Detail & Related papers (2024-06-22T15:06:08Z)
- Generalizing Fairness to Generative Language Models via Reformulation of Non-discrimination Criteria [4.738231680800414]
This paper studies how to uncover and quantify the presence of gender biases in generative language models.
We derive generative AI analogues of three well-known non-discrimination criteria from classification, namely independence, separation and sufficiency (the classification forms of these criteria are sketched in code after this list).
Our results address the presence of occupational gender bias within such conversational language models.
arXiv Detail & Related papers (2024-03-13T14:19:08Z)
- Multilingual Text-to-Image Generation Magnifies Gender Stereotypes and Prompt Engineering May Not Help You [64.74707085021858]
We show that multilingual models suffer from significant gender biases just as monolingual models do.
We propose a novel benchmark, MAGBIG, intended to foster research on gender bias in multilingual models.
Our results show that not only do models exhibit strong gender biases but they also behave differently across languages.
arXiv Detail & Related papers (2024-01-29T12:02:28Z)
- Characteristics of Harmful Text: Towards Rigorous Benchmarking of Language Models [32.960462266615096]
Large language models produce human-like text that drives a growing number of applications.
Recent literature and, increasingly, real world observations have demonstrated that these models can generate language that is toxic, biased, untruthful or otherwise harmful.
We outline six ways of characterizing harmful text which merit explicit consideration when designing new benchmarks.
arXiv Detail & Related papers (2022-06-16T17:28:01Z)
- Under the Morphosyntactic Lens: A Multifaceted Evaluation of Gender Bias in Speech Translation [20.39599469927542]
Gender bias is largely recognized as a problematic phenomenon affecting language technologies.
Most of current evaluation practices adopt a word-level focus on a narrow set of occupational nouns under synthetic conditions.
Such protocols overlook key features of grammatical gender languages, which are characterized by morphosyntactic chains of gender agreement.
arXiv Detail & Related papers (2022-03-18T11:14:16Z)
- Mitigating Biases in Toxic Language Detection through Invariant Rationalization [70.36701068616367]
Biases toward some attributes, including gender, race, and dialect, exist in most training datasets for toxicity detection.
We propose to use invariant rationalization (InvRat), a game-theoretic framework consisting of a rationale generator and a predictor, to rule out the spurious correlation of certain syntactic patterns.
Our method yields a lower false positive rate for both lexical and dialectal attributes than previous debiasing methods.
arXiv Detail & Related papers (2021-06-14T08:49:52Z)
- Towards Equal Gender Representation in the Annotations of Toxic Language Detection [6.129776019898014]
We study the differences in the ways men and women annotate comments for toxicity.
We find that the BERT model associates toxic comments containing offensive words with male annotators, causing the model to predict 67.7% of toxic comments as having been annotated by men.
We show that this disparity between gender predictions can be mitigated by removing offensive words and highly toxic comments from the training data.
arXiv Detail & Related papers (2021-06-04T00:12:38Z)
- Challenges in Automated Debiasing for Toxic Language Detection [81.04406231100323]
Biased associations have been a challenge in the development of classifiers for detecting toxic language.
We investigate recently introduced debiasing methods for text classification datasets and models, as applied to toxic language detection.
Our focus is on lexical (e.g., swear words, slurs, identity mentions) and dialectal markers (specifically African American English).
arXiv Detail & Related papers (2021-01-29T22:03:17Z)
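Referring back to the non-discrimination criteria paper above: the toy sketch below computes the standard classification forms of independence, separation, and sufficiency on binary predictions grouped by a sensitive attribute. It illustrates only the textbook criteria on made-up data; it is not the generative-AI analogues derived in that paper, and all names and values are hypothetical.

```python
# Toy sketch of the classification forms of the three non-discrimination
# criteria named above (independence, separation, sufficiency), computed on
# hypothetical data. Not the generative-AI analogues from the related paper.
from collections import defaultdict


def _rate(pairs, cond, event):
    """P(event | cond) estimated over (y_true, y_pred) pairs."""
    sub = [(yt, yp) for yt, yp in pairs if cond(yt, yp)]
    return sum(event(yt, yp) for yt, yp in sub) / len(sub) if sub else float("nan")


def group_rates(y_true, y_pred, attr):
    """Per-group rates whose equality across groups encodes each criterion."""
    by_a = defaultdict(list)
    for yt, yp, a in zip(y_true, y_pred, attr):
        by_a[a].append((yt, yp))
    report = {}
    for a, pairs in by_a.items():
        preds = [yp for _, yp in pairs]
        report[a] = {
            # Independence: positive prediction rate should match across groups.
            "P(pred=1)": sum(preds) / len(preds),
            # Separation: true/false positive rates should match across groups.
            "TPR": _rate(pairs, lambda yt, yp: yt == 1, lambda yt, yp: yp == 1),
            "FPR": _rate(pairs, lambda yt, yp: yt == 0, lambda yt, yp: yp == 1),
            # Sufficiency: P(Y=1 | pred=1) should match across groups.
            "precision": _rate(pairs, lambda yt, yp: yp == 1, lambda yt, yp: yt == 1),
        }
    return report


# Hypothetical labels, predictions, and a binary sensitive attribute.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]
attr   = ["a", "a", "a", "a", "b", "b", "b", "b"]
print(group_rates(y_true, y_pred, attr))
```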