Politeness Stereotypes and Attack Vectors: Gender Stereotypes in
Japanese and Korean Language Models
- URL: http://arxiv.org/abs/2306.09752v1
- Date: Fri, 16 Jun 2023 10:36:18 GMT
- Title: Politeness Stereotypes and Attack Vectors: Gender Stereotypes in
Japanese and Korean Language Models
- Authors: Victor Steinborn, Antonis Maronikolakis, and Hinrich Schütze
- Abstract summary: We study how grammatical gender bias relating to politeness levels manifests in Japanese and Korean language models.
We find that informal polite speech is most indicative of the female grammatical gender, while rude and formal speech is most indicative of the male grammatical gender.
We find politeness levels to be an attack vector for allocational gender bias in cyberbullying detection models.
- Score: 1.5039745292757671
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In efforts to keep up with the rapid progress and use of large language
models, gender bias research is becoming more prevalent in NLP. Bias research
beyond English, however, is still in its infancy, with most work focusing on
English. In our work, we study how grammatical gender bias relating to
politeness levels manifests in Japanese and Korean language models. Linguistic
studies in these languages have identified a connection between gender bias and
politeness levels; however, it is not yet known whether language models reproduce
these biases. We analyze the relative prediction probabilities of the male and
female grammatical genders using templates and find that informal polite speech
is most indicative of the female grammatical gender, while rude and formal
speech is most indicative of the male grammatical gender. Further, we find
politeness levels to be an attack vector for allocational gender bias in
cyberbullying detection models. Cyberbullies can evade detection through simple
techniques that abuse politeness levels. We introduce an attack dataset to (i)
identify representational gender bias across politeness levels, (ii)
demonstrate how gender biases can be abused to bypass cyberbullying detection
models and (iii) show that allocational biases can be mitigated via training on
our proposed dataset. Through our findings we highlight the importance of bias
research moving beyond its current English-centrism.
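The template-based probing described above (comparing the relative prediction probabilities of female- and male-marked forms across politeness levels) can be sketched with an off-the-shelf masked language model. The snippet below is a minimal illustration, not the authors' released code: the model name, the politeness-level templates, and the gendered first-person pronouns used as targets are assumptions made for the example.

```python
# Minimal sketch of template-based gender probing across politeness levels,
# in the spirit of the paper's method. Model name, templates, and gendered
# pronouns are illustrative assumptions, not the authors' released materials.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

MODEL_NAME = "cl-tohoku/bert-base-japanese"  # assumed: any masked LM with a mask token works

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForMaskedLM.from_pretrained(MODEL_NAME)
model.eval()

# One hypothetical template per politeness level; {mask} is filled with the
# model's mask token and scored against gender-marked first-person pronouns.
templates = {
    "informal_polite": "{mask}は明日学校に行きますね。",
    "formal":          "{mask}は明日学校に参ります。",
    "rude":            "{mask}は明日学校に行くぞ。",
}
female_terms = ["あたし"]   # conventionally female-marked pronoun
male_terms = ["俺", "僕"]   # conventionally male-marked pronouns


def mask_fill_prob(template: str, word: str) -> float:
    """Probability the MLM assigns to `word` at the mask position (single-token words only)."""
    word_ids = tokenizer.convert_tokens_to_ids(tokenizer.tokenize(word))
    if len(word_ids) != 1:
        return 0.0  # this simple sketch skips targets that split into several tokens
    text = template.format(mask=tokenizer.mask_token)
    inputs = tokenizer(text, return_tensors="pt")
    mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0][0]
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = logits[0, mask_pos].softmax(dim=-1)
    return probs[word_ids[0]].item()


for level, template in templates.items():
    p_female = sum(mask_fill_prob(template, w) for w in female_terms)
    p_male = sum(mask_fill_prob(template, w) for w in male_terms)
    # Relative probability of the female-marked forms; > 0.5 means the model
    # leans female for this politeness level, < 0.5 means it leans male.
    rel = p_female / (p_female + p_male) if (p_female + p_male) > 0 else float("nan")
    print(f"{level:16s} P(female)={p_female:.4f} P(male)={p_male:.4f} rel(female)={rel:.2f}")
```

Normalising by the sum of the female and male probabilities keeps the comparison relative, so a politeness level that makes all of the target pronouns unlikely does not by itself skew the reading.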
Related papers
- Beyond Binary Gender: Evaluating Gender-Inclusive Machine Translation with Ambiguous Attitude Words [85.48043537327258]
Existing machine translation gender bias evaluations are primarily focused on male and female genders.
This study presents the benchmark AmbGIMT (Gender-Inclusive Machine Translation with Ambiguous attitude words).
We propose a novel process to evaluate gender bias based on the Emotional Attitude Score (EAS), which is used to quantify ambiguous attitude words.
arXiv Detail & Related papers (2024-07-23T08:13:51Z)
- A Multilingual Perspective on Probing Gender Bias [0.0]
Gender bias is a form of systematic negative treatment that targets individuals based on their gender.
This thesis investigates the nuances of how gender bias is expressed through language and within language technologies.
arXiv Detail & Related papers (2024-03-15T21:35:21Z)
- Multilingual Text-to-Image Generation Magnifies Gender Stereotypes and Prompt Engineering May Not Help You [64.74707085021858]
We show that multilingual models suffer from significant gender biases just as monolingual models do.
We propose a novel benchmark, MAGBIG, intended to foster research on gender bias in multilingual models.
Our results show that not only do models exhibit strong gender biases but they also behave differently across languages.
arXiv Detail & Related papers (2024-01-29T12:02:28Z)
- Will the Prince Get True Love's Kiss? On the Model Sensitivity to Gender Perturbation over Fairytale Texts [87.62403265382734]
Recent studies show that traditional fairytales are rife with harmful gender biases.
This work aims to assess learned biases of language models by evaluating their robustness against gender perturbations.
arXiv Detail & Related papers (2023-10-16T22:25:09Z)
- VisoGender: A dataset for benchmarking gender bias in image-text pronoun resolution [80.57383975987676]
VisoGender is a novel dataset for benchmarking gender bias in vision-language models.
We focus on occupation-related biases within a hegemonic system of binary gender, inspired by Winograd and Winogender schemas.
We benchmark several state-of-the-art vision-language models and find that they demonstrate bias in resolving binary gender in complex scenes.
arXiv Detail & Related papers (2023-06-21T17:59:51Z)
- Efficient Gender Debiasing of Pre-trained Indic Language Models [0.0]
The gender bias present in the data on which language models are pre-trained gets reflected in the systems that use these models.
In our paper, we measure gender bias associated with occupations in Hindi language models.
Our results reflect that the bias is reduced post-introduction of our proposed mitigation techniques.
arXiv Detail & Related papers (2022-09-08T09:15:58Z)
- Don't Forget About Pronouns: Removing Gender Bias in Language Models Without Losing Factual Gender Information [4.391102490444539]
We focus on two types of such signals in English texts: factual gender information and gender bias.
We aim to diminish the stereotypical bias in the representations while preserving the factual gender signal.
arXiv Detail & Related papers (2022-06-21T21:38:25Z)
- Quantifying Gender Bias Towards Politicians in Cross-Lingual Language Models [104.41668491794974]
We quantify the usage of adjectives and verbs generated by language models surrounding the names of politicians as a function of their gender.
We find that while some words such as dead and designated are associated with both male and female politicians, a few specific words such as beautiful and divorced are predominantly associated with female politicians.
arXiv Detail & Related papers (2021-04-15T15:03:26Z)
- Type B Reflexivization as an Unambiguous Testbed for Multilingual Multi-Task Gender Bias [5.239305978984572]
We show that for languages with type B reflexivization, we can construct multi-task challenge datasets for detecting gender bias.
In these languages, the direct translation of 'the doctor removed his mask' is not ambiguous between a coreferential reading and a disjoint reading.
We present a multilingual, multi-task challenge dataset, which spans four languages and four NLP tasks.
arXiv Detail & Related papers (2020-09-24T23:47:18Z)
- Multi-Dimensional Gender Bias Classification [67.65551687580552]
Machine learning models can inadvertently learn socially undesirable patterns when training on gender biased text.
We propose a general framework that decomposes gender bias in text along several pragmatic and semantic dimensions.
Using this fine-grained framework, we automatically annotate eight large scale datasets with gender information.
arXiv Detail & Related papers (2020-05-01T21:23:20Z)