Politeness Stereotypes and Attack Vectors: Gender Stereotypes in
Japanese and Korean Language Models
- URL: http://arxiv.org/abs/2306.09752v1
- Date: Fri, 16 Jun 2023 10:36:18 GMT
- Title: Politeness Stereotypes and Attack Vectors: Gender Stereotypes in
Japanese and Korean Language Models
- Authors: Victor Steinborn, Antonis Maronikolakis, and Hinrich Schütze
- Abstract summary: We study how grammatical gender bias relating to politeness levels manifests in Japanese and Korean language models.
We find that informal polite speech is most indicative of the female grammatical gender, while rude and formal speech is most indicative of the male grammatical gender.
We find politeness levels to be an attack vector for allocational gender bias in cyberbullying detection models.
- Score: 1.5039745292757671
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In efforts to keep up with the rapid progress and use of large language
models, gender bias research is becoming more prevalent in NLP. Bias research
beyond English, however, is still in its infancy, with most work focusing on
English. In our work, we study how grammatical gender bias relating to
politeness levels manifests in Japanese and Korean language models. Linguistic
studies in these languages have identified a connection between gender bias and
politeness levels; however, it is not yet known whether language models reproduce
these biases. We analyze the relative prediction probabilities of the male and
female grammatical genders using templates and find that informal polite speech
is most indicative of the female grammatical gender, while rude and formal
speech is most indicative of the male grammatical gender. Further, we find
politeness levels to be an attack vector for allocational gender bias in
cyberbullying detection models. Cyberbullies can evade detection through simple
techniques that abuse politeness levels. We introduce an attack dataset to (i)
identify representational gender bias across politeness levels, (ii)
demonstrate how gender biases can be abused to bypass cyberbullying detection
models and (iii) show that allocational biases can be mitigated via training on
our proposed dataset. Through our findings we highlight the importance of bias
research moving beyond its current English-centrism.
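The template-based probing described above (comparing the relative prediction probabilities of female- and male-marked forms across politeness levels) can be sketched with an off-the-shelf masked language model. The snippet below is a minimal illustration, not the authors' released code: the model name, the politeness-level templates, and the gendered first-person pronouns used as targets are assumptions made for the example.

```python
# Minimal sketch of template-based gender probing across politeness levels,
# in the spirit of the paper's method. Model name, templates, and gendered
# pronouns are illustrative assumptions, not the authors' released materials.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

MODEL_NAME = "cl-tohoku/bert-base-japanese"  # assumed: any masked LM with a mask token works

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForMaskedLM.from_pretrained(MODEL_NAME)
model.eval()

# One hypothetical template per politeness level; {mask} is filled with the
# model's mask token and scored against gender-marked first-person pronouns.
templates = {
    "informal_polite": "{mask}は明日学校に行きますね。",
    "formal":          "{mask}は明日学校に参ります。",
    "rude":            "{mask}は明日学校に行くぞ。",
}
female_terms = ["あたし"]   # conventionally female-marked pronoun
male_terms = ["俺", "僕"]   # conventionally male-marked pronouns


def mask_fill_prob(template: str, word: str) -> float:
    """Probability the MLM assigns to `word` at the mask position (single-token words only)."""
    word_ids = tokenizer.convert_tokens_to_ids(tokenizer.tokenize(word))
    if len(word_ids) != 1:
        return 0.0  # this simple sketch skips targets that split into several tokens
    text = template.format(mask=tokenizer.mask_token)
    inputs = tokenizer(text, return_tensors="pt")
    mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0][0]
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = logits[0, mask_pos].softmax(dim=-1)
    return probs[word_ids[0]].item()


for level, template in templates.items():
    p_female = sum(mask_fill_prob(template, w) for w in female_terms)
    p_male = sum(mask_fill_prob(template, w) for w in male_terms)
    # Relative probability of the female-marked forms; > 0.5 means the model
    # leans female for this politeness level, < 0.5 means it leans male.
    rel = p_female / (p_female + p_male) if (p_female + p_male) > 0 else float("nan")
    print(f"{level:16s} P(female)={p_female:.4f} P(male)={p_male:.4f} rel(female)={rel:.2f}")
```

Normalising by the sum of the female and male probabilities keeps the comparison relative, so a politeness level that makes all of the target pronouns unlikely does not by itself skew the reading.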
Related papers
- Beyond Binary Gender: Evaluating Gender-Inclusive Machine Translation with Ambiguous Attitude Words [85.48043537327258]
Existing machine translation gender bias evaluations are primarily focused on male and female genders.
This study presents the benchmark AmbGIMT (Gender-Inclusive Machine Translation with Ambiguous attitude words).
We propose a novel process to evaluate gender bias based on the Emotional Attitude Score (EAS), which is used to quantify ambiguous attitude words.
arXiv Detail & Related papers (2024-07-23T08:13:51Z)
- A Multilingual Perspective on Probing Gender Bias [0.0]
Gender bias is a form of systematic negative treatment that targets individuals based on their gender.
This thesis investigates the nuances of how gender bias is expressed through language and within language technologies.
arXiv Detail & Related papers (2024-03-15T21:35:21Z)
- Multilingual Text-to-Image Generation Magnifies Gender Stereotypes and Prompt Engineering May Not Help You [64.74707085021858]
We show that multilingual models suffer from significant gender biases just as monolingual models do.
We propose a novel benchmark, MAGBIG, intended to foster research on gender bias in multilingual models.
Our results show that not only do models exhibit strong gender biases but they also behave differently across languages.
arXiv Detail & Related papers (2024-01-29T12:02:28Z)
- Will the Prince Get True Love's Kiss? On the Model Sensitivity to Gender Perturbation over Fairytale Texts [87.62403265382734]
Recent studies show that traditional fairytales are rife with harmful gender biases.
This work aims to assess learned biases of language models by evaluating their robustness against gender perturbations.
arXiv Detail & Related papers (2023-10-16T22:25:09Z)
- VisoGender: A dataset for benchmarking gender bias in image-text pronoun resolution [80.57383975987676]
VisoGender is a novel dataset for benchmarking gender bias in vision-language models.
We focus on occupation-related biases within a hegemonic system of binary gender, inspired by Winograd and Winogender schemas.
We benchmark several state-of-the-art vision-language models and find that they demonstrate bias in resolving binary gender in complex scenes.
arXiv Detail & Related papers (2023-06-21T17:59:51Z)
- Efficient Gender Debiasing of Pre-trained Indic Language Models [0.0]
The gender bias present in the data on which language models are pre-trained gets reflected in the systems that use these models.
In our paper, we measure gender bias associated with occupations in Hindi language models.
Our results reflect that the bias is reduced post-introduction of our proposed mitigation techniques.
arXiv Detail & Related papers (2022-09-08T09:15:58Z)
- Don't Forget About Pronouns: Removing Gender Bias in Language Models Without Losing Factual Gender Information [4.391102490444539]
We focus on two types of such signals in English texts: factual gender information and gender bias.
We aim to diminish the stereotypical bias in the representations while preserving the factual gender signal.
arXiv Detail & Related papers (2022-06-21T21:38:25Z)
- Quantifying Gender Bias Towards Politicians in Cross-Lingual Language Models [104.41668491794974]
We quantify the usage of adjectives and verbs generated by language models surrounding the names of politicians as a function of their gender.
We find that while some words such as dead and designated are associated with both male and female politicians, a few specific words such as beautiful and divorced are predominantly associated with female politicians.
arXiv Detail & Related papers (2021-04-15T15:03:26Z)
- Type B Reflexivization as an Unambiguous Testbed for Multilingual Multi-Task Gender Bias [5.239305978984572]
We show that for languages with type B reflexivization, we can construct multi-task challenge datasets for detecting gender bias.
In these languages, the direct translation of 'the doctor removed his mask' is not ambiguous between a coreferential reading and a disjoint reading.
We present a multilingual, multi-task challenge dataset, which spans four languages and four NLP tasks.
arXiv Detail & Related papers (2020-09-24T23:47:18Z)
- Multi-Dimensional Gender Bias Classification [67.65551687580552]
Machine learning models can inadvertently learn socially undesirable patterns when training on gender biased text.
We propose a general framework that decomposes gender bias in text along several pragmatic and semantic dimensions.
Using this fine-grained framework, we automatically annotate eight large scale datasets with gender information.
arXiv Detail & Related papers (2020-05-01T21:23:20Z)