What an Elegant Bridge: Multilingual LLMs are Biased Similarly in Different Languages
- URL: http://arxiv.org/abs/2407.09704v1
- Date: Fri, 12 Jul 2024 22:10:16 GMT
- Title: What an Elegant Bridge: Multilingual LLMs are Biased Similarly in Different Languages
- Authors: Viktor Mihaylov, Aleksandar Shtedritski
- Abstract summary: This paper investigates biases of Large Language Models (LLMs) through the lens of grammatical gender.
We prompt a model to describe nouns with adjectives in various languages, focusing specifically on languages with grammatical gender.
We find that a simple classifier can not only predict noun gender above chance but also exhibit cross-language transferability.
- Score: 51.0349882045866
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper investigates biases of Large Language Models (LLMs) through the lens of grammatical gender. Drawing inspiration from seminal works in psycholinguistics, particularly the study of gender's influence on language perception, we leverage multilingual LLMs to revisit and expand upon the foundational experiments of Boroditsky (2003). Employing LLMs as a novel method for examining psycholinguistic biases related to grammatical gender, we prompt a model to describe nouns with adjectives in various languages, focusing specifically on languages with grammatical gender. In particular, we look at adjective co-occurrences across gender and languages, and train a binary classifier to predict grammatical gender given adjectives an LLM uses to describe a noun. Surprisingly, we find that a simple classifier can not only predict noun gender above chance but also exhibit cross-language transferability. We show that while LLMs may describe words differently in different languages, they are biased similarly.
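The abstract's core experiment — train a binary classifier to predict a noun's grammatical gender from the adjectives an LLM uses to describe it — can be illustrated with a minimal sketch. The adjectives, nouns, and labels below are invented toy data, not the paper's dataset, and the count-based (naive-Bayes-style) classifier is one simple stand-in for whatever classifier the authors actually trained:

```python
import math
from collections import Counter

# Toy stand-in data: adjectives an LLM might use to describe nouns in a
# gendered language, paired with each noun's grammatical gender.
# Purely illustrative; not taken from the paper.
train = [
    (["elegant", "slender", "beautiful"], "F"),
    (["graceful", "fragile", "pretty"], "F"),
    (["strong", "sturdy", "towering"], "M"),
    (["big", "dangerous", "strong"], "M"),
]

def train_counts(data):
    """Count adjective occurrences per gender class."""
    counts = {"F": Counter(), "M": Counter()}
    for adjectives, gender in data:
        counts[gender].update(adjectives)
    return counts

def predict(counts, adjectives):
    """Score each gender by add-one-smoothed log adjective frequencies."""
    vocab = len({w for c in counts.values() for w in c})
    best, best_score = None, float("-inf")
    for gender, c in counts.items():
        total = sum(c.values())
        score = sum(math.log((c[w] + 1) / (total + vocab)) for w in adjectives)
        if score > best_score:
            best, best_score = gender, score
    return best

model = train_counts(train)
print(predict(model, ["elegant", "fragile"]))    # -> F on this toy data
print(predict(model, ["sturdy", "dangerous"]))   # -> M on this toy data
```

Above chance accuracy for such a classifier on held-out nouns, and transfer of that accuracy across languages, is the paper's reported finding.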
Related papers
- Investigating grammatical abstraction in language models using few-shot learning of novel noun gender [0.0]
We conduct a noun learning experiment to assess whether an LSTM and a decoder-only transformer can achieve human-like abstraction of grammatical gender in French.
We find that both language models effectively generalise novel noun gender from one to two learning examples and apply the learnt gender across agreement contexts.
While the generalisation behaviour of models suggests that they represent grammatical gender as an abstract category, like humans, further work is needed to explore the details.
arXiv Detail & Related papers (2024-03-15T14:25:59Z)
- Gender Bias in Large Language Models across Multiple Languages [10.068466432117113]
We examine gender bias in text generated by large language models (LLMs) for different languages.
We use three measurements, including: 1) gender bias in selecting descriptive words given a gender-related context, and 2) gender bias in selecting gender-related pronouns (she/he) given descriptive words.
arXiv Detail & Related papers (2024-03-01T04:47:16Z)
- The Causal Influence of Grammatical Gender on Distributional Semantics [87.8027818528463]
The extent to which meaning influences gender assignment across languages is an active area of research in linguistics and cognitive science.
We offer a novel, causal graphical model that jointly represents the interactions between a noun's grammatical gender, its meaning, and adjective choice.
When we control for the meaning of the noun, the relationship between grammatical gender and adjective choice is near zero and insignificant.
arXiv Detail & Related papers (2023-11-30T13:58:13Z)
- Measuring Gender Bias in Word Embeddings of Gendered Languages Requires Disentangling Grammatical Gender Signals [3.0349733976070015]
We demonstrate that word embeddings learn the association between a noun and its grammatical gender in grammatically gendered languages.
We show that disentangling grammatical gender signals from word embeddings may lead to improvement in semantic machine learning tasks.
arXiv Detail & Related papers (2022-06-03T17:11:00Z)
- Analyzing Gender Representation in Multilingual Models [59.21915055702203]
We focus on the representation of gender distinctions as a practical case study.
We examine the extent to which the gender concept is encoded in shared subspaces across different languages.
arXiv Detail & Related papers (2022-04-20T00:13:01Z)
- Gender Bias Hidden Behind Chinese Word Embeddings: The Case of Chinese Adjectives [0.0]
This paper investigates gender bias in static word embeddings from a unique perspective, Chinese adjectives.
Through a comparison between the produced results and a human-scored data set, we demonstrate how gender bias encoded in word embeddings differs from people's attitudes.
arXiv Detail & Related papers (2021-06-01T02:12:45Z)
- Quantifying Gender Bias Towards Politicians in Cross-Lingual Language Models [104.41668491794974]
We quantify the usage of adjectives and verbs generated by language models surrounding the names of politicians as a function of their gender.
We find that while some words, such as dead and designated, are associated with both male and female politicians, a few specific words, such as beautiful and divorced, are predominantly associated with female politicians.
arXiv Detail & Related papers (2021-04-15T15:03:26Z)
- Investigating Cross-Linguistic Adjective Ordering Tendencies with a Latent-Variable Model [66.84264870118723]
We present the first purely corpus-driven model of multilingual adjective ordering in the form of a latent-variable model.
We provide strong converging evidence for the existence of universal, cross-linguistic, hierarchical adjective ordering tendencies.
arXiv Detail & Related papers (2020-10-09T18:27:55Z)
- An exploration of the encoding of grammatical gender in word embeddings [0.6461556265872973]
Studying grammatical gender through word embeddings can give insight into how grammatical genders are determined.
It is found that there is an overlap in how grammatical gender is encoded in Swedish, Danish, and Dutch embeddings.
arXiv Detail & Related papers (2020-08-05T06:01:46Z)
- Gender Bias in Multilingual Embeddings and Cross-Lingual Transfer [101.58431011820755]
We study gender bias in multilingual embeddings and how it affects transfer learning for NLP applications.
We create a multilingual dataset for bias analysis and propose several ways for quantifying bias in multilingual representations.
arXiv Detail & Related papers (2020-05-02T04:34:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.