Related papers: GENder-IT: An Annotated English-Italian Parallel Challenge Set for Cross-Linguistic Natural Gender Phenomena

URL: http://arxiv.org/abs/2108.02854v1
Date: Thu, 5 Aug 2021 21:08:45 GMT
Title: GENder-IT: An Annotated English-Italian Parallel Challenge Set for Cross-Linguistic Natural Gender Phenomena
Authors: Eva Vanmassenhove, Johanna Monti
Abstract summary: gENder-IT is an English--Italian challenge set focusing on the resolution of natural gender phenomena. It provides word-level gender tags on the English source side and multiple gender-alternative translations, where needed, on the Italian target side.
Score: 2.4366811507669124
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Languages differ in terms of the absence or presence of gender features, the number of gender classes and whether and where gender features are explicitly marked. These cross-linguistic differences can lead to ambiguities that are difficult to resolve, especially for sentence-level MT systems. The identification of ambiguity and its subsequent resolution is a challenging task for which currently there aren't any specific resources or challenge sets available. In this paper, we introduce gENder-IT, an English--Italian challenge set focusing on the resolution of natural gender phenomena by providing word-level gender tags on the English source side and multiple gender alternative translations, where needed, on the Italian target side.

Related papers

Gender-Neutral Machine Translation Strategies in Practice [13.511723323294339]
Gender-inclusive machine translation (MT) should preserve gender ambiguity in the source to avoid misgendering and representational harms. Here we assess the sensitivity of 21 MT systems to the need for gender neutrality in response to gender ambiguity in three translation directions of varying difficulty.
arXiv Detail & Related papers (2025-06-18T17:57:39Z)
mGeNTE: A Multilingual Resource for Gender-Neutral Language and Translation [21.461095625903504]
mGeNTE is a dataset of English-Italian/German/Spanish language pairs. It enables research in both automatic Gender-Neutral Translation (GNT) and language modelling for three grammatical gender languages.
arXiv Detail & Related papers (2025-01-16T09:35:15Z)
GFG -- Gender-Fair Generation: A CALAMITA Challenge [15.399739689743935]
Gender-fair language aims at promoting gender equality by using terms and expressions that include all identities. The Gender-Fair Generation challenge intends to help shift toward gender-fair language in written communication.
arXiv Detail & Related papers (2024-12-26T10:58:40Z)
Beyond Binary Gender: Evaluating Gender-Inclusive Machine Translation with Ambiguous Attitude Words [85.48043537327258]
Existing machine translation gender bias evaluations are primarily focused on male and female genders. This study presents a benchmark, AmbGIMT (Gender-Inclusive Machine Translation with Ambiguous attitude words). We propose a novel process to evaluate gender bias based on the Emotional Attitude Score (EAS), which is used to quantify ambiguous attitude words.
arXiv Detail & Related papers (2024-07-23T08:13:51Z)
What an Elegant Bridge: Multilingual LLMs are Biased Similarly in Different Languages [51.0349882045866]
This paper investigates biases of Large Language Models (LLMs) through the lens of grammatical gender. We prompt a model to describe nouns with adjectives in various languages, focusing specifically on languages with grammatical gender. We find that a simple classifier can not only predict noun gender above chance but also exhibit cross-language transferability.
arXiv Detail & Related papers (2024-07-12T22:10:16Z)
VisoGender: A dataset for benchmarking gender bias in image-text pronoun resolution [80.57383975987676]
VisoGender is a novel dataset for benchmarking gender bias in vision-language models. We focus on occupation-related biases within a hegemonic system of binary gender, inspired by Winograd and Winogender schemas. We benchmark several state-of-the-art vision-language models and find that they demonstrate bias in resolving binary gender in complex scenes.
arXiv Detail & Related papers (2023-06-21T17:59:51Z)
Gender, names and other mysteries: Towards the ambiguous for gender-inclusive translation [7.322734499960981]
This paper explores the case where the source sentence lacks explicit gender markers, but the target sentence contains them due to richer grammatical gender. We find that many name-gender co-occurrences in MT data are not resolvable with 'unambiguous gender' in the source language. We discuss potential steps toward gender-inclusive translation which accepts the ambiguity in both gender and translation.
arXiv Detail & Related papers (2023-06-07T16:21:59Z)
Gender Neutralization for an Inclusive Machine Translation: from Theoretical Foundations to Open Challenges [11.37307883423629]
We explore gender-neutral translation (GNT) as a form of gender inclusivity and a goal to be achieved by machine translation (MT) models. Specifically, we focus on translation from English into Italian, a language pair representative of salient gender-related linguistic transfer problems.
arXiv Detail & Related papers (2023-01-24T15:26:36Z)
Analyzing Gender Representation in Multilingual Models [59.21915055702203]
We focus on the representation of gender distinctions as a practical case study. We examine the extent to which the gender concept is encoded in shared subspaces across different languages.
arXiv Detail & Related papers (2022-04-20T00:13:01Z)
Under the Morphosyntactic Lens: A Multifaceted Evaluation of Gender Bias in Speech Translation [20.39599469927542]
Gender bias is largely recognized as a problematic phenomenon affecting language technologies. Most current evaluation practices adopt a word-level focus on a narrow set of occupational nouns under synthetic conditions. Such protocols overlook key features of grammatical gender languages, which are characterized by morphosyntactic chains of gender agreement.
arXiv Detail & Related papers (2022-03-18T11:14:16Z)
They, Them, Theirs: Rewriting with Gender-Neutral English [56.14842450974887]
We perform a case study on the singular they, a common way to promote gender inclusion in English. We show how a model can be trained to produce gender-neutral English with 1% word error rate with no human-labeled data.
arXiv Detail & Related papers (2021-02-12T21:47:48Z)
Neural Machine Translation Doesn't Translate Gender Coreference Right Unless You Make It [18.148675498274866]
We propose schemes for incorporating explicit word-level gender inflection tags into Neural Machine Translation. We find that simple existing approaches can over-generalize a gender-feature to multiple entities in a sentence. We also propose an extension to assess translations of gender-neutral entities from English given a corresponding linguistic convention.
arXiv Detail & Related papers (2020-10-11T20:05:42Z)
Type B Reflexivization as an Unambiguous Testbed for Multilingual Multi-Task Gender Bias [5.239305978984572]
We show that for languages with type B reflexivization, we can construct multi-task challenge datasets for detecting gender bias. In these languages, the direct translation of 'the doctor removed his mask' is not ambiguous between a coreferential reading and a disjoint reading. We present a multilingual, multi-task challenge dataset, which spans four languages and four NLP tasks.
arXiv Detail & Related papers (2020-09-24T23:47:18Z)
Multi-Dimensional Gender Bias Classification [67.65551687580552]
Machine learning models can inadvertently learn socially undesirable patterns when training on gender biased text. We propose a general framework that decomposes gender bias in text along several pragmatic and semantic dimensions. Using this fine-grained framework, we automatically annotate eight large scale datasets with gender information.
arXiv Detail & Related papers (2020-05-01T21:23:20Z)

This list is automatically generated from the titles and abstracts of the papers on this site.