Fine-grained Gender Control in Machine Translation with Large Language Models
- URL: http://arxiv.org/abs/2407.15154v1
- Date: Sun, 21 Jul 2024 13:15:00 GMT
- Title: Fine-grained Gender Control in Machine Translation with Large Language Models
- Authors: Minwoo Lee, Hyukhun Koh, Minsung Kim, Kyomin Jung,
- Abstract summary: We tackle controlled translation in a more realistic setting of inputs with multiple entities.
Our proposed method instructs the model with fine-grained entity-level gender information to translate with correct gender inflections.
We discover an emergence of gender interference phenomenon when controlling the gender of multiple entities.
- Score: 15.63784352130237
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In machine translation, the problem of ambiguously gendered input has been pointed out, where the gender of an entity is not available in the source sentence. To address this ambiguity issue, the task of controlled translation that takes the gender of the ambiguous entity as additional input have been proposed. However, most existing works have only considered a simplified setup of one target gender for input. In this paper, we tackle controlled translation in a more realistic setting of inputs with multiple entities and propose Gender-of-Entity (GoE) prompting method for LLMs. Our proposed method instructs the model with fine-grained entity-level gender information to translate with correct gender inflections. By utilizing four evaluation benchmarks, we investigate the controlled translation capability of LLMs in multiple dimensions and find that LLMs reach state-of-the-art performance in controlled translation. Furthermore, we discover an emergence of gender interference phenomenon when controlling the gender of multiple entities. Finally, we address the limitations of existing gender accuracy evaluation metrics and propose leveraging LLMs as an evaluator for gender inflection in machine translation.
Related papers
- The Lou Dataset -- Exploring the Impact of Gender-Fair Language in German Text Classification [57.06913662622832]
Gender-fair language fosters inclusion by addressing all genders or using neutral forms.
Gender-fair language substantially impacts predictions by flipping labels, reducing certainty, and altering attention patterns.
While we offer initial insights on the effect on German text classification, the findings likely apply to other languages.
arXiv Detail & Related papers (2024-09-26T15:08:17Z) - GenderCARE: A Comprehensive Framework for Assessing and Reducing Gender Bias in Large Language Models [73.23743278545321]
Large language models (LLMs) have exhibited remarkable capabilities in natural language generation, but have also been observed to magnify societal biases.
GenderCARE is a comprehensive framework that encompasses innovative Criteria, bias Assessment, Reduction techniques, and Evaluation metrics.
arXiv Detail & Related papers (2024-08-22T15:35:46Z) - Beyond Binary Gender: Evaluating Gender-Inclusive Machine Translation with Ambiguous Attitude Words [85.48043537327258]
Existing machine translation gender bias evaluations are primarily focused on male and female genders.
This study presents a benchmark AmbGIMT (Gender-Inclusive Machine Translation with Ambiguous attitude words)
We propose a novel process to evaluate gender bias based on the Emotional Attitude Score (EAS), which is used to quantify ambiguous attitude words.
arXiv Detail & Related papers (2024-07-23T08:13:51Z) - Probing Explicit and Implicit Gender Bias through LLM Conditional Text
Generation [64.79319733514266]
Large Language Models (LLMs) can generate biased and toxic responses.
We propose a conditional text generation mechanism without the need for predefined gender phrases and stereotypes.
arXiv Detail & Related papers (2023-11-01T05:31:46Z) - Towards Effective Disambiguation for Machine Translation with Large
Language Models [65.80775710657672]
We study the capabilities of large language models to translate "ambiguous sentences"
Experiments show that our methods can match or outperform state-of-the-art systems such as DeepL and NLLB in four out of five language directions.
arXiv Detail & Related papers (2023-09-20T22:22:52Z) - Gender-specific Machine Translation with Large Language Models [39.49764957694078]
Decoder-only Large Language Models (LLMs) have demonstrated potential in machine translation.
LLMs offer the ability to control the properties of the output through prompts.
Our results indicate that LLaMa can generate gender-specific translations with translation accuracy and gender bias comparable to state-of-the-art multilingual NMT system.
arXiv Detail & Related papers (2023-09-06T17:24:06Z) - The Gender-GAP Pipeline: A Gender-Aware Polyglot Pipeline for Gender
Characterisation in 55 Languages [51.2321117760104]
This paper describes the Gender-GAP Pipeline, an automatic pipeline to characterize gender representation in large-scale datasets for 55 languages.
The pipeline uses a multilingual lexicon of gendered person-nouns to quantify the gender representation in text.
We showcase it to report gender representation in WMT training data and development data for the News task, confirming that current data is skewed towards masculine representation.
arXiv Detail & Related papers (2023-08-31T17:20:50Z) - Target-Agnostic Gender-Aware Contrastive Learning for Mitigating Bias in
Multilingual Machine Translation [28.471506840241602]
Gender bias is a significant issue in machine translation, leading to ongoing research efforts in developing bias mitigation techniques.
We propose a bias mitigation method based on a novel approach.
Gender-Aware Contrastive Learning, GACL, encodes contextual gender information into the representations of non-explicit gender words.
arXiv Detail & Related papers (2023-05-23T12:53:39Z) - Improving Gender Translation Accuracy with Filtered Self-Training [14.938401898546548]
Machine translation systems often output incorrect gender, even when the gender is clear from context.
We propose a gender-filtered self-training technique to improve gender translation accuracy on unambiguously gendered inputs.
arXiv Detail & Related papers (2021-04-15T18:05:29Z) - Neural Machine Translation Doesn't Translate Gender Coreference Right
Unless You Make It [18.148675498274866]
We propose schemes for incorporating explicit word-level gender inflection tags into Neural Machine Translation.
We find that simple existing approaches can over-generalize a gender-feature to multiple entities in a sentence.
We also propose an extension to assess translations of gender-neutral entities from English given a corresponding linguistic convention.
arXiv Detail & Related papers (2020-10-11T20:05:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.