Measuring Sentiment Bias in Machine Translation
- URL: http://arxiv.org/abs/2306.07152v1
- Date: Mon, 12 Jun 2023 14:40:29 GMT
- Title: Measuring Sentiment Bias in Machine Translation
- Authors: Kai Hartung, Aaricia Herygers, Shubham Kurlekar, Khabbab Zakaria,
Taylan Volkan, Sören Gröttrup, Munir Georges
- Abstract summary: Biases induced in text by generative models have become an increasingly prominent topic in recent years.
We compare three open-access machine translation models for five different languages on two parallel corpora.
Though our statistical tests indicate shifts in the label probability distributions, we find none consistent enough to assume a bias induced by the translation process.
- Score: 1.567333808864147
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Biases induced in text by generative models have become an increasingly
prominent topic in recent years. In this paper we explore how machine translation
might introduce a bias in sentiments as classified by sentiment analysis models.
For this, we compare three open-access machine translation models for five
different languages on two parallel corpora to test whether the translation
process causes a shift in the sentiment classes recognized in the texts. Though
our statistical tests indicate shifts in the label probability distributions, we
find none that appears consistent enough to assume a bias induced by the
translation process.
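The shift test described in the abstract can be sketched as a Pearson chi-square test over the sentiment label counts predicted on the source sentences versus their translations. This is a minimal illustration with hypothetical label counts, not the authors' exact procedure:

```python
from collections import Counter

def chi2_statistic(labels_a, labels_b):
    """Pearson chi-square statistic for the 2 x K contingency table
    built from two lists of predicted sentiment labels."""
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    categories = sorted(set(counts_a) | set(counts_b))
    n_a, n_b = len(labels_a), len(labels_b)
    n = n_a + n_b
    chi2 = 0.0
    for c in categories:
        col_total = counts_a[c] + counts_b[c]
        for observed, row_total in ((counts_a[c], n_a), (counts_b[c], n_b)):
            expected = row_total * col_total / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2

# Hypothetical sentiment labels predicted on source vs. translated sentences.
source     = ["pos"] * 50 + ["neu"] * 30 + ["neg"] * 20
translated = ["pos"] * 40 + ["neu"] * 35 + ["neg"] * 25
print(round(chi2_statistic(source, translated), 3))  # → 2.051
```

A large statistic relative to the chi-square distribution with K-1 degrees of freedom would indicate a distribution shift; the paper's finding is that such shifts, where present, are not consistent across models and languages.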
Related papers
- Investigating Markers and Drivers of Gender Bias in Machine Translations [0.0]
Implicit gender bias in large language models (LLMs) is a well-documented problem.
We use the DeepL translation API to investigate the bias evinced when repeatedly translating a set of 56 Software Engineering tasks.
We find that some languages display similar patterns of pronoun use, falling into three loose groups.
We identify the main verb appearing in a sentence as a likely significant driver of implied gender in the translations.
arXiv Detail & Related papers (2024-03-18T15:54:46Z)
- Crossing the Threshold: Idiomatic Machine Translation through Retrieval Augmentation and Loss Weighting [66.02718577386426]
We provide a simple characterization of idiomatic translation and related issues.
We conduct a synthetic experiment revealing a tipping point at which transformer-based machine translation models correctly default to idiomatic translations.
To improve translation of natural idioms, we introduce two straightforward yet effective techniques.
arXiv Detail & Related papers (2023-10-10T23:47:25Z)
- Lost In Translation: Generating Adversarial Examples Robust to Round-Trip Translation [66.33340583035374]
We present a comprehensive study on the robustness of current text adversarial attacks to round-trip translation.
We demonstrate that 6 state-of-the-art text-based adversarial attacks do not maintain their efficacy after round-trip translation.
We introduce an intervention-based solution to this problem, by integrating Machine Translation into the process of adversarial example generation.
arXiv Detail & Related papers (2023-07-24T04:29:43Z)
- On the Efficacy of Sampling Adapters [82.5941326570812]
We propose a unified framework for understanding sampling adapters.
We argue that the shift they enforce can be viewed as a trade-off between precision and recall.
We find that several precision-emphasizing measures indeed indicate that sampling adapters can lead to probability distributions more aligned with the true distribution.
arXiv Detail & Related papers (2023-07-07T17:59:12Z)
- Do GPTs Produce Less Literal Translations? [20.095646048167612]
Large Language Models (LLMs) have emerged as general-purpose language models capable of addressing many natural language generation or understanding tasks.
We find that translations out of English (E-X) from GPTs tend to be less literal, while exhibiting similar or better scores on Machine Translation quality metrics.
arXiv Detail & Related papers (2023-05-26T10:38:31Z)
- Extrinsic Evaluation of Machine Translation Metrics [78.75776477562087]
It is unclear if automatic metrics are reliable at distinguishing good translations from bad translations at the sentence level.
We evaluate the segment-level performance of the most widely used MT metrics (chrF, COMET, BERTScore, etc.) on three downstream cross-lingual tasks.
Our experiments demonstrate that all metrics exhibit negligible correlation with the extrinsic evaluation of the downstream outcomes.
arXiv Detail & Related papers (2022-12-20T14:39:58Z)
- How sensitive are translation systems to extra contexts? Mitigating gender bias in Neural Machine Translation models through relevant contexts [11.684346035745975]
A growing number of studies highlight the inherent gender bias that Neural Machine Translation models incorporate during training.
We investigate whether these models can be instructed to fix their bias during inference using targeted, guided instructions as contexts.
We observe large improvements in reducing the gender bias in translations, across three popular test suites.
arXiv Detail & Related papers (2022-05-22T06:31:54Z)
- Towards Debiasing Translation Artifacts [15.991970288297443]
We propose a novel approach to reducing translationese by extending an established bias-removal technique.
We use the Iterative Null-space Projection (INLP) algorithm, and show by measuring classification accuracy before and after debiasing, that translationese is reduced at both sentence and word level.
To the best of our knowledge, this is the first study to debias translationese as represented in latent embedding space.
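The core of the INLP algorithm mentioned above is repeatedly training a linear classifier for the protected property (here, translationese) and projecting the embeddings onto the classifier's null space, so the property is no longer linearly recoverable. A minimal sketch of the projection step, with hypothetical classifier directions rather than the paper's learned ones:

```python
import numpy as np

def nullspace_projection(W):
    """Projection matrix onto the null space of the rows of W,
    where each row is a learned linear classifier direction."""
    # Orthonormal basis of the row space via SVD, then P = I - B^T B.
    _, s, vt = np.linalg.svd(W, full_matrices=False)
    basis = vt[s > 1e-10]  # directions actually spanned by W
    return np.eye(W.shape[1]) - basis.T @ basis

# Hypothetical: two classifier directions in a 4-d embedding space.
W = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0]])
P = nullspace_projection(W)
x = np.array([3.0, -2.0, 1.0, 4.0])
print(P @ x)  # components along the removed directions are zeroed
```

In full INLP this is iterated: after each projection a new classifier is trained on the projected embeddings, and its direction is added to W, until the classifier can no longer beat chance.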
arXiv Detail & Related papers (2022-05-16T21:46:51Z)
- Curious Case of Language Generation Evaluation Metrics: A Cautionary Tale [52.663117551150954]
A few popular metrics remain the de facto standard for evaluating tasks such as image captioning and machine translation.
This is partly due to ease of use, and partly because researchers expect to see them and know how to interpret them.
In this paper, we urge the community for more careful consideration of how they automatically evaluate their models.
arXiv Detail & Related papers (2020-10-26T13:57:20Z)
- Translation Artifacts in Cross-lingual Transfer Learning [51.66536640084888]
We show that machine translation can introduce subtle artifacts that have a notable impact on existing cross-lingual models.
In natural language inference, translating the premise and the hypothesis independently can reduce the lexical overlap between them.
We also improve the state-of-the-art in XNLI for the translate-test and zero-shot approaches by 4.3 and 2.8 points, respectively.
arXiv Detail & Related papers (2020-04-09T17:54:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.