Mention Attention for Pronoun Translation
- URL: http://arxiv.org/abs/2412.14829v1
- Date: Thu, 19 Dec 2024 13:19:19 GMT
- Title: Mention Attention for Pronoun Translation
- Authors: Gongbo Tang, Christian Hardmeier
- Abstract summary: We introduce an additional mention attention module in the decoder that pays extra attention to source mentions rather than non-mention tokens.
Our mention attention module not only extracts features from source mentions but also considers target-side context, which benefits pronoun translation.
We conduct experiments on the WMT17 English-German translation task, and evaluate our models on general translation and pronoun translation.
- Score: 5.896961355859321
- Abstract: Most pronouns are referring expressions: computers need to resolve what the pronouns refer to, and pronoun usage diverges across languages. Dealing with these divergences and translating pronouns is therefore a challenge in machine translation. Mentions are candidate referents of pronouns and are more closely related to pronouns than general tokens are. We assume that extracting additional mention features can help pronoun translation. Therefore, we introduce an additional mention attention module in the decoder that pays extra attention to source mentions rather than non-mention tokens. Our mention attention module not only extracts features from source mentions but also considers target-side context, which benefits pronoun translation. In addition, we introduce two mention classifiers to train models to recognize mentions, whose outputs guide the mention attention. We conduct experiments on the WMT17 English-German translation task and evaluate our models on general translation and pronoun translation, using BLEU, APT, and contrastive evaluation metrics. Our proposed model outperforms the baseline Transformer in terms of APT and BLEU scores. This confirms our hypothesis that paying additional attention to source mentions improves pronoun translation, and shows that the added modules do not hurt general translation quality.
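To illustrate the idea, below is a minimal PyTorch sketch of a decoder-side attention module that restricts attention to source tokens a classifier predicts to be mentions. It is not the authors' implementation: all names (`MentionAttention`, `mention_classifier`, the tensor shapes) are assumptions, and for brevity it uses a single mention classifier rather than the two described in the abstract.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MentionAttention(nn.Module):
    """Cross-attention over source states, masked to predicted mentions only (sketch)."""

    def __init__(self, d_model: int):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_model)  # queries from target-side decoder states
        self.k_proj = nn.Linear(d_model, d_model)  # keys from source encoder states
        self.v_proj = nn.Linear(d_model, d_model)
        self.mention_classifier = nn.Linear(d_model, 2)  # mention vs. non-mention

    def forward(self, tgt_states: torch.Tensor, src_states: torch.Tensor):
        # tgt_states: (batch, tgt_len, d_model); src_states: (batch, src_len, d_model)
        q = self.q_proj(tgt_states)
        k = self.k_proj(src_states)
        v = self.v_proj(src_states)

        # Classify each source token; the predictions decide where attention may look.
        mention_logits = self.mention_classifier(src_states)  # (batch, src_len, 2)
        is_mention = mention_logits.argmax(dim=-1).bool()     # (batch, src_len)

        scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)  # (batch, tgt_len, src_len)
        # Mask out non-mention tokens so this module attends to mentions only.
        scores = scores.masked_fill(~is_mention.unsqueeze(1), float("-inf"))
        attn = F.softmax(scores, dim=-1)
        # If no source token is predicted to be a mention, the fully masked rows
        # produce NaNs after softmax; zero them out as a simple guard.
        attn = torch.nan_to_num(attn)

        # Mention features for the decoder, plus logits for an auxiliary
        # mention-classification loss (the "guidance" described in the abstract).
        return attn @ v, mention_logits
```

In a full model, the returned mention features would be combined with the standard decoder output, while the logits support an auxiliary mention-classification loss so the model learns to recognize mentions, matching the abstract's description of classifier outputs guiding the mention attention.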
Related papers
- Mitigating Bias in Queer Representation within Large Language Models: A Collaborative Agent Approach [0.0]
Large Language Models (LLMs) often perpetuate biases in pronoun usage, leading to misrepresentation or exclusion of queer individuals.
This paper addresses the specific problem of biased pronoun usage in LLM outputs, particularly the inappropriate use of traditionally gendered pronouns.
We introduce a collaborative agent pipeline designed to mitigate these biases by analyzing and optimizing pronoun usage for inclusivity.
arXiv Detail & Related papers (2024-11-12T09:14:16Z)
- A Survey on Zero Pronoun Translation [69.09774294082965]
Zero pronouns (ZPs) are frequently omitted in pro-drop languages, but should be recalled in non-pro-drop languages.
This survey paper highlights the major works that have been undertaken in zero pronoun translation (ZPT) after the neural revolution.
We uncover a number of insightful findings, such as: 1) ZPT is in line with the development trend of large language models; 2) data limitations cause learning biases across languages and domains; 3) performance improvements are often reported on single benchmarks, but advanced methods are still far from real-world use.
arXiv Detail & Related papers (2023-05-17T13:19:01Z)
- Shapley Head Pruning: Identifying and Removing Interference in Multilingual Transformers [54.4919139401528]
We show that it is possible to reduce interference by identifying and pruning language-specific parameters.
We show that removing identified attention heads from a fixed model improves performance for a target language on both sentence classification and structural prediction.
arXiv Detail & Related papers (2022-10-11T18:11:37Z)
- Can Transformer be Too Compositional? Analysing Idiom Processing in Neural Machine Translation [55.52888815590317]
Unlike literal expressions, idioms' meanings do not directly follow from their parts.
NMT models are often unable to translate idioms accurately and over-generate compositional, literal translations.
We investigate whether the non-compositionality of idioms is reflected in the mechanics of the dominant NMT model, Transformer.
arXiv Detail & Related papers (2022-05-30T17:59:32Z)
- ChrEnTranslate: Cherokee-English Machine Translation Demo with Quality Estimation and Corrective Feedback [70.5469946314539]
ChrEnTranslate is an online machine translation demonstration system for translation between English and the endangered language Cherokee.
It supports both statistical and neural translation models and provides quality estimation to inform users of reliability.
arXiv Detail & Related papers (2021-07-30T17:58:54Z)
- Zero-pronoun Data Augmentation for Japanese-to-English Translation [15.716533830931764]
We propose a data augmentation method that provides additional training signals for the translation model to learn correlations between local context and zero pronouns.
Machine translation experiments in the conversational domain show that the proposed method significantly improves the accuracy of zero pronoun translation.
arXiv Detail & Related papers (2021-07-01T09:17:59Z)
- Do Context-Aware Translation Models Pay the Right Attention? [61.25804242929533]
Context-aware machine translation models are designed to leverage contextual information, but often fail to do so.
In this paper, we ask several questions: What contexts do human translators use to resolve ambiguous words?
We introduce SCAT (Supporting Context for Ambiguous Translations), a new English-French dataset comprising supporting context words for 14K translations.
Using SCAT, we perform an in-depth analysis of the context used to disambiguate, examining positional and lexical characteristics of the supporting words.
arXiv Detail & Related papers (2021-05-14T17:32:24Z)
- Repairing Pronouns in Translation with BERT-Based Post-Editing [7.6344611819427035]
We show that in some domains, pronoun choice can account for more than half of an NMT system's errors.
We then investigate a possible solution: fine-tuning BERT on a pronoun prediction task using chunks of source-side sentences.
arXiv Detail & Related papers (2021-03-23T21:01:03Z)
- Transformer-GCRF: Recovering Chinese Dropped Pronouns with General Conditional Random Fields [54.03719496661691]
We present a novel framework that combines the strength of Transformer network with General Conditional Random Fields (GCRF) to model the dependencies between pronouns in neighboring utterances.
Results on three Chinese conversation datasets show that the Transformer-GCRF model outperforms the state-of-the-art dropped pronoun recovery models.
arXiv Detail & Related papers (2020-10-07T07:06:09Z)
- Scalable Cross Lingual Pivots to Model Pronoun Gender for Translation [4.775445987662862]
Machine translation systems with inadequate document understanding can make errors when translating dropped or neutral pronouns into languages with gendered pronouns.
We propose a novel cross-lingual pivoting technique for automatically producing high-quality gender labels.
arXiv Detail & Related papers (2020-06-16T02:41:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.