A Survey on Zero Pronoun Translation
- URL: http://arxiv.org/abs/2305.10196v1
- Date: Wed, 17 May 2023 13:19:01 GMT
- Title: A Survey on Zero Pronoun Translation
- Authors: Longyue Wang, Siyou Liu, Mingzhou Xu, Linfeng Song, Shuming Shi,
Zhaopeng Tu
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Zero pronouns (ZPs) are frequently omitted in pro-drop languages (e.g.
Chinese, Hungarian, and Hindi), but should be recalled in non-pro-drop
languages (e.g. English). This phenomenon has been studied extensively in
machine translation (MT), as it poses a significant challenge for MT systems
due to the difficulty in determining the correct antecedent for the pronoun.
This survey paper highlights the major works that have been undertaken in zero
pronoun translation (ZPT) after the neural revolution, so that researchers can
recognise the current state and future directions of this field. We provide an
organisation of the literature based on evolution, dataset, method and
evaluation. In addition, we compare and analyze competing models and evaluation
metrics on different benchmarks. We uncover a number of insightful findings,
such as: 1) ZPT is in line with the development trend of large language
models; 2) data limitations cause learning bias across languages and domains;
3) performance improvements are often reported on single benchmarks, but
advanced methods are still far from real-world use; 4) general-purpose metrics
are not reliable on the nuances and complexities of ZPT, emphasizing the
necessity of targeted metrics; 5) apart from commonly-cited errors, ZPs can
introduce risks of gender bias.
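To make the core phenomenon concrete, below is a minimal sketch with invented sentences (not drawn from the surveyed datasets): the pro-drop Chinese source omits its subject pronoun, and the English translation must recover it from the preceding context.

```python
# Hypothetical zero-pronoun example (our illustration, not survey data).
# The Chinese source drops the subject pronoun (marked φ); the English
# translation must recover it from the antecedent in the prior sentence.
example = {
    "context_zh": "蛋糕很好吃。",           # "The cake is delicious."
    "source_zh": "（φ）是妈妈做的。",        # subject pronoun omitted
    "reference_en": "It was made by Mom.",  # "It" must be generated
}

# A sentence-level MT system sees only source_zh, so it cannot choose
# among "it", "he", and "she"; ZPT methods condition on context_zh to
# resolve the antecedent ("the cake" -> "it").
print(example["reference_en"])
```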
Related papers
- Investigating Markers and Drivers of Gender Bias in Machine Translations (2024-03-18)
Implicit gender bias in large language models (LLMs) is a well-documented problem.
We use the DeepL translation API to investigate the bias evinced when repeatedly translating a set of 56 Software Engineering tasks.
We find that some languages display similar patterns of pronoun use, falling into three loose groups.
We identify the main verb appearing in a sentence as a likely significant driver of implied gender in the translations.
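As a toy companion to this finding, the sketch below pulls out a sentence's main (root) verb with spaCy; the model name and helper function are our assumptions for illustration, not the authors' pipeline (requires `pip install spacy` and the `en_core_web_sm` model).

```python
# Toy extraction of the "main verb" feature the study identifies as a
# likely driver of implied gender (our illustration, not the authors' code).
import spacy

nlp = spacy.load("en_core_web_sm")  # assumes the model is downloaded

def main_verb(sentence: str):
    """Return the lemma of the root verb, or None if the root is not a verb."""
    doc = nlp(sentence)
    roots = [t for t in doc if t.dep_ == "ROOT" and t.pos_ == "VERB"]
    return roots[0].lemma_ if roots else None

print(main_verb("Design the database schema for the new service."))  # design
```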
- Salute the Classic: Revisiting Challenges of Machine Translation in the Age of Large Language Models (2024-01-16)
The evolution of Neural Machine Translation has been influenced by six core challenges.
These challenges include domain mismatch, amount of parallel data, rare word prediction, translation of long sentences, attention model as word alignment, and sub-optimal beam search.
This study revisits these challenges, offering insights into their ongoing relevance in the context of advanced Large Language Models.
- BLEURT Has Universal Translations: An Analysis of Automatic Metrics by Minimum Risk Training (2023-07-06)
In this study, we analyze various mainstream and cutting-edge automatic metrics from the perspective of their guidance for training machine translation systems.
We find that certain metrics exhibit robustness defects, such as the presence of universal adversarial translations in BLEURT and BARTScore.
In-depth analysis suggests two main causes of these robustness deficits: distribution biases in the training datasets, and the tendency of the metric paradigm.
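Loosely in the spirit of this robustness analysis (and of finding 4 in the abstract above), the toy probe below measures how little a surface-overlap metric penalizes a single dropped pronoun; it uses sacrebleu's sentence-level BLEU rather than the paper's minimum-risk-training setup, and the sentences are invented.

```python
# Toy probe: how much does sentence BLEU penalize one dropped pronoun?
# Requires `pip install sacrebleu`; sentences are invented examples.
import sacrebleu

reference = "It was made by Mom and everyone at the party enjoyed it ."
hyp_full = "It was made by Mom and everyone at the party enjoyed it ."
hyp_zp = "was made by Mom and everyone at the party enjoyed it ."  # ZP error

for name, hyp in [("full", hyp_full), ("dropped pronoun", hyp_zp)]:
    score = sacrebleu.sentence_bleu(hyp, [reference]).score
    print(f"{name:>15}: BLEU = {score:.1f}")

# Every n-gram in the ZP hypothesis also appears in the reference, so its
# precisions stay at 1.0 and the score drops only via the brevity penalty,
# despite the output being ungrammatical -- the kind of blind spot that
# motivates targeted ZP metrics.
```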
- MuLER: Detailed and Scalable Reference-based Evaluation (2023-05-24)
We propose a novel methodology that transforms any reference-based evaluation metric for text generation into a fine-grained analysis tool.
Given a system and a metric, MuLER quantifies how much the chosen metric penalizes specific error types.
We perform experiments in both synthetic and naturalistic settings to support MuLER's validity and showcase its usability.
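The rough sketch below imitates the perturbation idea behind this kind of fine-grained analysis: inject one error type at a time and record how much a metric drops. The corruption set and the unigram-F1 "metric" are our simplifications, not MuLER's actual method.

```python
# Rough sketch of per-error-type metric penalties (not MuLER's method).
from collections import Counter

def unigram_f1(hyp: str, ref: str) -> float:
    """Simple stand-in metric: F1 over whitespace-token multisets."""
    h, r = Counter(hyp.split()), Counter(ref.split())
    overlap = sum((h & r).values())
    if overlap == 0:
        return 0.0
    p, rec = overlap / sum(h.values()), overlap / sum(r.values())
    return 2 * p * rec / (p + rec)

reference = "She said that it was made by her mother ."
corruptions = {  # invented error types for illustration
    "dropped pronouns": "said that was made by her mother .",
    "wrong gender": "He said that it was made by his mother .",
    "dropped noun": "She said that it was made by her .",
}

for error_type, hyp in corruptions.items():
    penalty = 1.0 - unigram_f1(hyp, reference)
    print(f"{error_type:>16}: penalty = {penalty:.3f}")
```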
- Does Manipulating Tokenization Aid Cross-Lingual Transfer? A Study on POS Tagging for Non-Standardized Languages (2023-04-20)
We finetune pretrained language models (PLMs) on seven languages from three different families.
We analyze their zero-shot performance on closely related, non-standardized varieties.
Overall, we find that the similarity between the percentage of words that get split into subwords in the source and target data is the strongest predictor for model performance on target data.
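A small sketch of that statistic, under our own assumptions (HuggingFace `transformers` installed; the multilingual BERT tokenizer as an arbitrary example rather than the paper's exact setup):

```python
# Fraction of words a pretrained tokenizer splits into multiple subwords --
# the statistic found most predictive when compared between source and
# target data. The model choice is an arbitrary multilingual example.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")

def split_rate(sentences):
    """Share of whitespace words tokenized into more than one subword."""
    words = [w for s in sentences for w in s.split()]
    split = sum(len(tok.tokenize(w)) > 1 for w in words)
    return split / max(len(words), 1)

# Compare split_rate(source_corpus) with split_rate(target_corpus):
# the closer the two rates, the better the predicted transfer.
print(split_rate(["das Wetter ist heute wirklich schön"]))
```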
- When Does Translation Require Context? A Data-driven, Multilingual Exploration (2021-09-15)
Proper handling of discourse significantly contributes to the quality of machine translation (MT).
Recent works in context-aware MT attempt to target a small set of discourse phenomena during evaluation.
We develop the Multilingual Discourse-Aware benchmark, a series of taggers that identify and evaluate model performance on discourse phenomena.
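A deliberately naive version of such a tagger is sketched below: it flags English sentences that open with an anaphoric pronoun, whose antecedent must lie in earlier sentences. This is our toy heuristic, far simpler than the benchmark's actual taggers.

```python
# Toy "does this sentence need context?" tagger (our heuristic, not the
# benchmark's): flag sentences opening with an anaphoric pronoun.
ANAPHORIC = {"it", "he", "she", "they", "this", "that"}

def needs_context(sentence: str) -> bool:
    tokens = sentence.strip().split()
    return bool(tokens) and tokens[0].lower() in ANAPHORIC

print(needs_context("It was made by Mom."))  # True
print(needs_context("Mom made the cake."))   # False
```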
- Curious Case of Language Generation Evaluation Metrics: A Cautionary Tale (2020-10-26)
A few popular metrics remain as the de facto metrics to evaluate tasks such as image captioning and machine translation.
This is partly due to ease of use, and partly because researchers expect to see them and know how to interpret them.
In this paper, we urge the community to consider more carefully how they automatically evaluate their models.
- A Brief Survey and Comparative Study of Recent Development of Pronoun Coreference Resolution (2020-09-27)
Pronoun Coreference Resolution (PCR) is the task of resolving pronominal expressions to all mentions they refer to.
As one important natural language understanding (NLU) component, pronoun resolution is crucial for many downstream tasks and still challenging for existing models.
We conduct extensive experiments to show that even though current models are achieving good performance on the standard evaluation set, they are still not ready to be used in real applications.
This list is automatically generated from the titles and abstracts of the papers on this site.