Bring More Attention to Syntactic Symmetry for Automatic Postediting of High-Quality Machine Translations
- URL: http://arxiv.org/abs/2305.10557v2
- Date: Sat, 17 Jun 2023 09:58:02 GMT
- Title: Bring More Attention to Syntactic Symmetry for Automatic Postediting of High-Quality Machine Translations
- Authors: Baikjin Jung, Myungji Lee, Jong-Hyeok Lee, Yunsu Kim
- Abstract summary: We propose a linguistically motivated method of regularization that is expected to enhance APE models' understanding of the target language.
Our analysis of experimental results demonstrates that the proposed method helps improve the state-of-the-art architecture's APE quality for high-quality MTs.
- Score: 4.217162744375792
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Automatic postediting (APE) is an automated process to refine a given machine translation (MT). Recent findings show that existing APE systems are not good at handling high-quality MTs even for a language pair with abundant data resources such as English-to-German: the better the given MT is, the harder it is to decide what parts to edit and how to fix these errors. One possible solution to this problem is to instill deeper knowledge about the target language into the model. Thus, we propose a linguistically motivated method of regularization that is expected to enhance APE models' understanding of the target language: a loss function that encourages symmetric self-attention on the given MT. Our analysis of experimental results demonstrates that the proposed method helps improve the state-of-the-art architecture's APE quality for high-quality MTs.
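The abstract names the method only at a high level: a loss term that pushes self-attention over the given MT toward symmetry. As a concrete illustration, here is a minimal PyTorch sketch of one plausible form of such a regularizer, penalizing the squared Frobenius norm of A - A^T for each attention matrix; the function name, the weight `lam`, and the placement of the penalty are illustrative assumptions, not details taken from the paper.

```python
import torch

def symmetry_loss(attn: torch.Tensor) -> torch.Tensor:
    """Penalty that encourages symmetric self-attention weights.

    attn: softmax-normalized attention weights of shape
          (batch, heads, seq_len, seq_len) over the MT tokens.
    Returns the squared Frobenius norm of (A - A^T), averaged
    over the batch and heads.
    """
    asym = attn - attn.transpose(-1, -2)          # deviation from symmetry
    return asym.pow(2).sum(dim=(-1, -2)).mean()   # ||A - A^T||_F^2

# Hypothetical training objective: the usual cross-entropy loss plus
# the regularizer with a small illustrative weight `lam`.
# loss = ce_loss + lam * symmetry_loss(mt_self_attn_weights)
```

Which layers and heads are actually regularized, and how the term is weighted against the main objective, are details the abstract leaves open.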
Related papers
- Understanding In-Context Machine Translation for Low-Resource Languages: A Case Study on Manchu [53.437954702561065]
In-context machine translation (MT) with large language models (LLMs) is a promising approach for low-resource MT.
This study systematically investigates how each resource and its quality affects translation performance, using the Manchu language as a case study.
Our results indicate that high-quality dictionaries and good parallel examples are very helpful, while grammars hardly help.
arXiv Detail & Related papers (2025-02-17T14:53:49Z)
- MQM-APE: Toward High-Quality Error Annotation Predictors with Automatic Post-Editing in LLM Translation Evaluators [53.91199933655421]
Large Language Models (LLMs) have shown significant potential as judges for Machine Translation (MT) quality assessment.
We introduce MQM-APE, a universal and training-free framework based on the idea of filtering out non-impactful errors.
Experiments show that our approach consistently improves both the reliability and quality of error spans against GEMBA-MQM.
arXiv Detail & Related papers (2024-09-22T06:43:40Z)
- Guiding Large Language Models to Post-Edit Machine Translation with Error Annotations [14.149224539732913]
Machine translation remains one of the last NLP tasks where large language models (LLMs) have not yet replaced dedicated supervised systems.
This work exploits the complementary strengths of LLMs and supervised MT by guiding LLMs to automatically post-edit MT with external feedback on its quality.
In experiments on Chinese-English, English-German, and English-Russian MQM data, we demonstrate that prompting LLMs to post-edit MT improves TER, BLEU, and COMET scores.
Fine-tuning helps integrate fine-grained feedback more effectively and further improves translation quality according to both automatic and human evaluation.
arXiv Detail & Related papers (2024-04-11T15:47:10Z)
- The Devil is in the Errors: Leveraging Large Language Models for Fine-grained Machine Translation Evaluation [93.01964988474755]
AutoMQM is a prompting technique that asks large language models to identify and categorize errors in translations.
We study the impact of labeled data through in-context learning and finetuning.
We then evaluate AutoMQM with PaLM-2 models and find that it improves performance compared to just prompting for scores.
arXiv Detail & Related papers (2023-08-14T17:17:21Z)
- Revisiting Machine Translation for Cross-lingual Classification [91.43729067874503]
Most research in the area focuses on multilingual models rather than the machine translation component.
We show that, by using a stronger MT system and mitigating the mismatch between training on original text and running inference on machine-translated text, translate-test can do substantially better than previously assumed.
arXiv Detail & Related papers (2023-05-23T16:56:10Z)
- Perturbation-based QE: An Explainable, Unsupervised Word-level Quality Estimation Method for Blackbox Machine Translation [12.376309678270275]
Perturbation-based QE works simply by analyzing MT system output on perturbed input source sentences (a toy sketch of this idea follows the list below).
Our approach is better at detecting gender bias and word-sense-disambiguation errors in translation than supervised QE.
arXiv Detail & Related papers (2023-05-12T13:10:57Z)
- Evaluating and Improving the Coreference Capabilities of Machine Translation Models [30.60934078720647]
Machine translation requires a wide range of linguistic capabilities, which current end-to-end models are expected to learn implicitly by observing aligned sentences in bilingual corpora.
arXiv Detail & Related papers (2023-02-16T18:16:09Z)
- An Empirical Study of Automatic Post-Editing [56.86393786396992]
APE aims to reduce manual post-editing effort by automatically correcting errors in machine-translated output.
To alleviate the lack of genuine training data, most current APE systems employ data augmentation methods to generate large-scale artificial corpora.
We study the outputs of a state-of-the-art APE model on a difficult APE dataset to analyze the problems in existing APE systems.
arXiv Detail & Related papers (2022-09-16T07:38:27Z)
- When Does Translation Require Context? A Data-driven, Multilingual Exploration [71.43817945875433]
Proper handling of discourse significantly contributes to the quality of machine translation (MT).
Recent works in context-aware MT attempt to target a small set of discourse phenomena during evaluation.
We develop the Multilingual Discourse-Aware benchmark, a series of taggers that identify and evaluate model performance on discourse phenomena.
arXiv Detail & Related papers (2021-09-15T17:29:30Z)
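As a companion to the Perturbation-based QE entry above, the following is a toy sketch of the general idea, assuming a black-box `translate` function and treating the fraction of perturbed translations in which a target word survives as its stability score; the function names and the masking scheme are illustrative assumptions, not details of that paper.

```python
import random

def perturbation_qe(src_tokens, translate, n_perturb=8, mask="<unk>"):
    """Unsupervised word-level QE for a black-box MT system.

    Translates several perturbed copies of the source (one token
    masked at a time) and scores each word of the original hypothesis
    by how often it survives the perturbations.
    """
    base_hyp = translate(" ".join(src_tokens))
    counts = {w: 0 for w in base_hyp.split()}
    for _ in range(n_perturb):
        i = random.randrange(len(src_tokens))
        perturbed = src_tokens[:i] + [mask] + src_tokens[i + 1:]
        hyp_words = set(translate(" ".join(perturbed)).split())
        for w in counts:
            if w in hyp_words:
                counts[w] += 1
    # Stability in [0, 1]; unstable words are candidate translation errors.
    return {w: c / n_perturb for w, c in counts.items()}
```

Words whose stability stays low across perturbations are flagged as likely errors; the cited paper builds its gender-bias and word-sense-disambiguation analyses on this kind of signal.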
This list is automatically generated from the titles and abstracts of the papers on this site.