Human Evaluation of English--Irish Transformer-Based NMT
- URL: http://arxiv.org/abs/2403.02366v1
- Date: Mon, 4 Mar 2024 11:45:46 GMT
- Title: Human Evaluation of English--Irish Transformer-Based NMT
- Authors: Séamus Lankford, Haithem Afli and Andy Way
- Abstract summary: The best-performing Transformer system significantly reduces both accuracy and fluency errors when compared with an RNN-based model.
When benchmarked against Google Translate, our translation engines demonstrated significant improvements.
- Score: 2.648836772989769
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this study, a human evaluation is carried out on how hyperparameter
settings impact the quality of Transformer-based Neural Machine Translation
(NMT) for the low-resourced English--Irish pair. SentencePiece models using
both Byte Pair Encoding (BPE) and unigram approaches were appraised. Variations
in model architectures included modifying the number of layers, evaluating the
optimal number of heads for attention and testing various regularisation
techniques. The greatest performance improvement was recorded for a
Transformer-optimized model with a 16k BPE subword model. Compared with a
baseline Recurrent Neural Network (RNN) model, a Transformer-optimized model
demonstrated a BLEU score improvement of 7.8 points. When benchmarked against
Google Translate, our translation engines demonstrated significant
improvements. Furthermore, a quantitative fine-grained manual evaluation was
conducted which compared the performance of machine translation systems. Using
the Multidimensional Quality Metrics (MQM) error taxonomy, a human evaluation
of the error types generated by an RNN-based system and a Transformer-based
system was explored. Our findings show the best-performing Transformer system
significantly reduces both accuracy and fluency errors when compared with an
RNN-based model.
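As a minimal illustration of the subword step described in the abstract, the sketch below trains a 16k BPE SentencePiece model and segments a sentence with it. The corpus file name, character coverage and the example sentence are assumptions for illustration, not the authors' exact configuration; the study also appraised unigram SentencePiece models.

```python
import sentencepiece as spm

# Train a 16k BPE subword model on the English--Irish training text
# (file name and character coverage are illustrative assumptions).
spm.SentencePieceTrainer.train(
    input="train.en-ga.txt",      # hypothetical combined training corpus
    model_prefix="enga_bpe16k",
    vocab_size=16000,
    model_type="bpe",             # the study also appraised model_type="unigram"
    character_coverage=1.0,
)

# Segment sentences into subword units before feeding them to the NMT system.
sp = spm.SentencePieceProcessor(model_file="enga_bpe16k.model")
pieces = sp.encode("Tá an aimsir go maith inniu.", out_type=str)
print(pieces)
```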
Related papers
- Predictor-Corrector Enhanced Transformers with Exponential Moving Average Coefficient Learning [73.73967342609603]
We introduce a predictor-corrector learning framework to minimize truncation errors.
We also propose an exponential moving average-based coefficient learning method to strengthen our higher-order predictor.
Our model surpasses a robust 3.8B DeepNet by an average of 2.9 SacreBLEU, using only 1/3 of the parameters.
arXiv Detail & Related papers (2024-11-05T12:26:25Z) - Transformers for Low-Resource Languages: Is Féidir Linn! [2.648836772989769]
In general, neural translation models often underperform on language pairs with insufficient training data.
We demonstrate that choosing appropriate parameters leads to considerable performance improvements.
A Transformer-optimized model demonstrated a BLEU score improvement of 7.8 points when compared with a baseline RNN model.
arXiv Detail & Related papers (2024-03-04T12:29:59Z) - Enhancing Neural Machine Translation of Low-Resource Languages: Corpus
Development, Human Evaluation and Explainable AI Architectures [0.0]
The Transformer architecture stands out as the gold standard, especially for high-resource language pairs.
The scarcity of parallel datasets for low-resource languages can hinder machine translation development.
This thesis introduces adaptNMT and adaptMLLM, two open-source applications streamlined for the development, fine-tuning, and deployment of neural machine translation models.
arXiv Detail & Related papers (2024-03-03T18:08:30Z) - The Devil is in the Errors: Leveraging Large Language Models for
Fine-grained Machine Translation Evaluation [93.01964988474755]
AutoMQM is a prompting technique which asks large language models to identify and categorize errors in translations.
We study the impact of labeled data through in-context learning and finetuning.
We then evaluate AutoMQM with PaLM-2 models, and we find that it improves performance compared to just prompting for scores (an illustrative prompt is sketched after this list).
arXiv Detail & Related papers (2023-08-14T17:17:21Z) - Transformer-based approaches to Sentiment Detection [55.41644538483948]
We examined the performance of four different types of state-of-the-art transformer models for text classification.
The RoBERTa transformer model performs best on the test dataset with a score of 82.6% and is highly recommended for quality predictions.
arXiv Detail & Related papers (2023-03-13T17:12:03Z) - Minimum Bayes Risk Decoding with Neural Metrics of Translation Quality [16.838064121696274]
This work applies Minimum Bayes Risk decoding to optimize diverse automated metrics of translation quality.
Experiments show that combining a neural translation model with a neural reference-based metric, BLEURT, yields significant improvements in both automatic and human evaluations (a minimal sketch of MBR decoding appears after this list).
arXiv Detail & Related papers (2021-11-17T20:48:02Z) - Factorized Neural Transducer for Efficient Language Model Adaptation [51.81097243306204]
We propose a novel model, factorized neural Transducer, by factorizing the blank and vocabulary prediction.
It is expected that this factorization can transfer the improvement of the standalone language model to the Transducer for speech recognition.
We demonstrate that the proposed factorized neural Transducer yields 15% to 20% WER improvements when out-of-domain text data is used for language model adaptation.
arXiv Detail & Related papers (2021-09-27T15:04:00Z) - Non-Parametric Online Learning from Human Feedback for Neural Machine
Translation [54.96594148572804]
We study the problem of online learning with human feedback in human-in-the-loop machine translation.
Previous methods require online model updating or additional translation memory networks to achieve high-quality performance.
We propose a novel non-parametric online learning method without changing the model structure.
arXiv Detail & Related papers (2021-09-23T04:26:15Z) - Bayesian Transformer Language Models for Speech Recognition [59.235405107295655]
State-of-the-art neural language models (LMs) represented by Transformers are highly complex.
This paper proposes a full Bayesian learning framework for Transformer LM estimation.
arXiv Detail & Related papers (2021-02-09T10:55:27Z)
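As flagged above, here is a rough, hypothetical sketch of an AutoMQM-style prompt: the language model is asked to identify and categorize translation errors with MQM-style categories and severities, rather than to produce a single quality score. The category list, wording and helper name are illustrative assumptions, not the prompt used in that paper.

```python
# Hypothetical AutoMQM-style prompt: ask an LLM to list MQM errors in a
# translation instead of asking it for a single quality score.
MQM_CATEGORIES = ["accuracy/mistranslation", "accuracy/omission",
                  "fluency/grammar", "fluency/spelling", "style/awkward"]

def build_automqm_prompt(source: str, translation: str) -> str:
    categories = ", ".join(MQM_CATEGORIES)
    return (
        "Identify the errors in the following translation of the source text.\n"
        f"Label each error with one category ({categories}) and a severity "
        "(major or minor). If there are no errors, answer 'no-error'.\n\n"
        f"Source: {source}\n"
        f"Translation: {translation}\n"
        "Errors:"
    )

print(build_automqm_prompt("The weather is good today.",
                           "Tá an aimsir go dona inniu."))
```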
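Similarly, a minimal sketch of sampling-based Minimum Bayes Risk decoding, as summarized in the entry above: each candidate translation is scored by its average utility against the other candidates, which serve as pseudo-references, and the highest-scoring candidate is selected. The token-overlap utility below is a placeholder assumption; that paper pairs MBR with neural metrics such as BLEURT.

```python
from typing import Callable, List

def mbr_decode(candidates: List[str],
               utility: Callable[[str, str], float]) -> str:
    """Pick the candidate with the highest average utility when every other
    candidate is treated as a pseudo-reference (sampling-based MBR decoding)."""
    best, best_score = None, float("-inf")
    for hyp in candidates:
        score = sum(utility(hyp, ref) for ref in candidates if ref is not hyp)
        score /= max(len(candidates) - 1, 1)
        if score > best_score:
            best, best_score = hyp, score
    return best

# Placeholder utility: token overlap. A neural metric such as BLEURT would be
# plugged in here in practice (assumption for illustration).
def overlap(hyp: str, ref: str) -> float:
    h, r = set(hyp.split()), set(ref.split())
    return len(h & r) / max(len(h | r), 1)

samples = ["tá an aimsir go maith", "tá an aimsir maith", "an aimsir go maith"]
print(mbr_decode(samples, overlap))
```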