Evaluating Amharic Machine Translation
- URL: http://arxiv.org/abs/2003.14386v1
- Date: Tue, 31 Mar 2020 17:30:08 GMT
- Title: Evaluating Amharic Machine Translation
- Authors: Asmelash Teka Hadgu, Adam Beaudoin, Abel Aregawi
- Abstract summary: We develop and share a dataset to automatically evaluate the quality of machine translation systems for Amharic.
BLEU scores show that Amharic translation quality is promising but still low.
- Score: 0.4297070083645048
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine translation (MT) systems are now able to provide very accurate
results for high resource language pairs. However, for many low resource
languages, MT is still under active research. In this paper, we develop and
share a dataset to automatically evaluate the quality of MT systems for
Amharic. We compare two commercially available MT systems that support
translation of Amharic to and from English to assess the current state of MT
for Amharic. The BLEU scores show that Amharic translation quality is
promising but still low. We hope that this dataset will be
useful to the research community both in academia and industry as a benchmark
to evaluate Amharic MT systems.
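The evaluation described above rests on BLEU, which scores a hypothesis by its modified n-gram precision against a reference, scaled by a brevity penalty. As an illustration, here is a minimal smoothed BLEU sketch in Python; the toy sentences are hypothetical, and in practice a standard implementation such as sacreBLEU would be used:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(hypothesis, reference, max_n=4):
    """Minimal sentence-level BLEU: geometric mean of smoothed modified
    n-gram precisions (n = 1..max_n), times a brevity penalty."""
    hyp, ref = hypothesis.split(), reference.split()
    log_prec = 0.0
    for n in range(1, max_n + 1):
        hyp_ngrams, ref_ngrams = ngrams(hyp, n), ngrams(ref, n)
        overlap = sum((hyp_ngrams & ref_ngrams).values())  # clipped matches
        total = sum(hyp_ngrams.values())
        # add-one smoothing so a missing n-gram order does not zero the score
        log_prec += math.log((overlap + 1) / (max(total, 1) + 1))
    # brevity penalty: punish hypotheses shorter than the reference
    bp = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / max(len(hyp), 1))
    return bp * math.exp(log_prec / max_n)

print(bleu("the cat sat on the mat", "the cat sat on the mat"))  # 1.0
```

Corpus-level BLEU, as reported in the paper, aggregates n-gram counts over all sentences before taking precisions rather than averaging per-sentence scores.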
Related papers
- Evaluating Automatic Metrics with Incremental Machine Translation Systems [55.78547133890403]
We introduce a dataset comprising commercial machine translations, gathered weekly over six years across 12 translation directions.
We assume commercial systems improve over time, which enables us to evaluate machine translation (MT) metrics based on their preference for more recent translations.
Our study confirms several previous findings in MT metrics research and demonstrates the dataset's value as a testbed for metric evaluation.
arXiv Detail & Related papers (2024-07-03T17:04:17Z)
- EthioMT: Parallel Corpus for Low-resource Ethiopian Languages [49.80726355048843]
We introduce EthioMT -- a new parallel corpus for 15 languages.
We also create a new benchmark by collecting a dataset for better-researched languages in Ethiopia.
We evaluate the newly collected corpus and the benchmark dataset for 23 Ethiopian languages using transformer and fine-tuning approaches.
arXiv Detail & Related papers (2024-03-28T12:26:45Z)
- An approach for mistranslation removal from popular dataset for Indic MT Task [5.4755933832880865]
We propose an algorithm to remove mistranslations from the training corpus and evaluate its performance and efficiency.
Two Indic languages (ILs), namely, Hindi (HIN) and Odia (ODI) are chosen for the experiment.
The quality of the translations in the experiment is evaluated using standard metrics such as BLEU, METEOR, and RIBES.
arXiv Detail & Related papers (2024-01-12T06:37:19Z)
- Improving Machine Translation with Large Language Models: A Preliminary Study with Cooperative Decoding [73.32763904267186]
Large Language Models (LLMs) present the potential for achieving superior translation quality.
We propose Cooperative Decoding (CoDec) which treats NMT systems as a pretranslation model and MT-oriented LLMs as a supplemental solution.
arXiv Detail & Related papers (2023-11-06T03:41:57Z)
- Discourse Centric Evaluation of Machine Translation with a Densely Annotated Parallel Corpus [82.07304301996562]
This paper presents a new dataset with rich discourse annotations, built upon the large-scale parallel corpus BWB introduced in Jiang et al.
We investigate the similarities and differences between the discourse structures of source and target languages.
We discover that MT outputs differ fundamentally from human translations in terms of their latent discourse structures.
arXiv Detail & Related papers (2023-05-18T17:36:41Z)
- Extrinsic Evaluation of Machine Translation Metrics [78.75776477562087]
It is unclear if automatic metrics are reliable at distinguishing good translations from bad translations at the sentence level.
We evaluate the segment-level performance of the most widely used MT metrics (chrF, COMET, BERTScore, etc.) on three downstream cross-lingual tasks.
Our experiments demonstrate that all metrics exhibit negligible correlation with the extrinsic evaluation of the downstream outcomes.
arXiv Detail & Related papers (2022-12-20T14:39:58Z)
- Neural Machine Translation Quality and Post-Editing Performance [0.04654201857155095]
We focus on high-quality neural MT (NMT), which has become the state-of-the-art approach and has been adopted by most translation companies.
Across all models, we found that better MT systems indeed lead to fewer changes in the sentences in this industry setting.
Contrary to the results on phrase-based MT, BLEU is not a stable predictor of post-editing time or final output quality.
arXiv Detail & Related papers (2021-09-10T17:56:02Z)
- Difficulty-Aware Machine Translation Evaluation [19.973201669851626]
We propose a novel difficulty-aware machine translation evaluation metric.
A translation that fails to be predicted by most MT systems will be treated as a difficult one and assigned a large weight in the final score function.
Our proposed method performs well even when all the MT systems are very competitive.
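The summary above describes a weighting scheme in which sentences that most systems translate poorly count more toward the final score. The following is a minimal sketch of that idea under assumed conventions (per-sentence scores in [0, 1], difficulty defined as one minus the mean score across systems); the exact formulation in the paper may differ:

```python
def difficulty_weighted_score(system_scores):
    """Difficulty-aware aggregate score for each MT system.

    system_scores: dict mapping system name -> list of per-sentence
    quality scores in [0, 1], all lists over the same test set.
    A sentence's difficulty is 1 minus its mean score across systems;
    each system's final score is the difficulty-weighted mean of its
    per-sentence scores.
    """
    names = list(system_scores)
    n_sents = len(system_scores[names[0]])
    difficulty = [
        1 - sum(system_scores[s][i] for s in names) / len(names)
        for i in range(n_sents)
    ]
    z = sum(difficulty) or 1.0  # avoid dividing by zero if all sentences are easy
    return {
        s: sum(d * x for d, x in zip(difficulty, system_scores[s])) / z
        for s in names
    }

# Toy data: both systems handle sentence 0 well; only B handles the hard sentence 1.
scores = {"A": [0.9, 0.2], "B": [0.9, 0.8]}
print(difficulty_weighted_score(scores))
```

Because the easy sentence carries little weight, system B, which succeeds on the difficult sentence, is clearly separated from A even though both do equally well elsewhere.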
arXiv Detail & Related papers (2021-07-30T02:45:36Z)
- Unsupervised Quality Estimation for Neural Machine Translation [63.38918378182266]
Existing approaches require large amounts of expert annotated data, computation and time for training.
We devise an unsupervised approach to QE where no training or access to additional resources besides the MT system itself is required.
We achieve very good correlation with human judgments of quality, rivalling state-of-the-art supervised QE models.
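One family of signals such an unsupervised QE approach can draw on is the MT model's own confidence in its output. As a simplified illustration (the paper explores several such indicators, including uncertainty estimates), the mean log-probability the model assigns to its output tokens can serve as a quality score with no training data:

```python
import math

def avg_logprob_qe(token_probs):
    """Unsupervised QE sketch: average log-probability that the MT model
    assigned to each token of its own output. Higher means the model was
    more confident, which is used as a proxy for translation quality.
    Assumes a non-empty list of probabilities in (0, 1]."""
    return sum(math.log(p) for p in token_probs) / len(token_probs)

# A confidently generated output scores higher than a hesitant one.
print(avg_logprob_qe([0.9, 0.95, 0.9]) > avg_logprob_qe([0.5, 0.4, 0.6]))
```

In a real system these probabilities come from the decoder's softmax at each generation step, so the score costs nothing beyond the translation itself.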
arXiv Detail & Related papers (2020-05-21T12:38:06Z)
- Amharic-Arabic Neural Machine Translation [0.0]
Two Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU) based Neural Machine Translation (NMT) models are developed.
A small parallel Quranic text corpus is constructed from the existing monolingual Arabic text and its equivalent Amharic translation available from the Tanzil project.
The LSTM- and GRU-based NMT models are compared with the Google Translate system; the LSTM-based OpenNMT model outperforms both the GRU-based model and Google Translate.
arXiv Detail & Related papers (2019-12-26T15:41:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the accuracy of this information and is not responsible for any consequences of its use.