The Analysis of Lexical Errors in Machine Translation from English into Romanian
- URL: http://arxiv.org/abs/2511.02587v1
- Date: Tue, 04 Nov 2025 14:07:21 GMT
- Title: The Analysis of Lexical Errors in Machine Translation from English into Romanian
- Authors: Angela Stamatie
- Abstract summary: This research analyzes errors in Machine Translation from English into Romanian. The investigation involves a comprehensive analysis of 230 texts that have been translated from English into Romanian.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: This research analyzes errors in Machine Translation from English into Romanian, focusing on lexical errors found in texts containing official information provided by the World Health Organization (WHO), the Gavi Organization, and patient information leaflets (information about the active ingredients of the vaccines or medication, the indications, the dosage instructions, the storage instructions, the side effects and warnings, etc.). All of these texts are related to Covid-19 and were translated by Google Translate, a multilingual Machine Translation system created by Google. In recent decades, Google has actively worked to develop a more accurate and fluent automatic translation system. This research, specifically focused on improving Google Translate, aims to enhance the overall quality of Machine Translation through better lexical selection and fewer errors. The investigation involves a comprehensive analysis of 230 texts that have been translated from English into Romanian.
Related papers
- Evaluation of Google Translate for Mandarin Chinese translation using sentiment and semantic analysis [1.3999481573773074]
Machine translation using large language models (LLMs) is having a significant global impact.
Mandarin Chinese is the official language used for communication by the government and media in China.
In this study, we provide an automated assessment of translation quality of Google Translate with human experts using sentiment and semantic analysis.
arXiv Detail & Related papers (2024-09-08T04:03:55Z)
- xTower: A Multilingual LLM for Explaining and Correcting Translation Errors [22.376508000237042]
xTower is an open large language model (LLM) built on top of TowerBase to provide free-text explanations for translation errors.
We test xTower across various experimental setups in generating translation corrections, demonstrating significant improvements in translation quality.
arXiv Detail & Related papers (2024-06-27T18:51:46Z)
- Google Translate Error Analysis for Mental Healthcare Information: Evaluating Accuracy, Comprehensibility, and Implications for Multilingual Healthcare Communication [8.178490288773013]
This study explores the use of Google Translate for translating mental healthcare (MHealth) information from English to Persian, Arabic, Turkish, Romanian, and Spanish.
Native speakers of the target languages manually assessed the GT translations, focusing on medical terminology accuracy, comprehensibility, and critical syntactic/semantic errors.
GT output analysis revealed challenges in accurately translating medical terminology, particularly in Arabic, Romanian, and Persian.
arXiv Detail & Related papers (2024-02-06T14:16:32Z)
- Machine Translation Models are Zero-Shot Detectors of Translation Direction [46.41883195574249]
Detecting the translation direction of parallel text has applications for machine translation training and evaluation, but also has forensic applications such as resolving plagiarism or forgery allegations. In this work, we explore an unsupervised approach to translation direction detection based on the simple hypothesis that $p(\text{translation}|\text{original})>p(\text{original}|\text{translation})$, motivated by the well-known simplification effect in translationese or machine-translationese.
arXiv Detail & Related papers (2024-01-12T18:59:02Z)
- Crossing the Threshold: Idiomatic Machine Translation through Retrieval Augmentation and Loss Weighting [66.02718577386426]
We provide a simple characterization of idiomatic translation and related issues.
We conduct a synthetic experiment revealing a tipping point at which transformer-based machine translation models correctly default to idiomatic translations.
To improve translation of natural idioms, we introduce two straightforward yet effective techniques.
arXiv Detail & Related papers (2023-10-10T23:47:25Z)
- The Best of Both Worlds: Combining Human and Machine Translations for Multilingual Semantic Parsing with Active Learning [50.320178219081484]
We propose an active learning approach that exploits the strengths of both human and machine translations.
An ideal utterance selection can significantly reduce the error and bias in the translated data.
arXiv Detail & Related papers (2023-05-22T05:57:47Z)
- Hallucinations in Large Multilingual Translation Models [70.10455226752015]
Large-scale multilingual machine translation systems have demonstrated remarkable ability to translate directly between numerous languages.
When deployed in the wild, these models may generate hallucinated translations which have the potential to severely undermine user trust and raise safety concerns.
Existing research on hallucinations has primarily focused on small bilingual models trained on high-resource languages.
arXiv Detail & Related papers (2023-03-28T16:17:59Z)
- The Effect of Normalization for Bi-directional Amharic-English Neural Machine Translation [53.907805815477126]
This paper presents the first relatively large-scale Amharic-English parallel sentence dataset.
We build bi-directional Amharic-English translation models by fine-tuning the existing Facebook M2M100 pre-trained model.
The results show that the normalization of Amharic homophone characters increases the performance of Amharic-English machine translation in both directions.
arXiv Detail & Related papers (2022-10-27T07:18:53Z)
- A Multilingual Neural Machine Translation Model for Biomedical Data [84.17747489525794]
We release a multilingual neural machine translation model, which can be used to translate text in the biomedical domain.
The model can translate from 5 languages (French, German, Italian, Korean and Spanish) into English.
It is trained with large amounts of generic and biomedical data, using domain tags.
arXiv Detail & Related papers (2020-08-06T21:26:43Z)
- A Survey of Orthographic Information in Machine Translation [1.2124289787900182]
We show how orthographic information can be used to improve machine translation of under-resourced languages.
We discuss different types of machine translation and demonstrate a recent trend that seeks to link orthographic information with well-established machine translation methods.
arXiv Detail & Related papers (2020-08-04T07:59:02Z)
- It's Easier to Translate out of English than into it: Measuring Neural Translation Difficulty by Cross-Mutual Information [90.35685796083563]
Cross-mutual information (XMI) is an asymmetric information-theoretic metric of machine translation difficulty.
XMI exploits the probabilistic nature of most neural machine translation models.
We present the first systematic and controlled study of cross-lingual translation difficulties using modern neural translation systems.
arXiv Detail & Related papers (2020-05-05T17:38:48Z)
- Testing Machine Translation via Referential Transparency [28.931196266344926]
We introduce referentially transparent inputs (RTIs), a simple, widely applicable methodology for validating machine translation software.
Our practical implementation, Purity, detects when this property is broken by a translation.
To evaluate RTI, we use Purity to test Google Translate and Bing Microsoft Translator with 200 unlabeled sentences.
arXiv Detail & Related papers (2020-04-22T01:37:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.