Google Translate Error Analysis for Mental Healthcare Information:
Evaluating Accuracy, Comprehensibility, and Implications for Multilingual
Healthcare Communication
- URL: http://arxiv.org/abs/2402.04023v1
- Date: Tue, 6 Feb 2024 14:16:32 GMT
- Authors: Jaleh Delfani, Constantin Orasan, Hadeel Saadany, Ozlem Temizoz,
Eleanor Taylor-Stilgoe, Diptesh Kanojia, Sabine Braun, Barbara Schouten
- Abstract summary: This study explores the use of Google Translate for translating mental healthcare (MHealth) information from English to Persian, Arabic, Turkish, Romanian, and Spanish.
Native speakers of the target languages manually assessed the GT translations, focusing on medical terminology accuracy, comprehensibility, and critical syntactic/semantic errors.
GT output analysis revealed challenges in accurately translating medical terminology, particularly in Arabic, Romanian, and Persian.
- Score: 8.178490288773013
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This study explores the use of Google Translate (GT) for translating mental
healthcare (MHealth) information and evaluates its accuracy, comprehensibility,
and implications for multilingual healthcare communication through analysing GT
output in the MHealth domain from English to Persian, Arabic, Turkish,
Romanian, and Spanish. Two datasets comprising MHealth information from the UK
National Health Service website and information leaflets from The Royal College
of Psychiatrists were used. Native speakers of the target languages manually
assessed the GT translations, focusing on medical terminology accuracy,
comprehensibility, and critical syntactic/semantic errors. GT output analysis
revealed challenges in accurately translating medical terminology, particularly
in Arabic, Romanian, and Persian. Fluency issues were prevalent across various
languages, affecting comprehension, mainly in Arabic and Spanish. Critical
errors arose in specific contexts, such as bullet-point formatting,
specifically in Persian, Turkish, and Romanian. Although improvements are seen
in longer-text translations, there remains a need to enhance accuracy in
medical and mental health terminology and fluency, whilst also addressing
formatting issues for a more seamless user experience. The findings highlight
the need to use customised translation engines for MHealth translation and the
challenges when relying solely on machine-translated medical content,
emphasising the crucial role of human reviewers in multilingual healthcare
communication.
Related papers
- CRAT: A Multi-Agent Framework for Causality-Enhanced Reflective and Retrieval-Augmented Translation with Large Language Models [59.8529196670565]
CRAT is a novel multi-agent translation framework that leverages RAG and causality-enhanced self-reflection to address translation challenges.
Our results show that CRAT significantly improves translation accuracy, particularly in handling context-sensitive terms and emerging vocabulary.
arXiv Detail & Related papers (2024-10-28T14:29:11Z)
- On Creating an English-Thai Code-switched Machine Translation in Medical Domain [2.0737832185611524]
Machine translation (MT) in the medical domain plays a pivotal role in enhancing healthcare quality and disseminating medical knowledge.
Despite advancements in English-Thai MT technology, common MT approaches often underperform in the medical field due to their inability to precisely translate medical terminologies.
Our research prioritizes not merely improving translation accuracy but also maintaining medical terminology in English.
arXiv Detail & Related papers (2024-10-21T17:25:32Z)
- Severity Prediction in Mental Health: LLM-based Creation, Analysis, Evaluation of a Novel Multilingual Dataset [3.4146360486107987]
Large Language Models (LLMs) are increasingly integrated into various medical fields, including mental health support systems.
We present a novel multilingual adaptation of widely-used mental health datasets, translated from English into six languages.
This dataset enables a comprehensive evaluation of LLM performance in detecting mental health conditions and assessing their severity across multiple languages.
arXiv Detail & Related papers (2024-09-25T22:14:34Z)
- BiMediX: Bilingual Medical Mixture of Experts LLM [94.85518237963535]
We introduce BiMediX, the first bilingual medical mixture of experts LLM designed for seamless interaction in both English and Arabic.
Our model facilitates a wide range of medical interactions in English and Arabic, including multi-turn chats to inquire about additional details.
We propose a semi-automated English-to-Arabic translation pipeline with human refinement to ensure high-quality translations.
arXiv Detail & Related papers (2024-02-20T18:59:26Z)
- Multilingual Simplification of Medical Texts [49.469685530201716]
We introduce MultiCochrane, the first sentence-aligned multilingual text simplification dataset for the medical domain in four languages.
We evaluate fine-tuned and zero-shot models across these languages, with extensive human assessments and analyses.
Although models can now generate viable simplified texts, we identify outstanding challenges that this dataset might be used to address.
arXiv Detail & Related papers (2023-05-21T18:25:07Z)
- Differentiate ChatGPT-generated and Human-written Medical Texts [8.53416950968806]
This research is among the first studies on responsible and ethical AIGC (Artificial Intelligence Generated Content) in medicine.
We focus on analyzing the differences between medical texts written by human experts and generated by ChatGPT.
In the next step, we analyze the linguistic features of these two types of content and uncover differences in vocabulary, part-of-speech, dependency, sentiment, perplexity, etc.
arXiv Detail & Related papers (2023-04-23T07:38:07Z)
- Machine Translation for Accessible Multi-Language Text Analysis [1.5484595752241124]
We show that English-trained measures computed after translation to English have adequate-to-excellent accuracy.
We show this for three major analytics -- sentiment analysis, topic analysis, and word embeddings -- over 16 languages.
arXiv Detail & Related papers (2023-01-20T04:11:38Z)
- A Semi-supervised Approach for a Better Translation of Sentiment in Dialectical Arabic UGT [2.6763498831034034]
We introduce a semi-supervised approach that exploits both monolingual and parallel data for training an NMT system.
We will show that our proposed system can significantly help with correcting sentiment errors detected in the online translation of dialectical Arabic UGT.
arXiv Detail & Related papers (2022-10-21T11:55:55Z)
- Pragmatic information in translation: a corpus-based study of tense and mood in English and German [70.3497683558609]
Grammatical tense and mood are important linguistic phenomena to consider in natural language processing (NLP) research.
We consider the correspondence between English and German tense and mood in translation.
Of particular importance is the challenge of modeling tense and mood in rule-based, phrase-based statistical and neural machine translation.
arXiv Detail & Related papers (2020-07-10T08:15:59Z)
- It's Easier to Translate out of English than into it: Measuring Neural Translation Difficulty by Cross-Mutual Information [90.35685796083563]
Cross-mutual information (XMI) is an asymmetric information-theoretic metric of machine translation difficulty.
XMI exploits the probabilistic nature of most neural machine translation models.
We present the first systematic and controlled study of cross-lingual translation difficulties using modern neural translation systems.
arXiv Detail & Related papers (2020-05-05T17:38:48Z)
- Self-Attention with Cross-Lingual Position Representation [112.05807284056337]
Position encoding (PE) is used to preserve the word order information for natural language processing tasks, generating fixed position indices for input sequences.
Due to word order divergences in different languages, modeling the cross-lingual positional relationships might help SANs tackle this problem.
We augment SANs with cross-lingual position representations to model the bilingually aware latent structure for the input sentence.
arXiv Detail & Related papers (2020-04-28T05:23:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.