Evaluation of Google Translate for Mandarin Chinese translation using sentiment and semantic analysis
- URL: http://arxiv.org/abs/2409.04964v2
- Date: Mon, 16 Sep 2024 10:00:52 GMT
- Title: Evaluation of Google Translate for Mandarin Chinese translation using sentiment and semantic analysis
- Authors: Xuechun Wang, Rodney Beard, Rohitash Chandra
- Abstract summary: Machine translation using large language models (LLMs) is having a significant global impact.
Mandarin Chinese is the official language used for communication by the government and media in China.
In this study, we provide an automated assessment of translation quality of Google Translate with human experts using sentiment and semantic analysis.
- Score: 1.3999481573773074
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Machine translation using large language models (LLMs) is having a significant global impact, making communication easier. Mandarin Chinese is the official language used for communication by the government and media in China. In this study, we provide an automated assessment of translation quality of Google Translate with human experts using sentiment and semantic analysis. In order to demonstrate our framework, we select the classic early twentieth-century novel 'The True Story of Ah Q' with selected Mandarin Chinese to English translations. We use Google Translate to translate the given text into English and then conduct a chapter-wise sentiment analysis and semantic analysis to compare the extracted sentiments across the different translations. Our results indicate that the precision of Google Translate differs both in terms of semantic and sentiment analysis when compared to human expert translations. We find that Google Translate is unable to translate some of the specific words or phrases in Chinese, such as Chinese traditional allusions. The mistranslations may be due to lack of contextual significance and historical knowledge of China.
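The chapter-wise comparison described in the abstract can be sketched roughly as follows. The abstract does not say which sentiment or semantic models the authors use, so this sketch substitutes a plain bag-of-words cosine similarity as a stand-in for the semantic step. The function names (`tokenize`, `cosine_similarity`, `chapterwise_similarity`) and the sample strings are hypothetical and are not taken from the paper or the novel.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the bag-of-words count vectors of two texts.

    Assumption: this is only a stand-in for whatever semantic model the
    paper actually uses, which the abstract does not name.
    """
    counts_a = Counter(tokenize(text_a))
    counts_b = Counter(tokenize(text_b))
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[w] * counts_b[w] for w in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)


def chapterwise_similarity(machine_chapters, human_chapters):
    """Score each machine-translated chapter against its human counterpart."""
    return [
        cosine_similarity(machine, human)
        for machine, human in zip(machine_chapters, human_chapters)
    ]


if __name__ == "__main__":
    # Hypothetical placeholder chapters, not text from the novel.
    machine = ["he was proud of his victory", "the village laughed at him"]
    human = ["he took pride in his victory", "the villagers mocked him"]
    for number, score in enumerate(chapterwise_similarity(machine, human), start=1):
        print(f"chapter {number}: similarity {score:.2f}")
```

A low score for a chapter would only flag it for closer reading. Word-overlap measures cannot tell a mistranslated allusion from a legitimate paraphrase, which is presumably why the study pairs semantic comparison with sentiment analysis.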
Related papers
- DRT: Deep Reasoning Translation via Long Chain-of-Thought [89.48208612476068]
In this paper, we introduce DRT, an attempt to bring the success of long CoT to neural machine translation (MT).
We first mine sentences containing similes or metaphors from existing literature books, and then develop a multi-agent framework to translate these sentences via long thought.
Using Qwen2.5 and Llama-3.1 as the backbones, DRT models can learn the thought process during machine translation.
arXiv Detail & Related papers (2024-12-23T11:55:33Z)
- The Role of Handling Attributive Nouns in Improving Chinese-To-English Machine Translation [5.64086253718739]
We specifically target the translation challenges posed by attributive nouns in Chinese, which frequently cause ambiguities in English translation.
By manually inserting the omitted particle X ('DE'), we improve how this critical function word is handled.
arXiv Detail & Related papers (2024-12-18T20:37:52Z)
- Creative and Context-Aware Translation of East Asian Idioms with GPT-4 [20.834802250633686]
GPT-4 can generate high-quality translations of East Asian idioms.
At a low cost, our context-aware translations can achieve far more high-quality translations per idiom than the human baseline.
arXiv Detail & Related papers (2024-10-01T18:24:43Z)
- Crossing the Threshold: Idiomatic Machine Translation through Retrieval Augmentation and Loss Weighting [66.02718577386426]
We provide a simple characterization of idiomatic translation and related issues.
We conduct a synthetic experiment revealing a tipping point at which transformer-based machine translation models correctly default to idiomatic translations.
To improve translation of natural idioms, we introduce two straightforward yet effective techniques.
arXiv Detail & Related papers (2023-10-10T23:47:25Z)
- Discourse Representation Structure Parsing for Chinese [8.846860617823005]
We explore the feasibility of Chinese semantic parsing in the absence of labeled data for Chinese meaning representations.
We propose a test suite designed explicitly for Chinese semantic parsing, which provides fine-grained evaluation for parsing performance.
Our experimental results show that the difficulty of Chinese semantic parsing is mainly caused by adverbs.
arXiv Detail & Related papers (2023-06-16T09:47:45Z)
- The Best of Both Worlds: Combining Human and Machine Translations for Multilingual Semantic Parsing with Active Learning [50.320178219081484]
We propose an active learning approach that exploits the strengths of both human and machine translations.
An ideal utterance selection can significantly reduce the error and bias in the translated data.
arXiv Detail & Related papers (2023-05-22T05:57:47Z)
- An evaluation of Google Translate for Sanskrit to English translation via sentiment and semantic analysis [0.31317409221921144]
In 2022, the Sanskrit language was added to the Google Translate engine.
In this study, we present a framework that evaluates Google Translate for Sanskrit using the Bhagavad Gita.
arXiv Detail & Related papers (2023-02-28T04:24:55Z)
- Machine Translation for Accessible Multi-Language Text Analysis [1.5484595752241124]
We show that English-trained measures computed after translation to English have adequate-to-excellent accuracy.
We show this for three major analytics -- sentiment analysis, topic analysis, and word embeddings -- over 16 languages.
arXiv Detail & Related papers (2023-01-20T04:11:38Z)
- ChrEnTranslate: Cherokee-English Machine Translation Demo with Quality Estimation and Corrective Feedback [70.5469946314539]
ChrEnTranslate is an online machine translation demonstration system for translation between English and Cherokee, an endangered language.
It supports both statistical and neural translation models and provides quality estimation to inform users of reliability.
arXiv Detail & Related papers (2021-07-30T17:58:54Z)
- Translation Artifacts in Cross-lingual Transfer Learning [51.66536640084888]
We show that machine translation can introduce subtle artifacts that have a notable impact in existing cross-lingual models.
In natural language inference, translating the premise and the hypothesis independently can reduce the lexical overlap between them.
We also improve the state-of-the-art in XNLI for the translate-test and zero-shot approaches by 4.3 and 2.8 points, respectively.
arXiv Detail & Related papers (2020-04-09T17:54:30Z)
- A Set of Recommendations for Assessing Human-Machine Parity in Language Translation [87.72302201375847]
We reassess Hassan et al.'s investigation into Chinese to English news translation.
We show that the professional human translations contained significantly fewer errors.
arXiv Detail & Related papers (2020-04-03T17:49:56Z)