Evaluation of Google Translate for Mandarin Chinese translation using sentiment and semantic analysis
- URL: http://arxiv.org/abs/2409.04964v2
- Date: Mon, 16 Sep 2024 10:00:52 GMT
- Title: Evaluation of Google Translate for Mandarin Chinese translation using sentiment and semantic analysis
- Authors: Xuechun Wang, Rodney Beard, Rohitash Chandra
- Abstract summary: Machine translation using large language models (LLMs) is having a significant global impact.
Mandarin Chinese is the official language used for communication by the government and media in China.
In this study, we provide an automated assessment of translation quality of Google Translate with human experts using sentiment and semantic analysis.
- Score: 1.3999481573773074
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Machine translation using large language models (LLMs) is having a significant global impact, making communication easier. Mandarin Chinese is the official language used for communication by the government and media in China. In this study, we provide an automated assessment of translation quality of Google Translate with human experts using sentiment and semantic analysis. In order to demonstrate our framework, we select the classic early twentieth-century novel 'The True Story of Ah Q' with selected Mandarin Chinese to English translations. We use Google Translate to translate the given text into English and then conduct a chapter-wise sentiment analysis and semantic analysis to compare the extracted sentiments across the different translations. Our results indicate that the precision of Google Translate differs both in terms of semantic and sentiment analysis when compared to human expert translations. We find that Google Translate is unable to translate some of the specific words or phrases in Chinese, such as Chinese traditional allusions. The mistranslations may be due to lack of contextual significance and historical knowledge of China.
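The chapter-wise comparison described in the abstract can be illustrated with a minimal sketch. It assumes each translation is already split into aligned chapters, and it uses a plain bag-of-words cosine similarity as a stand-in for semantic analysis. The abstract does not say which models the authors use, so the function names (`bag_of_words`, `cosine_similarity`, `chapterwise_similarity`) and the similarity measure are illustrative assumptions, not the paper's method.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two token-count vectors (0.0 if either is empty)."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def chapterwise_similarity(machine_chapters, human_chapters):
    """Return one similarity score per aligned chapter pair."""
    if len(machine_chapters) != len(human_chapters):
        raise ValueError("translations must have the same number of chapters")
    return [
        cosine_similarity(bag_of_words(m), bag_of_words(h))
        for m, h in zip(machine_chapters, human_chapters)
    ]


if __name__ == "__main__":
    # Placeholder strings invented for illustration; not text from the novel.
    machine = ["He walked to the village.", "The crowd laughed at him."]
    human = ["He walked into the village.", "Everyone in the crowd mocked him."]
    for number, score in enumerate(chapterwise_similarity(machine, human), start=1):
        print(f"chapter {number}: similarity {score:.2f}")
```

A low score for a chapter would flag it for closer inspection. Word-overlap measures miss paraphrases, so a sentence-embedding model would be the more natural choice in practice.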
Related papers
- Creative and Context-Aware Translation of East Asian Idioms with GPT-4 [20.834802250633686]
GPT-4 can generate high-quality translations of East Asian idioms.
At a low cost, our context-aware translations can achieve far more high-quality translations per idiom than the human baseline.
arXiv Detail & Related papers (2024-10-01T18:24:43Z)
- Crossing the Threshold: Idiomatic Machine Translation through Retrieval Augmentation and Loss Weighting [66.02718577386426]
We provide a simple characterization of idiomatic translation and related issues.
We conduct a synthetic experiment revealing a tipping point at which transformer-based machine translation models correctly default to idiomatic translations.
To improve translation of natural idioms, we introduce two straightforward yet effective techniques.
arXiv Detail & Related papers (2023-10-10T23:47:25Z)
- Discourse Representation Structure Parsing for Chinese [8.846860617823005]
We explore the feasibility of Chinese semantic parsing in the absence of labeled data for Chinese meaning representations.
We propose a test suite designed explicitly for Chinese semantic parsing, which provides fine-grained evaluation for parsing performance.
Our experimental results show that the difficulty of Chinese semantic parsing is mainly caused by adverbs.
arXiv Detail & Related papers (2023-06-16T09:47:45Z)
- The Best of Both Worlds: Combining Human and Machine Translations for Multilingual Semantic Parsing with Active Learning [50.320178219081484]
We propose an active learning approach that exploits the strengths of both human and machine translations.
An ideal utterance selection can significantly reduce the error and bias in the translated data.
arXiv Detail & Related papers (2023-05-22T05:57:47Z)
- An evaluation of Google Translate for Sanskrit to English translation via sentiment and semantic analysis [0.31317409221921144]
In 2022, the Sanskrit language was added to the Google Translate engine.
In this study, we present a framework that evaluates the Google Translate for Sanskrit using the Bhagavad Gita.
arXiv Detail & Related papers (2023-02-28T04:24:55Z)
- Machine Translation for Accessible Multi-Language Text Analysis [1.5484595752241124]
We show that English-trained measures computed after translation to English have adequate-to-excellent accuracy.
We show this for three major analytics -- sentiment analysis, topic analysis, and word embeddings -- over 16 languages.
arXiv Detail & Related papers (2023-01-20T04:11:38Z)
- ChrEnTranslate: Cherokee-English Machine Translation Demo with Quality Estimation and Corrective Feedback [70.5469946314539]
ChrEnTranslate is an online machine translation demonstration system for translation between English and an endangered language Cherokee.
It supports both statistical and neural translation models as well as provides quality estimation to inform users of reliability.
arXiv Detail & Related papers (2021-07-30T17:58:54Z)
- Improving Sentiment Analysis over non-English Tweets using Multilingual Transformers and Automatic Translation for Data-Augmentation [77.69102711230248]
We propose the use of a multilingual transformer model, that we pre-train over English tweets and apply data-augmentation using automatic translation to adapt the model to non-English languages.
Our experiments in French, Spanish, German and Italian suggest that the proposed technique is an efficient way to improve the results of the transformers over small corpora of tweets in a non-English language.
arXiv Detail & Related papers (2020-10-07T15:44:55Z)
- Translation Artifacts in Cross-lingual Transfer Learning [51.66536640084888]
We show that machine translation can introduce subtle artifacts that have a notable impact in existing cross-lingual models.
In natural language inference, translating the premise and the hypothesis independently can reduce the lexical overlap between them.
We also improve the state-of-the-art in XNLI for the translate-test and zero-shot approaches by 4.3 and 2.8 points, respectively.
arXiv Detail & Related papers (2020-04-09T17:54:30Z)
- A Set of Recommendations for Assessing Human-Machine Parity in Language Translation [87.72302201375847]
We reassess Hassan et al.'s investigation into Chinese to English news translation.
We show that the professional human translations contained significantly fewer errors.
arXiv Detail & Related papers (2020-04-03T17:49:56Z)
- A Corpus of Adpositional Supersenses for Mandarin Chinese [15.757892250956715]
This paper presents a corpus in which all adpositions have been semantically annotated in Mandarin Chinese.
Our approach adapts a framework that defined a general set of supersenses according to ostensibly language-independent semantic criteria.
We find that the supersense categories are well-suited to Chinese adpositions despite syntactic differences from English.
arXiv Detail & Related papers (2020-03-18T18:59:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.