Related papers: Evaluation of Google Translate for Mandarin Chinese translation using sentiment and semantic analysis

URL: http://arxiv.org/abs/2409.04964v2
Date: Mon, 16 Sep 2024 10:00:52 GMT
Title: Evaluation of Google Translate for Mandarin Chinese translation using sentiment and semantic analysis
Authors: Xuechun Wang, Rodney Beard, Rohitash Chandra
Abstract summary: Machine translation using large language models (LLMs) is having a significant global impact. Mandarin Chinese is the official language used for communication by the government and media in China. In this study, we provide an automated assessment of translation quality of Google Translate with human experts using sentiment and semantic analysis.
Score: 1.3999481573773074
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Machine translation using large language models (LLMs) is having a significant global impact, making communication easier. Mandarin Chinese is the official language used for communication by the government and media in China. In this study, we provide an automated assessment of translation quality of Google Translate with human experts using sentiment and semantic analysis. In order to demonstrate our framework, we select the classic early twentieth-century novel 'The True Story of Ah Q' with selected Mandarin Chinese to English translations. We use Google Translate to translate the given text into English and then conduct a chapter-wise sentiment analysis and semantic analysis to compare the extracted sentiments across the different translations. Our results indicate that the precision of Google Translate differs both in terms of semantic and sentiment analysis when compared to human expert translations. We find that Google Translate is unable to translate some of the specific words or phrases in Chinese, such as Chinese traditional allusions. The mistranslations may be due to lack of contextual significance and historical knowledge of China.
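The chapter-wise comparison described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the abstract does not name the models used, so the sketch substitutes a bag-of-words cosine similarity for the semantic analysis and a tiny hand-written word list for the sentiment analysis. The names POSITIVE, NEGATIVE, cosine_similarity, sentiment_score, and compare_chapters are hypothetical and exist only for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical toy lexicon (an assumption for illustration only);
# a real study would use a trained sentiment model instead.
POSITIVE = {"good", "happy", "proud", "victory"}
NEGATIVE = {"bad", "sad", "ashamed", "defeat"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between bag-of-words count vectors.

    This measures lexical overlap and is only a stand-in for an
    embedding-based semantic similarity.
    """
    counts_a, counts_b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[w] * counts_b[w] for w in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def sentiment_score(text):
    """(positive hits - negative hits) / token count, or 0.0 for empty text."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    positive = sum(t in POSITIVE for t in tokens)
    negative = sum(t in NEGATIVE for t in tokens)
    return (positive - negative) / len(tokens)


def compare_chapters(machine_chapters, human_chapters):
    """Pair chapters by position and report similarity and sentiment gap."""
    rows = []
    for number, (machine, human) in enumerate(
        zip(machine_chapters, human_chapters), start=1
    ):
        rows.append(
            {
                "chapter": number,
                "similarity": cosine_similarity(machine, human),
                "sentiment_gap": sentiment_score(machine) - sentiment_score(human),
            }
        )
    return rows
```

Calling compare_chapters with two equal-length lists of chapter texts returns one row per chapter; a low similarity or a large sentiment gap flags chapters where the machine translation diverges from the human one under these simplified measures.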

Related papers

Do LLMs Understand Your Translations? Evaluating Paragraph-level MT with Question Answering [68.3400058037817]
We introduce TREQA (Translation Evaluation via Question-Answering), a framework that extrinsically evaluates translation quality. We show that TREQA is competitive with and, in some cases, outperforms state-of-the-art neural and LLM-based metrics in ranking alternative paragraph-level translations.
arXiv Detail & Related papers (2025-04-10T09:24:54Z)
An evaluation of LLMs and Google Translate for translation of selected Indian languages via sentiment and semantic analyses [0.17999333451993949]
Large language models (LLMs) have been prominent for language translation, including low-resource languages. This study uses semantic and sentiment analysis of selected LLMs for Indian languages, including Sanskrit, Telugu and Hindi.
arXiv Detail & Related papers (2025-03-27T11:35:40Z)
DRT: Deep Reasoning Translation via Long Chain-of-Thought [89.48208612476068]
In this paper, we introduce DRT, an attempt to bring the success of long CoT to neural machine translation (MT). We first mine sentences containing similes or metaphors from existing literature books, and then develop a multi-agent framework to translate these sentences via long thought. Using Qwen2.5 and LLama-3.1 as the backbones, DRT models can learn the thought process during machine translation.
arXiv Detail & Related papers (2024-12-23T11:55:33Z)
The Role of Handling Attributive Nouns in Improving Chinese-To-English Machine Translation [5.64086253718739]
We specifically target the translation challenges posed by attributive nouns in Chinese, which frequently cause ambiguities in English translation. By manually inserting the omitted particle 的 ('DE'), we improve how this critical function word is handled.
arXiv Detail & Related papers (2024-12-18T20:37:52Z)
Creative and Context-Aware Translation of East Asian Idioms with GPT-4 [20.834802250633686]
GPT-4 can generate high-quality translations of East Asian idioms. At a low cost, our context-aware translations can achieve far more high-quality translations per idiom than the human baseline.
arXiv Detail & Related papers (2024-10-01T18:24:43Z)
Crossing the Threshold: Idiomatic Machine Translation through Retrieval Augmentation and Loss Weighting [66.02718577386426]
We provide a simple characterization of idiomatic translation and related issues. We conduct a synthetic experiment revealing a tipping point at which transformer-based machine translation models correctly default to idiomatic translations. To improve translation of natural idioms, we introduce two straightforward yet effective techniques.
arXiv Detail & Related papers (2023-10-10T23:47:25Z)
Discourse Representation Structure Parsing for Chinese [8.846860617823005]
We explore the feasibility of Chinese semantic parsing in the absence of labeled data for Chinese meaning representations. We propose a test suite designed explicitly for Chinese semantic parsing, which provides fine-grained evaluation for parsing performance. Our experimental results show that the difficulty of Chinese semantic parsing is mainly caused by adverbs.
arXiv Detail & Related papers (2023-06-16T09:47:45Z)
The Best of Both Worlds: Combining Human and Machine Translations for Multilingual Semantic Parsing with Active Learning [50.320178219081484]
We propose an active learning approach that exploits the strengths of both human and machine translations. An ideal utterance selection can significantly reduce the error and bias in the translated data.
arXiv Detail & Related papers (2023-05-22T05:57:47Z)
An evaluation of Google Translate for Sanskrit to English translation via sentiment and semantic analysis [0.31317409221921144]
In 2022, the Sanskrit language was added to the Google Translate engine. In this study, we present a framework that evaluates Google Translate for Sanskrit using the Bhagavad Gita.
arXiv Detail & Related papers (2023-02-28T04:24:55Z)
Machine Translation for Accessible Multi-Language Text Analysis [1.5484595752241124]
We show that English-trained measures computed after translation to English have adequate-to-excellent accuracy. We show this for three major analytics -- sentiment analysis, topic analysis, and word embeddings -- over 16 languages.
arXiv Detail & Related papers (2023-01-20T04:11:38Z)
ChrEnTranslate: Cherokee-English Machine Translation Demo with Quality Estimation and Corrective Feedback [70.5469946314539]
ChrEnTranslate is an online machine translation demonstration system for translation between English and Cherokee, an endangered language. It supports both statistical and neural translation models and provides quality estimation to inform users of reliability.
arXiv Detail & Related papers (2021-07-30T17:58:54Z)
Improving Sentiment Analysis over non-English Tweets using Multilingual Transformers and Automatic Translation for Data-Augmentation [77.69102711230248]
We propose the use of a multilingual transformer model that we pre-train over English tweets, and we apply data augmentation using automatic translation to adapt the model to non-English languages. Our experiments in French, Spanish, German and Italian suggest that the proposed technique is an efficient way to improve the results of the transformers over small corpora of tweets in a non-English language.
arXiv Detail & Related papers (2020-10-07T15:44:55Z)
Translation Artifacts in Cross-lingual Transfer Learning [51.66536640084888]
We show that machine translation can introduce subtle artifacts that have a notable impact in existing cross-lingual models. In natural language inference, translating the premise and the hypothesis independently can reduce the lexical overlap between them. We also improve the state-of-the-art in XNLI for the translate-test and zero-shot approaches by 4.3 and 2.8 points, respectively.
arXiv Detail & Related papers (2020-04-09T17:54:30Z)
A Set of Recommendations for Assessing Human-Machine Parity in Language Translation [87.72302201375847]
We reassess Hassan et al.'s investigation into Chinese to English news translation. We show that the professional human translations contained significantly fewer errors.
arXiv Detail & Related papers (2020-04-03T17:49:56Z)
A Corpus of Adpositional Supersenses for Mandarin Chinese [15.757892250956715]
This paper presents a corpus in which all adpositions have been semantically annotated in Mandarin Chinese. Our approach adapts a framework that defined a general set of supersenses according to ostensibly language-independent semantic criteria. We find that the supersense categories are well-suited to Chinese adpositions despite syntactic differences from English.
arXiv Detail & Related papers (2020-03-18T18:59:55Z)

This list is automatically generated from the titles and abstracts of the papers on this site.