Large Language Models for Classical Chinese Poetry Translation: Benchmarking, Evaluating, and Improving
- URL: http://arxiv.org/abs/2408.09945v4
- Date: Mon, 30 Dec 2024 07:26:14 GMT
- Title: Large Language Models for Classical Chinese Poetry Translation: Benchmarking, Evaluating, and Improving
- Authors: Andong Chen, Lianzhang Lou, Kehai Chen, Xuefeng Bai, Yang Xiang, Muyun Yang, Tiejun Zhao, Min Zhang
- Abstract summary: Large language models (LLMs) with impressive multilingual capabilities may bring a ray of hope to achieve this extreme translation demand.
This paper first introduces a suitable benchmark (PoetMT) in which each Chinese poem has a recognized elegant translation.
We propose a new metric based on GPT-4 to evaluate the extent to which current LLMs can meet these demands.
- Score: 43.148203559785095
- License:
- Abstract: Unlike traditional translation tasks, classical Chinese poetry translation requires both adequacy and fluency in translating culturally and historically significant content and linguistic poetic elegance. Large language models (LLMs) with impressive multilingual capabilities may bring a ray of hope to achieve this extreme translation demand. This paper first introduces a suitable benchmark (PoetMT) in which each Chinese poem has a recognized elegant translation. Meanwhile, we propose a new metric based on GPT-4 to evaluate the extent to which current LLMs can meet these demands. Our empirical evaluation reveals that existing LLMs fall short on this challenging task. Hence, we propose a Retrieval-Augmented Machine Translation (RAT) method which incorporates knowledge related to classical poetry to advance the translation of Chinese poetry by LLMs. Experimental results show that RAT consistently outperforms all comparison methods on the widely used BLEU, COMET, and BLEURT metrics, our proposed metric, and human evaluation.
Related papers
- A 2-step Framework for Automated Literary Translation Evaluation: Its Promises and Pitfalls [15.50296318831118]
We propose and evaluate the feasibility of a two-stage pipeline to evaluate literary machine translation.
Our framework provides fine-grained, interpretable metrics suited for literary translation.
arXiv Detail & Related papers (2024-12-02T10:07:01Z)
- Language Models and Cycle Consistency for Self-Reflective Machine Translation [1.79487674052027]
We generate multiple translation candidates from a source language A to a target language B, and subsequently translate these candidates back to the original language A.
By evaluating the cycle consistency between the original and back-translated sentences using metrics such as token-level precision and accuracy, we implicitly estimate the translation quality in language B.
For each source sentence, we identify the translation candidate with optimal cycle consistency with the original sentence as the final answer.
arXiv Detail & Related papers (2024-11-05T04:01:41Z)
- LLM-based Translation Inference with Iterative Bilingual Understanding [52.46978502902928]
We propose a novel Iterative Bilingual Understanding Translation (IBUT) method based on the cross-lingual capabilities of large language models (LLMs).
The cross-lingual capability of LLMs enables the generation of contextual understanding for both the source and target languages separately.
The proposed IBUT outperforms several strong comparison methods.
arXiv Detail & Related papers (2024-10-16T13:21:46Z)
- What is the Best Way for ChatGPT to Translate Poetry? [38.47691441569612]
This study examines ChatGPT's capabilities in English-Chinese poetry translation tasks, utilizing targeted prompts and small sample scenarios to ascertain optimal performance.
We propose an Explanation-Assisted Poetry Machine Translation (EAPMT) method, which leverages monolingual poetry explanation as guiding information for the translation process.
The results from both human and machine evaluations demonstrate that our EAPMT method outperforms traditional translation methods of ChatGPT and the existing online systems.
arXiv Detail & Related papers (2024-06-05T16:48:26Z)
- (Perhaps) Beyond Human Translation: Harnessing Multi-Agent Collaboration for Translating Ultra-Long Literary Texts [52.18246881218829]
We introduce a novel multi-agent framework based on large language models (LLMs) for literary translation, implemented as a company called TransAgents.
To evaluate the effectiveness of our system, we propose two innovative evaluation strategies: Monolingual Human Preference (MHP) and Bilingual LLM Preference (BLP).
arXiv Detail & Related papers (2024-05-20T05:55:08Z)
- Translate to Disambiguate: Zero-shot Multilingual Word Sense Disambiguation with Pretrained Language Models [67.19567060894563]
Pretrained Language Models (PLMs) learn rich cross-lingual knowledge and can be finetuned to perform well on diverse tasks.
We present a new study investigating how well PLMs capture cross-lingual word sense with Contextual Word-Level Translation (C-WLT).
We find that as the model size increases, PLMs encode more cross-lingual word sense knowledge and better use context to improve WLT performance.
arXiv Detail & Related papers (2023-04-26T19:55:52Z)
- Exposing Cross-Lingual Lexical Knowledge from Multilingual Sentence Encoders [85.80950708769923]
We probe multilingual language models for the amount of cross-lingual lexical knowledge stored in their parameters, and compare them against the original multilingual LMs.
We also devise a novel method to expose this knowledge by additionally fine-tuning multilingual models.
We report substantial gains on standard benchmarks.
arXiv Detail & Related papers (2022-04-30T13:23:16Z)
- Does Transliteration Help Multilingual Language Modeling? [0.0]
We empirically measure the effect of transliteration on Multilingual Language Models.
We focus on the Indic languages, which have the highest script diversity in the world.
We find that transliteration benefits the low-resource languages without negatively affecting the comparatively high-resource languages.
arXiv Detail & Related papers (2022-01-29T05:48:42Z)
- On Cross-Lingual Retrieval with Multilingual Text Encoders [51.60862829942932]
We study the suitability of state-of-the-art multilingual encoders for cross-lingual document and sentence retrieval tasks.
We benchmark their performance in unsupervised ad-hoc sentence- and document-level CLIR experiments.
We evaluate multilingual encoders fine-tuned in a supervised fashion (i.e., we learn to rank) on English relevance data in a series of zero-shot language and domain transfer CLIR experiments.
arXiv Detail & Related papers (2021-12-21T08:10:27Z)
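The candidate-selection procedure summarized in the "Language Models and Cycle Consistency for Self-Reflective Machine Translation" entry above (translate A to B several times, translate each candidate back to A, keep the candidate whose back-translation best matches the source) can be sketched roughly as follows. This is a minimal illustration, not that paper's implementation: the function names `token_precision` and `select_by_cycle_consistency`, the whitespace tokenization, and the `back_translate` callable are all assumptions made for this sketch, and the toy lookup table stands in for a real translation model.

```python
from collections import Counter
from typing import Callable, Sequence, Tuple


def token_precision(reference: str, hypothesis: str) -> float:
    """Fraction of hypothesis tokens that also appear in the reference.

    Counts are clipped, so a token repeated in the hypothesis is only
    credited as many times as it occurs in the reference.
    Tokenization is plain whitespace splitting (an assumption of this sketch).
    """
    ref_counts = Counter(reference.lower().split())
    hyp_tokens = hypothesis.lower().split()
    if not hyp_tokens:
        return 0.0
    hyp_counts = Counter(hyp_tokens)
    overlap = sum(min(count, ref_counts[tok]) for tok, count in hyp_counts.items())
    return overlap / len(hyp_tokens)


def select_by_cycle_consistency(
    source: str,
    candidates: Sequence[str],
    back_translate: Callable[[str], str],
) -> Tuple[str, float]:
    """Return the candidate whose back-translation is closest to the source."""
    if not candidates:
        raise ValueError("at least one candidate translation is required")
    scored = [(token_precision(source, back_translate(c)), c) for c in candidates]
    best_score, best_candidate = max(scored, key=lambda pair: pair[0])
    return best_candidate, best_score


# Toy stand-in for a B -> A translation model (hypothetical data).
toy_back_translations = {
    "candidate_a": "the moon is bright",
    "candidate_b": "moon shines",
}

best, score = select_by_cycle_consistency(
    "the moon is bright",
    ["candidate_a", "candidate_b"],
    lambda c: toy_back_translations[c],
)
print(best, score)
```

In practice `back_translate` would wrap a call to a translation model, and token-level precision could be swapped for any other similarity measure between the source and its back-translation.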