Salute the Classic: Revisiting Challenges of Machine Translation in the
Age of Large Language Models
- URL: http://arxiv.org/abs/2401.08350v2
- Date: Wed, 17 Jan 2024 06:47:29 GMT
- Title: Salute the Classic: Revisiting Challenges of Machine Translation in the
Age of Large Language Models
- Authors: Jianhui Pang, Fanghua Ye, Longyue Wang, Dian Yu, Derek F. Wong,
Shuming Shi, Zhaopeng Tu
- Abstract summary: The evolution of Neural Machine Translation has been influenced by six core challenges.
These challenges include domain mismatch, amount of parallel data, rare word prediction, translation of long sentences, attention model as word alignment, and sub-optimal beam search.
This study revisits these challenges, offering insights into their ongoing relevance in the context of advanced Large Language Models.
- Score: 91.6543868677356
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The evolution of Neural Machine Translation (NMT) has been significantly
influenced by six core challenges (Koehn and Knowles, 2017), which have acted
as benchmarks for progress in this field. This study revisits these challenges,
offering insights into their ongoing relevance in the context of advanced Large
Language Models (LLMs): domain mismatch, amount of parallel data, rare word
prediction, translation of long sentences, attention model as word alignment,
and sub-optimal beam search. Our empirical findings indicate that LLMs
effectively lessen the reliance on parallel data for major languages in the
pretraining phase. Additionally, the LLM-based translation system significantly
enhances the translation of long sentences that contain approximately 80 words
and shows the capability to translate documents of up to 512 words. However,
despite these significant improvements, the challenges of domain mismatch and
prediction of rare words persist. While the challenges of word alignment and
beam search, specifically associated with NMT, may not apply to LLMs, we
identify three new challenges for LLMs in translation tasks: inference
efficiency, translation of low-resource languages in the pretraining phase, and
human-aligned evaluation. The datasets and models are released at
https://github.com/pangjh3/LLM4MT.
Related papers
- Adapting Large Language Models for Document-Level Machine Translation [46.370862171452444]
Large language models (LLMs) have significantly advanced various natural language processing (NLP) tasks.
Recent research indicates that moderately-sized LLMs often outperform larger ones after task-specific fine-tuning.
This study focuses on adapting LLMs for document-level machine translation (DocMT) for specific language pairs.
arXiv Detail & Related papers (2024-01-12T09:29:13Z)
- Towards Effective Disambiguation for Machine Translation with Large Language Models [65.80775710657672]
We study the capabilities of large language models to translate "ambiguous sentences".
Experiments show that our methods can match or outperform state-of-the-art systems such as DeepL and NLLB in four out of five language directions.
arXiv Detail & Related papers (2023-09-20T22:22:52Z)
- CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages [86.90220551111096]
Training datasets for large language models (LLMs) are often not fully disclosed.
We present CulturaX, a substantial multilingual dataset with 6.3 trillion tokens in 167 languages.
arXiv Detail & Related papers (2023-09-17T23:49:10Z)
- Simultaneous Machine Translation with Large Language Models [51.470478122113356]
We investigate the possibility of applying Large Language Models to SimulMT tasks.
We conducted experiments using the Llama2-7b-chat model on nine different languages from the MUST-C dataset.
The results show that the LLM outperforms dedicated MT models in terms of BLEU and LAAL metrics.
arXiv Detail & Related papers (2023-09-13T04:06:47Z)
- Chain-of-Dictionary Prompting Elicits Translation in Large Language Models [100.47154959254937]
Large language models (LLMs) have shown surprisingly good performance in multilingual neural machine translation (MNMT).
We present a novel method, CoD, which augments LLMs with prior knowledge in the form of chains of multilingual dictionaries for a subset of input words to elicit translation abilities.
arXiv Detail & Related papers (2023-05-11T05:19:47Z)
- Dictionary-based Phrase-level Prompting of Large Language Models for Machine Translation [91.57514888410205]
Large language models (LLMs) demonstrate remarkable machine translation (MT) abilities via prompting.
LLMs can struggle to translate inputs with rare words, which are common in low resource or domain transfer scenarios.
We show that LLM prompting can provide an effective solution for rare words as well, by using prior knowledge from bilingual dictionaries to provide control hints in the prompts.
arXiv Detail & Related papers (2023-02-15T18:46:42Z)
- Adaptive Machine Translation with Large Language Models [7.803471587734353]
We investigate how we can utilize in-context learning to improve real-time adaptive machine translation.
We conduct experiments across five diverse language pairs, namely English-to-Arabic (EN-AR), English-to-Chinese (EN-ZH), English-to-French (EN-FR), English-to-Kinyarwanda (EN-RW), and English-to-Spanish (EN-ES).
arXiv Detail & Related papers (2023-01-30T21:17:15Z)