How Multilingual Are Large Language Models Fine-Tuned for Translation?
- URL: http://arxiv.org/abs/2405.20512v1
- Date: Thu, 30 May 2024 22:08:20 GMT
- Title: How Multilingual Are Large Language Models Fine-Tuned for Translation?
- Authors: Aquia Richburg, Marine Carpuat
- Abstract summary: Fine-tuning large language models (LLMs) on parallel text has been shown to outperform dedicated translation systems trained in a supervised fashion on much larger amounts of parallel data.
How does translation fine-tuning impact the MT capabilities of LLMs for zero-shot languages, zero-shot language pairs, and translation tasks that do not involve English?
We find that translation fine-tuning improves translation quality even for zero-shot languages on average, but that the impact is uneven depending on the language pairs involved.
- Score: 13.612090779277281
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A new paradigm for machine translation has recently emerged: fine-tuning large language models (LLMs) on parallel text has been shown to outperform dedicated translation systems trained in a supervised fashion on much larger amounts of parallel data (Xu et al., 2024a; Alves et al., 2024). However, it remains unclear whether this paradigm can enable massively multilingual machine translation or whether it requires fine-tuning dedicated models for a small number of language pairs. How does translation fine-tuning impact the MT capabilities of LLMs for zero-shot languages, zero-shot language pairs, and translation tasks that do not involve English? To address these questions, we conduct an extensive empirical evaluation of the translation quality of the TOWER family of language models (Alves et al., 2024) on 132 translation tasks from the multi-parallel FLORES-200 data. We find that translation fine-tuning improves translation quality even for zero-shot languages on average, but that the impact is uneven depending on the language pairs involved. These results call for further research to effectively enable massively multilingual translation with LLMs.
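As a quick way to get a feel for this evaluation setup, the sketch below translates a small sample of FLORES-200 devtest sentences with a Tower-family checkpoint and scores the output with chrF. The checkpoint name (Unbabel/TowerInstruct-7B-v0.1), the facebook/flores dataset ID and its field names, and the plain prompt format are assumptions for illustration, not the paper's exact pipeline.
```python
# Hedged sketch: translate a few FLORES-200 devtest sentences with a Tower-family
# model and score with chrF. Checkpoint, dataset ID, and prompt are assumptions.
from datasets import load_dataset
from sacrebleu.metrics import CHRF
from transformers import pipeline

src_lang, tgt_lang = "eng_Latn", "por_Latn"
# Assumed Hub dataset ID and pair-config naming for FLORES-200.
data = load_dataset("facebook/flores", f"{src_lang}-{tgt_lang}", split="devtest")

generator = pipeline(
    "text-generation",
    model="Unbabel/TowerInstruct-7B-v0.1",  # assumed checkpoint name
    device_map="auto",
)

hypotheses, references = [], []
for row in data.select(range(32)):  # a small sample keeps the sketch cheap
    prompt = (
        "Translate the following text from English into Portuguese.\n"
        f"English: {row[f'sentence_{src_lang}']}\nPortuguese:"
    )
    out = generator(prompt, max_new_tokens=128, do_sample=False,
                    return_full_text=False)[0]["generated_text"]
    hypotheses.append(out.strip())
    references.append(row[f"sentence_{tgt_lang}"])

print("chrF:", CHRF().corpus_score(hypotheses, [references]).score)
```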
Related papers
- The Fine-Tuning Paradox: Boosting Translation Quality Without Sacrificing LLM Abilities [18.175795328685986]
Fine-tuning large language models (LLMs) for machine translation has shown improvements in overall translation quality.
We perform an extensive translation evaluation on the LLaMA and Falcon families of models, with model sizes ranging from 7 billion up to 65 billion parameters.
We observe a decline in the ability to perform formality steering, to produce technical translations through few-shot examples, and to perform document-level translation.
arXiv Detail & Related papers (2024-05-30T14:25:56Z)
- GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators [45.49880507108965]
"GenTranslate" builds upon large language models to generate better results from diverse translation versions in N-best list.
Our new paradigm can integrate the rich information in N-best candidates to generate a higher-quality translation result.
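A minimal sketch of the underlying idea: hand the N-best hypotheses from a base translator to an instruction-tuned LLM and ask for one refined output. The checkpoint name and prompt wording below are illustrative assumptions, not GenTranslate's released recipe.
```python
# Hedged sketch of N-best integration with a generic instruction-tuned LLM.
# The checkpoint and prompt are illustrative; they are not GenTranslate's recipe.
from transformers import pipeline

llm = pipeline("text-generation", model="meta-llama/Llama-2-7b-chat-hf",
               device_map="auto")

def integrate_nbest(source: str, nbest: list[str], tgt_lang: str = "German") -> str:
    candidates = "\n".join(f"{i + 1}. {hyp}" for i, hyp in enumerate(nbest))
    prompt = (
        f"Source sentence: {source}\n"
        f"Candidate {tgt_lang} translations:\n{candidates}\n"
        f"Combining the strengths of the candidates, output one improved "
        f"{tgt_lang} translation:\n"
    )
    out = llm(prompt, max_new_tokens=128, do_sample=False,
              return_full_text=False)[0]["generated_text"]
    return out.strip()
```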
arXiv Detail & Related papers (2024-02-10T07:20:49Z)
- Towards Boosting Many-to-Many Multilingual Machine Translation with Large Language Models [47.39529535727593]
This paper focuses on boosting many-to-many multilingual translation of large language models (LLMs) with an emphasis on zero-shot translation directions.
We introduce a cross-lingual consistency regularization, XConST, to bridge the representation gap among different languages.
Experimental results on ALMA, Tower, and LLaMA-2 show that our approach consistently improves translation performance.
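To make the idea concrete, here is a generic agreement-style regularizer: a symmetric KL term that pushes the model toward the same target distribution whether it reads a sentence in language A or its translation in language B. This is an illustration of the general technique, not necessarily XConST's exact formulation.
```python
# Generic cross-lingual consistency sketch (not necessarily XConST's exact loss).
# `model` is any Hugging Face style encoder-decoder MT model returning logits of
# shape (batch, tgt_len, vocab); src_a and src_b are tokenized, semantically
# equivalent source sentences in two different languages.
import torch.nn.functional as F

def consistency_loss(model, src_a, src_b, tgt_labels):
    log_p_a = F.log_softmax(model(**src_a, labels=tgt_labels).logits, dim=-1)
    log_p_b = F.log_softmax(model(**src_b, labels=tgt_labels).logits, dim=-1)
    # Symmetric KL between the two predictive distributions over target tokens.
    kl_ab = F.kl_div(log_p_a, log_p_b, log_target=True, reduction="batchmean")
    kl_ba = F.kl_div(log_p_b, log_p_a, log_target=True, reduction="batchmean")
    return 0.5 * (kl_ab + kl_ba)
```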
arXiv Detail & Related papers (2024-01-11T12:11:30Z)
- The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants [80.4837840962273]
We present Belebele, a dataset spanning 122 language variants.
This dataset enables the evaluation of text models in high-, medium-, and low-resource languages.
arXiv Detail & Related papers (2023-08-31T17:43:08Z)
- Eliciting the Translation Ability of Large Language Models via Multilingual Finetuning with Translation Instructions [68.01449013641532]
Large-scale Pretrained Language Models (LLMs) have shown strong abilities in multilingual translation.
We present a detailed analysis by finetuning a multilingual pretrained language model, XGLM-7B, to perform multilingual translation.
arXiv Detail & Related papers (2023-05-24T12:00:24Z)
- Towards the Next 1000 Languages in Multilingual Machine Translation: Exploring the Synergy Between Supervised and Self-Supervised Learning [48.15259834021655]
We present a pragmatic approach towards building a multilingual machine translation model that covers hundreds of languages.
We use a mixture of supervised and self-supervised objectives, depending on the data availability for different language pairs.
We demonstrate that the synergy between these two training paradigms enables the model to produce high-quality translations in the zero-resource setting.
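Sketched below is one way such data-dependent mixing could be organized; the two loss methods are hypothetical placeholders, and the paper's actual objectives may differ.
```python
# Hedged sketch of mixing objectives by data availability. `translation_loss` and
# `denoising_loss` are hypothetical placeholders for a supervised MT objective and
# a self-supervised (e.g., denoising) objective on monolingual text.
def training_loss(model, example):
    if example.get("tgt_text") is not None:
        # Parallel data exists for this pair: use the supervised translation loss.
        return model.translation_loss(example["src_text"], example["tgt_text"])
    # Monolingual text only: fall back to the self-supervised objective.
    return model.denoising_loss(example["src_text"])
```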
arXiv Detail & Related papers (2022-01-09T23:36:44Z)
- Beyond English-Centric Multilingual Machine Translation [74.21727842163068]
We create a true Many-to-Many multilingual translation model that can translate directly between any pair of 100 languages.
We build and open source a training dataset that covers thousands of language directions with supervised data, created through large-scale mining.
Our focus on non-English-centric models brings gains of more than 10 BLEU when translating directly between non-English directions, while performing competitively with the best single systems of WMT.
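The model described in this paper was released as M2M-100, so the direct non-English translation it advertises is easy to try; below is a minimal example with the publicly available facebook/m2m100_418M checkpoint on Hugging Face (the small checkpoint is our choice for illustration).
```python
# Minimal direct French -> German translation (no English pivot) with the
# released M2M-100 418M checkpoint from the Hugging Face Hub.
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

model = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M")
tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M")

tokenizer.src_lang = "fr"
encoded = tokenizer("La vie est comme une boîte de chocolat.", return_tensors="pt")
generated = model.generate(**encoded,
                           forced_bos_token_id=tokenizer.get_lang_id("de"))
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```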
arXiv Detail & Related papers (2020-10-21T17:01:23Z)
- Improving Massively Multilingual Neural Machine Translation and Zero-Shot Translation [81.7786241489002]
Massively multilingual models for neural machine translation (NMT) are theoretically attractive, but often underperform bilingual models and deliver poor zero-shot translations.
We argue that multilingual NMT requires stronger modeling capacity to support language pairs with varying typological characteristics.
We propose random online backtranslation to enforce the translation of unseen training language pairs.
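A hedged sketch of how random online backtranslation can be organized: for each training example, the target side is back-translated into a randomly chosen language with the current model, creating a synthetic pair for an otherwise unseen direction. The helper methods model.translate and model.train_on are hypothetical placeholders, not the paper's code.
```python
# Hedged sketch of random online backtranslation; `model.translate` and
# `model.train_on` are hypothetical helpers standing in for a real training loop.
import random

def robt_step(model, batch, languages):
    synthetic = []
    for ex in batch:
        z = random.choice(languages)  # pick a random intermediate language
        # Back-translate the target side into language z with the current model.
        src_in_z = model.translate(ex["tgt_text"], src=ex["tgt_lang"], tgt=z)
        # The synthetic pair covers the (possibly unseen) direction z -> tgt_lang.
        synthetic.append({"src_text": src_in_z, "src_lang": z,
                          "tgt_text": ex["tgt_text"], "tgt_lang": ex["tgt_lang"]})
    # Train jointly on the original and synthetic pairs.
    model.train_on(batch + synthetic)
```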
arXiv Detail & Related papers (2020-04-24T17:21:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.