BigTranslate: Augmenting Large Language Models with Multilingual
Translation Capability over 100 Languages
- URL: http://arxiv.org/abs/2305.18098v3
- Date: Tue, 21 Nov 2023 11:01:24 GMT
- Title: BigTranslate: Augmenting Large Language Models with Multilingual
Translation Capability over 100 Languages
- Authors: Wen Yang, Chong Li, Jiajun Zhang, Chengqing Zong
- Abstract summary: We present BigTranslate, which adapts LLaMA, a model that covers only 20 languages, and enhances it with multilingual translation capability for more than 100 languages.
BigTranslate is built upon LLaMA-13B and is optimized in three steps. First, we continue training LLaMA with massive Chinese monolingual data. Second, we continue training the model with a large-scale parallel dataset that covers 102 natural languages. Third, we instruct-tune the foundation model with multilingual translation instructions, leading to our BigTranslate model.
- Score: 47.99695189331567
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) demonstrate promising translation performance
across various natural languages. However, many LLMs, especially open-sourced
ones such as BLOOM and LLaMA, are English-dominant and support only dozens of
natural languages, leaving the potential of LLMs for language translation
underexplored. In this work, we present BigTranslate, which adapts LLaMA, a
model that covers only 20 languages, and enhances it with multilingual
translation capability for more than 100 languages. BigTranslate is built upon
LLaMA-13B and is
optimized in three steps. First, we continue training LLaMA with massive
Chinese monolingual data. Second, we continue training the model with a
large-scale parallel dataset that covers 102 natural languages. Third, we
instruct-tune the foundation model with multilingual translation instructions,
leading to our BigTranslate model. The preliminary experiments on multilingual
translation show that BigTranslate performs comparably with ChatGPT and Google
Translate in many languages and even outperforms ChatGPT in 8 language pairs.
We release the BigTranslate model and hope it can advance research progress.
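As a concrete illustration of the third step, here is a minimal sketch of how translation instructions might be formatted for instruction tuning. The template, field names, and example sentence pairs are hypothetical: the abstract does not specify BigTranslate's exact instruction wording, only that the model is tuned on multilingual translation instructions over the 102-language parallel data.

```python
# Hypothetical sketch of step 3 (multilingual translation instruction tuning).
# The instruction template below is an illustrative assumption, not the
# paper's exact format.

def build_instruction(src_lang: str, tgt_lang: str,
                      src_text: str, tgt_text: str) -> dict:
    """Wrap one parallel sentence pair as a prompt/completion example."""
    prompt = (
        f"Translate the following {src_lang} sentence into {tgt_lang}.\n"
        f"{src_lang}: {src_text}\n"
        f"{tgt_lang}:"
    )
    return {"prompt": prompt, "completion": " " + tgt_text}

# One example per language direction, drawn from the parallel data.
examples = [
    build_instruction("Chinese", "English", "今天天气很好。", "The weather is nice today."),
    build_instruction("French", "German", "Bonjour le monde.", "Hallo Welt."),
]
for ex in examples:
    print(ex["prompt"] + ex["completion"])
```

Steps 1 and 2, by contrast, are plain continued pre-training, so they would need no special formatting beyond the usual causal-language-modeling setup; only the data mix changes.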
Related papers
- LLaMAX: Scaling Linguistic Horizons of LLM by Enhancing Translation Capabilities Beyond 100 Languages [36.52198103816494]
Large Language Models (LLMs) demonstrate remarkable translation capabilities in high-resource language tasks, but their performance in low-resource languages is hindered by insufficient multilingual data during pre-training.
We conduct extensive multilingual continual pre-training on the LLaMA series models, enabling translation support across more than 100 languages.
arXiv Detail & Related papers (2024-07-08T14:18:28Z)
- Could We Have Had Better Multilingual LLMs If English Was Not the Central Language? [4.655168524016426]
Large Language Models (LLMs) demonstrate strong machine translation capabilities on languages they are trained on.
Our study delves into Llama2's translation capabilities.
Our experiments show that the 7B Llama2 model yields above 10 BLEU when translating into all languages it has seen.
arXiv Detail & Related papers (2024-02-21T16:32:38Z)
- Extrapolating Large Language Models to Non-English by Aligning Languages [109.09051737966178]
Existing large language models show disparate capability across different languages.
In this paper, we empower pre-trained LLMs on non-English languages by building semantic alignment across languages.
arXiv Detail & Related papers (2023-08-09T13:32:06Z)
- PolyLM: An Open Source Polyglot Large Language Model [57.64420154135178]
We present PolyLM, a multilingual large language model (LLM) trained on 640 billion (B) tokens, available in two model sizes: 1.7B and 13B.
To enhance its multilingual capabilities, we 1) integrate bilingual data into the training data; and 2) adopt a curriculum learning strategy that increases the proportion of non-English data from 30% in the first stage to 60% in the final stage of pre-training (sketched below).
Further, we propose a multilingual self-instruct method which automatically generates 132.7K diverse multilingual instructions for model fine-tuning.
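PolyLM's rising non-English share lends itself to a simple sampling schedule. The sketch below linearly anneals the probability of drawing a non-English example from 30% to 60% over pre-training; the linear interpolation and per-batch sampling are illustrative assumptions, since the abstract defines only the endpoints of the schedule.

```python
import random

def non_english_share(step: int, total_steps: int,
                      start: float = 0.30, end: float = 0.60) -> float:
    """Linearly anneal the non-English sampling probability over training."""
    frac = min(step / max(total_steps - 1, 1), 1.0)
    return start + (end - start) * frac

def sample_batch(step: int, total_steps: int,
                 english_pool: list, non_english_pool: list,
                 batch_size: int = 4) -> list:
    """Draw a batch whose non-English share follows the curriculum."""
    p = non_english_share(step, total_steps)
    return [random.choice(non_english_pool) if random.random() < p
            else random.choice(english_pool)
            for _ in range(batch_size)]
```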
arXiv Detail & Related papers (2023-07-12T09:00:37Z)
- Eliciting the Translation Ability of Large Language Models via Multilingual Finetuning with Translation Instructions [68.01449013641532]
Large-scale pretrained language models (LLMs) have shown strong abilities in multilingual translation.
We present a detailed analysis by finetuning a multilingual pretrained language model, XGLM-7B, to perform multilingual translation.
arXiv Detail & Related papers (2023-05-24T12:00:24Z) - Chain-of-Dictionary Prompting Elicits Translation in Large Language Models [100.47154959254937]
Large language models (LLMs) have shown surprisingly good performance in multilingual neural machine translation (MNMT).
We present a novel method, CoD, which augments LLMs with prior knowledge via chains of multilingual dictionaries for a subset of input words to elicit their translation abilities, as sketched below.
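As a concrete illustration of CoD-style prompting, the sketch below prepends dictionary chains for selected source words before the translation request. The chain format, the word-selection rule, and the choice of auxiliary languages are illustrative assumptions rather than the paper's exact recipe.

```python
def dictionary_chain(word: str, senses: dict) -> str:
    """Render one source word's translations as a chained hint."""
    links = ", which means ".join(f'"{t}" in {lang}' for lang, t in senses.items())
    return f'The word "{word}" means {links}.'

def cod_prompt(src_sentence: str, src_lang: str, tgt_lang: str,
               chains: list) -> str:
    """Prepend dictionary chains to a plain translation instruction."""
    hints = "\n".join(chains)
    return (f"{hints}\n"
            f"Translate the following {src_lang} sentence into {tgt_lang}:\n"
            f"{src_sentence}")

chains = [dictionary_chain("farmhouse",
                           {"French": "ferme", "German": "Bauernhaus"})]
print(cod_prompt("The farmhouse stood on the hill.",
                 "English", "Xhosa", chains))
```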
arXiv Detail & Related papers (2023-05-11T05:19:47Z)
- Beyond English-Centric Multilingual Machine Translation [74.21727842163068]
We create a true Many-to-Many multilingual translation model that can translate directly between any pair of 100 languages.
We build and open source a training dataset that covers thousands of language directions with supervised data, created through large-scale mining.
Our focus on non-English-centric models brings gains of more than 10 BLEU when translating directly between non-English directions, while performing competitively with the best single systems of WMT.
arXiv Detail & Related papers (2020-10-21T17:01:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.