RLHF Can Speak Many Languages: Unlocking Multilingual Preference Optimization for LLMs
- URL: http://arxiv.org/abs/2407.02552v1
- Date: Tue, 2 Jul 2024 17:42:30 GMT
- Title: RLHF Can Speak Many Languages: Unlocking Multilingual Preference Optimization for LLMs
- Authors: John Dang, Arash Ahmadian, Kelly Marchisio, Julia Kreutzer, Ahmet Üstün, Sara Hooker
- Abstract summary: We introduce a novel, scalable method for generating high-quality multilingual feedback data.
Our preference-trained model achieves a 54.4% win-rate against Aya 23 8B.
As a result of our study, we expand the frontier of alignment techniques to 23 languages covering half of the world's population.
- Score: 13.563021984882704
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Preference optimization techniques have become a standard final stage for training state-of-the-art large language models (LLMs). However, despite widespread adoption, the vast majority of work to date has focused on first-class citizen languages like English and Chinese. This not only captures a small fraction of the languages in the world, but also makes it unclear which aspects of current state-of-the-art research transfer to a multilingual setting. In this work, we perform an exhaustive study to achieve a new state of the art in aligning multilingual LLMs. We introduce a novel, scalable method for generating high-quality multilingual feedback data to balance data coverage. We establish the benefits of cross-lingual transfer and increased dataset size in preference training. Our preference-trained model achieves a 54.4% win-rate against Aya 23 8B, the current state-of-the-art multilingual LLM in its parameter class, and a 69.5% win-rate or higher against widely used models like Gemma-1.1-7B-it, Llama-3-8B-Instruct, and Mistral-7B-Instruct-v0.3. As a result of our study, we expand the frontier of alignment techniques to 23 languages covering half of the world's population.
Related papers
- Crosslingual Capabilities and Knowledge Barriers in Multilingual Large Language Models [62.91524967852552]
Large language models (LLMs) are typically multilingual due to pretraining on diverse multilingual corpora.
But can these models relate corresponding concepts across languages, effectively being crosslingual?
This study evaluates six state-of-the-art LLMs on inherently crosslingual tasks.
arXiv Detail & Related papers (2024-06-23T15:15:17Z)
- SambaLingo: Teaching Large Language Models New Languages [16.709876506515837]
We present a comprehensive investigation into the adaptation of LLMs to new languages.
Our study covers the key components in this process, including vocabulary extension and direct preference optimization.
We scale these experiments across 9 languages and 2 parameter scales.
arXiv Detail & Related papers (2024-04-08T19:48:36Z)
- Extrapolating Large Language Models to Non-English by Aligning Languages [109.09051737966178]
Existing large language models show disparate capability across different languages.
In this paper, we empower pre-trained LLMs on non-English languages by building semantic alignment across languages.
arXiv Detail & Related papers (2023-08-09T13:32:06Z)
- PolyLM: An Open Source Polyglot Large Language Model [57.64420154135178]
We present PolyLM, a multilingual large language model (LLM) trained on 640 billion (B) tokens, available in two model sizes: 1.7B and 13B.
To enhance its multilingual capabilities, we 1) integrate bilingual data into training data; and 2) adopt a curriculum learning strategy that increases the proportion of non-English data from 30% in the first stage to 60% in the final stage during pre-training.
Further, we propose a multilingual self-instruct method which automatically generates 132.7K diverse multilingual instructions for model fine-tuning.
arXiv Detail & Related papers (2023-07-12T09:00:37Z)
- Romanization-based Large-scale Adaptation of Multilingual Language Models [124.57923286144515]
Large multilingual pretrained language models (mPLMs) have become the de facto state of the art for cross-lingual transfer in NLP.
We study and compare a plethora of data- and parameter-efficient strategies for adapting the mPLMs to romanized and non-romanized corpora of 14 diverse low-resource languages.
Our results reveal that UROMAN-based transliteration can offer strong performance for many languages, with particular gains achieved in the most challenging setups.
arXiv Detail & Related papers (2023-04-18T09:58:34Z)
- Massively Multilingual Shallow Fusion with Large Language Models [62.76735265311028]
We train a single multilingual language model (LM) for shallow fusion in multiple languages.
Compared to a dense LM of similar computation during inference, GLaM reduces the WER of an English long-tail test set by 4.4% relative.
In a multilingual shallow fusion task, GLaM improves 41 out of 50 languages with an average relative WER reduction of 3.85%, and a maximum reduction of 10%.
arXiv Detail & Related papers (2023-02-17T14:46:38Z)
- Por Qué Não Utiliser Alla Språk? Mixed Training with Gradient Optimization in Few-Shot Cross-Lingual Transfer [2.7213511121305465]
We propose a one-step mixed training method that trains on both source and target data.
We use one model to handle all target languages simultaneously to avoid excessively language-specific models.
Our proposed method achieves state-of-the-art performance on all tasks and outperforms target-adapting by a large margin.
arXiv Detail & Related papers (2022-04-29T04:05:02Z)
- Multilingual Language Model Adaptive Fine-Tuning: A Study on African Languages [19.067718464786463]
We perform multilingual adaptive fine-tuning (MAFT) on the 17 most-resourced African languages and three other high-resource languages widely spoken on the African continent.
To further specialize the multilingual PLM, we removed vocabulary tokens from the embedding layer that correspond to non-African writing scripts before MAFT.
Our approach is competitive with applying language adaptive fine-tuning (LAFT) to individual languages while requiring significantly less disk space.
arXiv Detail & Related papers (2022-04-13T16:13:49Z)
- Multilingual Speech Translation with Efficient Finetuning of Pretrained Models [82.22294901727933]
A minimalistic LNA (LayerNorm and Attention) finetuning can achieve zero-shot crosslingual and cross-modality transfer ability.
Our approach demonstrates strong zero-shot performance in a many-to-many multilingual model.
arXiv Detail & Related papers (2020-10-24T08:15:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.