Making Monolingual Sentence Embeddings Multilingual using Knowledge Distillation
- URL: http://arxiv.org/abs/2004.09813v2
- Date: Mon, 5 Oct 2020 06:30:56 GMT
- Title: Making Monolingual Sentence Embeddings Multilingual using Knowledge Distillation
- Authors: Nils Reimers, Iryna Gurevych
- Abstract summary: We present an easy and efficient method to extend existing sentence embedding models to new languages.
This makes it possible to create multilingual versions of previously monolingual models.
- Score: 73.65237422910738
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: We present an easy and efficient method to extend existing sentence embedding models to new languages. This makes it possible to create multilingual versions of previously monolingual models. The training is based on the idea that a translated sentence should be mapped to the same location in the vector space as the original sentence. We use the original (monolingual) model to generate sentence embeddings for the source language and then train a new system on translated sentences to mimic the original model. Compared to other methods for training multilingual sentence embeddings, this approach has several advantages: it is easy to extend existing models to new languages with relatively few samples, it is easier to ensure desired properties for the vector space, and the hardware requirements for training are lower. We demonstrate the effectiveness of our approach for 50+ languages from various language families. Code to extend sentence embedding models to more than 400 languages is publicly available.
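As a hedged illustration of the training idea in the abstract (both a source sentence and its translation should land on the teacher's embedding of the source), the minimal PyTorch sketch below shows one distillation step. The ToyEncoder, the example sentence pair, and the hyperparameters are placeholders introduced here for illustration; this is not the authors' released implementation.

```python
# Minimal sketch (not the authors' released code) of the distillation
# objective described in the abstract: a frozen monolingual teacher gives
# the target embedding for a source sentence, and a multilingual student
# is trained so that both the source sentence and its translation are
# mapped to that same point in the vector space (mean-squared error).

import torch
import torch.nn as nn


class ToyEncoder(nn.Module):
    """Hypothetical stand-in for a real sentence encoder (e.g. a Transformer).

    Uses bag-of-letters features only so the example runs end to end.
    """

    def __init__(self, dim: int = 16):
        super().__init__()
        self.proj = nn.Linear(26, dim)

    def forward(self, sentences):
        feats = torch.zeros(len(sentences), 26)
        for i, sent in enumerate(sentences):
            for ch in sent.lower():
                if "a" <= ch <= "z":
                    feats[i, ord(ch) - ord("a")] += 1.0
        return self.proj(feats)


def distillation_step(teacher, student, optimizer, src_batch, tgt_batch):
    """One step on a batch of (source sentence, translation) pairs."""
    mse = nn.MSELoss()
    with torch.no_grad():              # the teacher stays frozen
        target = teacher(src_batch)    # embeddings of the source sentences

    # Both the source and the translated sentences should mimic the
    # teacher's embedding of the source sentence.
    loss = mse(student(src_batch), target) + mse(student(tgt_batch), target)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    teacher, student = ToyEncoder(), ToyEncoder()
    optim = torch.optim.Adam(student.parameters(), lr=1e-3)
    src = ["The cat sits on the mat."]
    tgt = ["Die Katze sitzt auf der Matte."]
    print(distillation_step(teacher, student, optim, src, tgt))
```

Because only the student receives gradients, the teacher's vector space, and any properties it was trained to have, is preserved while the student learns to place translations at the same points.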
Related papers
- Extending the Subwording Model of Multilingual Pretrained Models for New Languages [31.702393348980735]
In this paper, we add new subwords to the SentencePiece tokenizer to apply a multilingual pretrained model to new languages.
In our experiments, we segmented Inuktitut sentences into subwords without changing the segmentation of already pretrained languages.
arXiv Detail & Related papers (2022-11-29T06:55:34Z)
- Language-Family Adapters for Low-Resource Multilingual Neural Machine Translation [129.99918589405675]
Large multilingual models trained with self-supervision achieve state-of-the-art results in a wide range of natural language processing tasks.
Multilingual fine-tuning improves performance on low-resource languages but requires modifying the entire model and can be prohibitively expensive.
We propose training language-family adapters on top of mBART-50 to facilitate cross-lingual transfer.
arXiv Detail & Related papers (2022-09-30T05:02:42Z)
- WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models [3.6878069324996616]
We introduce a method -- called WECHSEL -- to transfer English models to new languages.
We use WECHSEL to transfer GPT-2 and RoBERTa models to 4 other languages.
arXiv Detail & Related papers (2021-12-13T12:26:02Z)
- Continual Learning in Multilingual NMT via Language-Specific Embeddings [92.91823064720232]
The proposed approach replaces the shared vocabulary with a small language-specific vocabulary and fine-tunes the new embeddings on the new language's parallel data.
Because the parameters of the original model are not modified, its performance on the initial languages does not degrade.
arXiv Detail & Related papers (2021-10-20T10:38:57Z)
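The recipe in the entry above amounts to freezing every original parameter and training only a newly added, language-specific embedding table on the new language's data, which is why performance on the initial languages is preserved. A minimal sketch of that idea, using a small stand-in Transformer encoder and a placeholder loss rather than the paper's actual NMT system:

```python
# Hedged sketch of the recipe above: keep the original model frozen and
# train only a new, language-specific embedding table on the new
# language's data. The model, vocabulary size, dimensions, and loss are
# hypothetical placeholders, not the paper's actual NMT system.

import torch
import torch.nn as nn

# Stand-in for the pretrained multilingual model.
model = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True),
    num_layers=2,
)

for p in model.parameters():
    p.requires_grad = False            # original parameters stay untouched

# New embedding table for the new language's vocabulary.
new_lang_embeddings = nn.Embedding(num_embeddings=8000, embedding_dim=64)

# Only the new embeddings receive gradient updates.
optimizer = torch.optim.Adam(new_lang_embeddings.parameters(), lr=1e-4)

token_ids = torch.randint(0, 8000, (2, 10))     # fake batch of new-language tokens
hidden = model(new_lang_embeddings(token_ids))  # frozen encoder on top of new embeddings
loss = hidden.pow(2).mean()                     # placeholder loss for illustration

optimizer.zero_grad()
loss.backward()
optimizer.step()
```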
- How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models [96.32118305166412]
We study a set of nine typologically diverse languages with readily available pretrained monolingual models on a set of five diverse monolingual downstream tasks.
We find that languages which are adequately represented in the multilingual model's vocabulary exhibit negligible performance decreases over their monolingual counterparts.
arXiv Detail & Related papers (2020-12-31T14:11:00Z)
- UNKs Everywhere: Adapting Multilingual Language Models to New Scripts [103.79021395138423]
Massively multilingual language models such as multilingual BERT (mBERT) and XLM-R offer state-of-the-art cross-lingual transfer performance on a range of NLP tasks.
Due to their limited capacity and large differences in pretraining data, there is a profound performance gap between resource-rich and resource-poor target languages.
We propose novel data-efficient methods that enable quick and effective adaptation of pretrained multilingual models to such low-resource languages and unseen scripts.
arXiv Detail & Related papers (2020-12-31T11:37:28Z)
- Multilingual Translation with Extensible Multilingual Pretraining and Finetuning [77.33262578776291]
Previous work has demonstrated that machine translation systems can be created by finetuning on bitext.
We show that multilingual translation models can be created through multilingual finetuning.
We demonstrate that pretrained models can be extended to incorporate additional languages without loss of performance.
arXiv Detail & Related papers (2020-08-02T05:36:55Z)