Parameter-Efficient Cross-lingual Transfer of Vision and Language Models
via Translation-based Alignment
- URL: http://arxiv.org/abs/2305.03510v2
- Date: Sat, 28 Oct 2023 18:38:47 GMT
- Title: Parameter-Efficient Cross-lingual Transfer of Vision and Language Models
via Translation-based Alignment
- Authors: Zhen Zhang, Jialu Wang, Xin Eric Wang
- Abstract summary: Pre-trained vision and language models such as CLIP have witnessed remarkable success in connecting images and texts with a primary focus on English texts.
However, disparities in performance among different languages have been observed due to uneven resource availability.
We propose a new parameter-efficient cross-lingual transfer learning framework that utilizes a translation-based alignment method to mitigate multilingual disparities.
- Score: 31.885608173448368
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pre-trained vision and language models such as CLIP have witnessed remarkable
success in connecting images and texts with a primary focus on English texts.
Despite recent efforts to extend CLIP to support other languages, disparities
in performance among different languages have been observed due to uneven
resource availability. Additionally, current cross-lingual transfer methods of
those pre-trained models would consume excessive resources for a large number
of languages. Therefore, we propose a new parameter-efficient cross-lingual
transfer learning framework that utilizes a translation-based alignment method
to mitigate multilingual disparities and explores parameter-efficient
fine-tuning methods for cross-lingual transfer. Extensive
experiments on XTD and Multi30K datasets, covering 11 languages under
zero-shot, few-shot, and full-dataset learning scenarios, show that our
framework significantly reduces the multilingual disparities among languages
and improves cross-lingual transfer results, especially in low-resource
scenarios, while only keeping and fine-tuning an extremely small number of
parameters compared to the full model (e.g., our framework only requires 0.16\%
additional parameters of the full model for each language in the few-shot
learning scenario). The code is available at
\url{https://github.com/eric-ai-lab/PECTVLM}.
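To make the abstract's two components concrete, below is a minimal, illustrative PyTorch sketch (not the released PECTVLM code) of translation-based alignment combined with parameter-efficient fine-tuning: the frozen multilingual text encoder's embedding of a machine-translated caption is pulled toward the embedding of its English source, and only a small low-rank adapter is trained. The adapter design, embedding dimension, and cosine-distance alignment loss are assumptions made for illustration, not details taken from the paper.

```python
# Minimal sketch (assumed details, not the authors' released code):
# translation-based alignment trained through a small low-rank adapter while
# the multilingual text encoder itself stays frozen. The encoder is stubbed
# out with pre-computed embeddings; the dimension, adapter rank, and
# cosine-distance loss are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LowRankAdapter(nn.Module):
    """Trainable low-rank residual applied to frozen text embeddings."""

    def __init__(self, dim: int, rank: int = 8):
        super().__init__()
        self.down = nn.Linear(dim, rank, bias=False)
        self.up = nn.Linear(rank, dim, bias=False)
        nn.init.zeros_(self.up.weight)  # start as an identity mapping

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(self.down(x))


def alignment_loss(target_emb: torch.Tensor, english_emb: torch.Tensor) -> torch.Tensor:
    """Cosine-distance loss pulling target-language embeddings toward English ones."""
    target_emb = F.normalize(target_emb, dim=-1)
    english_emb = F.normalize(english_emb, dim=-1)
    return (1.0 - (target_emb * english_emb).sum(dim=-1)).mean()


if __name__ == "__main__":
    dim, batch = 512, 4

    # Stand-ins for frozen encoder outputs: embeddings of English captions and
    # of their machine translations into one target language.
    english_emb = torch.randn(batch, dim)
    translated_emb = torch.randn(batch, dim)

    adapter = LowRankAdapter(dim)  # the only trainable parameters
    optimizer = torch.optim.AdamW(adapter.parameters(), lr=1e-3)

    for _ in range(100):
        loss = alignment_loss(adapter(translated_emb), english_emb)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    print("trainable adapter parameters:", sum(p.numel() for p in adapter.parameters()))
```

In a setup like this, only the adapter would need to be stored and fine-tuned for each language, which mirrors the small per-language overhead the abstract describes.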
Related papers
- Soft Language Clustering for Multilingual Model Pre-training [57.18058739931463]
We propose XLM-P, which contextually retrieves prompts as flexible guidance for encoding instances conditionally.
Our XLM-P enables (1) lightweight modeling of language-invariant and language-specific knowledge across languages, and (2) easy integration with other multilingual pre-training methods.
arXiv Detail & Related papers (2023-06-13T08:08:08Z) - Cross-Lingual Transfer Learning for Phrase Break Prediction with
Multilingual Language Model [13.730152819942445]
Cross-lingual transfer learning can be particularly effective for improving performance in low-resource languages.
This suggests that cross-lingual transfer can be an inexpensive and effective way to develop TTS front-ends for resource-poor languages.
arXiv Detail & Related papers (2023-06-05T04:10:04Z) - Efficiently Aligned Cross-Lingual Transfer Learning for Conversational
Tasks using Prompt-Tuning [98.60739735409243]
Cross-lingual transfer of language models trained on high-resource languages like English has been widely studied for many NLP tasks.
We introduce XSGD for cross-lingual alignment pretraining, a parallel and large-scale multilingual conversation dataset.
To facilitate aligned cross-lingual representations, we develop an efficient prompt-tuning-based method for learning alignment prompts.
arXiv Detail & Related papers (2023-04-03T18:46:01Z) - A Simple and Effective Method to Improve Zero-Shot Cross-Lingual
Transfer Learning [6.329304732560936]
Existing zero-shot cross-lingual transfer methods rely on parallel corpora or bilingual dictionaries.
We propose Embedding-Push, Attention-Pull, and Robust targets to transfer English embeddings to virtual multilingual embeddings without semantic loss.
arXiv Detail & Related papers (2022-10-18T15:36:53Z) - Learning Disentangled Semantic Representations for Zero-Shot
Cross-Lingual Transfer in Multilingual Machine Reading Comprehension [40.38719019711233]
Multilingual pre-trained models are able to transfer knowledge zero-shot from rich-resource languages to low-resource languages in machine reading comprehension (MRC).
In this paper, we propose a novel multilingual MRC framework equipped with a Siamese Semantic Disentanglement Model (SSDM) to disassociate semantics from syntax in representations learned by multilingual pre-trained models.
arXiv Detail & Related papers (2022-04-03T05:26:42Z) - UNKs Everywhere: Adapting Multilingual Language Models to New Scripts [103.79021395138423]
Massively multilingual language models such as multilingual BERT (mBERT) and XLM-R offer state-of-the-art cross-lingual transfer performance on a range of NLP tasks.
Due to their limited capacity and large differences in pretraining data, there is a profound performance gap between resource-rich and resource-poor target languages.
We propose novel data-efficient methods that enable quick and effective adaptation of pretrained multilingual models to such low-resource languages and unseen scripts.
arXiv Detail & Related papers (2020-12-31T11:37:28Z) - Cross-lingual Machine Reading Comprehension with Language Branch
Knowledge Distillation [105.41167108465085]
Cross-lingual Machine Reading Comprehension (CLMRC) remains a challenging problem due to the lack of large-scale datasets in low-resource languages.
We propose a novel augmentation approach named Language Branch Machine Reading Comprehension (LBMRC).
LBMRC trains multiple machine reading comprehension (MRC) models, each proficient in an individual language.
We devise a multilingual distillation approach to amalgamate knowledge from multiple language branch models to a single model for all target languages.
arXiv Detail & Related papers (2020-10-27T13:12:17Z) - XCOPA: A Multilingual Dataset for Causal Commonsense Reasoning [68.57658225995966]
Cross-lingual Choice of Plausible Alternatives (XCOPA) is a typologically diverse multilingual dataset for causal commonsense reasoning in 11 languages.
We evaluate a range of state-of-the-art models on this novel dataset, revealing that the performance of current methods falls short compared to translation-based transfer.
arXiv Detail & Related papers (2020-05-01T12:22:33Z) - Learning to Scale Multilingual Representations for Vision-Language Tasks [51.27839182889422]
The effectiveness of SMALR is demonstrated with ten diverse languages, over twice the number supported in vision-language tasks to date.
We evaluate on multilingual image-sentence retrieval and outperform prior work by 3-4% with less than 1/5th the training parameters compared to other word embedding methods.
arXiv Detail & Related papers (2020-04-09T01:03:44Z)