Multilingual Sentence-Level Semantic Search using Meta-Distillation
Learning
- URL: http://arxiv.org/abs/2309.08185v1
- Date: Fri, 15 Sep 2023 06:22:37 GMT
- Title: Multilingual Sentence-Level Semantic Search using Meta-Distillation
Learning
- Authors: Meryem M'hamdi, Jonathan May, Franck Dernoncourt, Trung Bui, and
Seunghyun Yoon
- Abstract summary: Multilingual semantic search is less explored and more challenging than its monolingual or bilingual counterparts.
We propose an alignment approach: MAML-Align, specifically for low-resource scenarios.
Our results show that our meta-distillation approach boosts the gains provided by MAML and significantly outperforms naive fine-tuning methods.
- Score: 73.69884850632431
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multilingual semantic search is the task of retrieving content relevant
to a query expressed in different language combinations. This requires a better
semantic understanding of the user's intent and its contextual meaning.
Multilingual semantic search is less explored and more challenging than its
monolingual or bilingual counterparts, due to the lack of multilingual parallel
resources for this task and the need to circumvent "language bias". In this
work, we propose an alignment approach: MAML-Align, specifically for
low-resource scenarios. Our approach leverages meta-distillation learning based
on MAML, an optimization-based Model-Agnostic Meta-Learner. MAML-Align distills
knowledge from a Teacher meta-transfer model T-MAML, specialized in
transferring from monolingual to bilingual semantic search, to a Student model
S-MAML, which meta-transfers from bilingual to multilingual semantic search. To
the best of our knowledge, we are the first to extend meta-distillation to a
multilingual search application. Our empirical results show that on top of a
strong baseline based on sentence transformers, our meta-distillation approach
boosts the gains provided by MAML and significantly outperforms naive
fine-tuning methods. Furthermore, multilingual meta-distillation learning
improves generalization even to unseen languages.
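To make the teacher-student setup concrete, below is a minimal PyTorch sketch of MAML-style meta-distillation for retrieval: a teacher encoder is adapted in an inner loop and its query-set similarity scores serve as soft targets for a student, whose outer update combines an in-batch ranking loss with a KL-divergence distillation term. This is not the authors' implementation; the toy SentenceEncoder, the episode sampler sample_episode, and the weight alpha are illustrative assumptions, and in the paper the teacher (T-MAML) is specialized for monolingual-to-bilingual transfer while the student (S-MAML) meta-transfers from bilingual to multilingual search.

```python
# Hedged sketch of MAML-based meta-distillation for semantic search (not the
# authors' code). The encoder, episode sampler, and alpha are assumptions.
import torch
import torch.nn.functional as F
from torch import nn
from torch.func import functional_call


class SentenceEncoder(nn.Module):
    """Toy stand-in for a sentence-transformer encoder (the paper's baseline)."""

    def __init__(self, dim: int = 32):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # L2-normalised embeddings so dot products behave like cosine scores.
        return F.normalize(self.proj(x), dim=-1)


def pair_scores(model, params, queries, docs):
    """Query-document similarity matrix under an explicit parameter dict."""
    q = functional_call(model, params, (queries,))
    d = functional_call(model, params, (docs,))
    return q @ d.t()


def ranking_loss(scores):
    """In-batch contrastive loss: query i should rank document i first."""
    return F.cross_entropy(scores, torch.arange(scores.size(0)))


def inner_adapt(model, support, inner_lr=0.1, create_graph=True):
    """One MAML inner step: return parameters adapted on the support set."""
    params = dict(model.named_parameters())
    loss = ranking_loss(pair_scores(model, params, *support))
    grads = torch.autograd.grad(loss, list(params.values()), create_graph=create_graph)
    return {name: p - inner_lr * g for (name, p), g in zip(params.items(), grads)}


def sample_episode(batch=8, dim=32):
    """Hypothetical episode sampler; real episodes would pair queries and
    documents drawn from monolingual, bilingual, or multilingual corpora."""
    support = (torch.randn(batch, dim), torch.randn(batch, dim))
    query = (torch.randn(batch, dim), torch.randn(batch, dim))
    return support, query


teacher = SentenceEncoder()  # T-MAML: meta-trained beforehand on mono -> bilingual episodes
student = SentenceEncoder()  # S-MAML: meta-transfers bilingual -> multilingual
meta_opt = torch.optim.Adam(student.parameters(), lr=1e-3)
alpha = 0.5                  # assumed weight balancing task loss and distillation

for step in range(10):
    support, (q_query, d_query) = sample_episode()

    # Teacher adapts on the episode; its query-set scores become soft targets.
    t_params = inner_adapt(teacher, support, create_graph=False)
    with torch.no_grad():
        t_scores = pair_scores(teacher, t_params, q_query, d_query)

    # Student adapts on the same episode; create_graph=True lets the outer
    # loss differentiate through the inner update (second-order MAML).
    s_params = inner_adapt(student, support, create_graph=True)
    s_scores = pair_scores(student, s_params, q_query, d_query)

    # Outer objective: query-set ranking loss plus KL distillation to the teacher.
    task_loss = ranking_loss(s_scores)
    distill_loss = F.kl_div(
        F.log_softmax(s_scores, dim=-1),
        F.softmax(t_scores, dim=-1),
        reduction="batchmean",
    )
    (task_loss + alpha * distill_loss).backward()
    meta_opt.step()
    meta_opt.zero_grad()
```

Using create_graph only for the student keeps the teacher's soft targets detached from the meta-update, so distillation guides the student without any gradient flowing back into the teacher.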
Related papers
- Crosslingual Capabilities and Knowledge Barriers in Multilingual Large Language Models [62.91524967852552]
Large language models (LLMs) are typically multilingual due to pretraining on diverse multilingual corpora.
But can these models relate corresponding concepts across languages, effectively being crosslingual?
This study evaluates six state-of-the-art LLMs on inherently crosslingual tasks.
arXiv Detail & Related papers (2024-06-23T15:15:17Z) - MINERS: Multilingual Language Models as Semantic Retrievers [23.686762008696547]
This paper introduces MINERS, a benchmark designed to evaluate the ability of multilingual language models on semantic retrieval tasks.
We create a comprehensive framework to assess the robustness of LMs in retrieving samples across over 200 diverse languages.
Our results demonstrate that solely retrieving semantically similar embeddings yields performance competitive with state-of-the-art approaches.
arXiv Detail & Related papers (2024-06-11T16:26:18Z) - A Survey on Multilingual Large Language Models: Corpora, Alignment, and Bias [5.104497013562654]
We present an overview of MLLMs, covering their evolution, key techniques, and multilingual capacities.
We explore widely utilized multilingual corpora for MLLMs' training and multilingual datasets oriented toward downstream tasks.
We discuss bias in MLLMs, including its categories and evaluation metrics, and summarize existing debiasing techniques.
arXiv Detail & Related papers (2024-04-01T05:13:56Z) - MetaXLR -- Mixed Language Meta Representation Transformation for
Low-resource Cross-lingual Learning based on Multi-Armed Bandit [0.0]
We propose an enhanced approach which uses multiple source languages chosen in a data-driven manner.
We achieve state-of-the-art results on the NER task for extremely low-resource languages while using the same amount of data.
arXiv Detail & Related papers (2023-05-31T18:22:33Z) - LVP-M3: Language-aware Visual Prompt for Multilingual Multimodal Machine
Translation [94.33019040320507]
Multimodal Machine Translation (MMT) focuses on enhancing text-only translation with visual features.
Recent approaches still train a separate model for each language pair, which is costly and unaffordable as the number of languages increases.
We propose the Multilingual MMT task by establishing two new Multilingual MMT benchmark datasets covering seven languages.
arXiv Detail & Related papers (2022-10-19T12:21:39Z) - X-METRA-ADA: Cross-lingual Meta-Transfer Learning Adaptation to Natural
Language Understanding and Question Answering [55.57776147848929]
We propose X-METRA-ADA, a cross-lingual MEta-TRAnsfer learning ADAptation approach for Natural Language Understanding (NLU).
Our approach adapts MAML, an optimization-based meta-learning approach, to learn to adapt to new languages.
We show that our approach outperforms naive fine-tuning, reaching competitive performance on both tasks for most languages.
arXiv Detail & Related papers (2021-04-20T00:13:35Z) - MetaXL: Meta Representation Transformation for Low-resource
Cross-lingual Learning [91.5426763812547]
Cross-lingual transfer learning is one of the most effective methods for building functional NLP systems for low-resource languages.
We propose MetaXL, a meta-learning based framework that learns to transform representations judiciously from auxiliary languages to a target one.
arXiv Detail & Related papers (2021-04-16T06:15:52Z) - DICT-MLM: Improved Multilingual Pre-Training using Bilingual
Dictionaries [8.83363871195679]
Existing multilingual pre-trained models rely on the masked language modeling (MLM) objective as their key language learning objective.
DICT-MLM works by incentivizing the model to be able to predict not just the original masked word, but potentially any of its cross-lingual synonyms as well.
Our empirical analysis on multiple downstream tasks spanning 30+ languages, demonstrates the efficacy of the proposed approach.
arXiv Detail & Related papers (2020-10-23T17:53:11Z) - FILTER: An Enhanced Fusion Method for Cross-lingual Language
Understanding [85.29270319872597]
We propose an enhanced fusion method that takes cross-lingual data as input for XLM finetuning.
During inference, the model makes predictions based on the text input in the target language and its translation in the source language.
We further propose an additional KL-divergence self-teaching loss for model training, based on auto-generated soft pseudo-labels for translated text in the target language.
arXiv Detail & Related papers (2020-09-10T22:42:15Z)