Learning Disentangled Semantic Representations for Zero-Shot
Cross-Lingual Transfer in Multilingual Machine Reading Comprehension
- URL: http://arxiv.org/abs/2204.00996v1
- Date: Sun, 3 Apr 2022 05:26:42 GMT
- Title: Learning Disentangled Semantic Representations for Zero-Shot
Cross-Lingual Transfer in Multilingual Machine Reading Comprehension
- Authors: Linjuan Wu, Shaojuan Wu, Xiaowang Zhang, Deyi Xiong, Shizhan Chen,
Zhiqiang Zhuang, Zhiyong Feng
- Abstract summary: Multilingual pre-trained models are able to zero-shot transfer knowledge from rich-resource languages to low-resource languages in machine reading comprehension (MRC).
In this paper, we propose a novel multilingual MRC framework equipped with a Siamese Semantic Disentanglement Model (SSDM) to disassociate semantics from syntax in representations learned by multilingual pre-trained models.
- Score: 40.38719019711233
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multilingual pre-trained models are able to zero-shot transfer knowledge from
rich-resource to low-resource languages in machine reading comprehension (MRC).
However, inherent linguistic discrepancies in different languages could make
answer spans predicted by zero-shot transfer violate syntactic constraints of
the target language. In this paper, we propose a novel multilingual MRC
framework equipped with a Siamese Semantic Disentanglement Model (SSDM) to
disassociate semantics from syntax in representations learned by multilingual
pre-trained models. To explicitly transfer only semantic knowledge to the
target language, we propose two groups of losses tailored for semantic and
syntactic encoding and disentanglement. Experimental results on three
multilingual MRC datasets (i.e., XQuAD, MLQA, and TyDi QA) demonstrate the
effectiveness of our proposed approach over models based on mBERT and XLM-100.
Code is available at: https://github.com/wulinjuan/SSDM_MRC.
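The abstract describes two ingredients: a Siamese objective that pulls the semantic representations of a parallel sentence pair together, and a disentanglement objective that decorrelates the semantic and syntactic views of the same sentence. The sketch below is a minimal, illustrative take on such a loss pair, not the authors' actual SSDM: the projection matrices `W_sem`/`W_syn`, the squared-distance Siamese loss, and the outer-product orthogonality penalty are all assumed stand-ins for whatever the paper's model learns.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 16, 8  # hidden size and subspace size (illustrative values)

# Hypothetical projections splitting a hidden state into semantic and
# syntactic subspaces; in the real model these would be learned.
W_sem = rng.standard_normal((d, k)) / np.sqrt(d)
W_syn = rng.standard_normal((d, k)) / np.sqrt(d)

def disentangle_losses(h_src, h_tgt):
    """Illustrative loss pair for a Siamese disentanglement setup.

    h_src, h_tgt: sentence embeddings of a parallel pair, shape (d,).
    Returns (semantic alignment loss, orthogonality loss).
    """
    s_src, s_tgt = h_src @ W_sem, h_tgt @ W_sem  # semantic views
    y_src = h_src @ W_syn                        # syntactic view
    # Siamese term: parallel sentences should share semantics.
    sem_loss = np.sum((s_src - s_tgt) ** 2)
    # Disentanglement term: decorrelate semantic and syntactic views.
    ortho_loss = np.sum(np.outer(s_src, y_src) ** 2)
    return sem_loss, ortho_loss

h_a = rng.standard_normal(d)
h_b = h_a + 0.01 * rng.standard_normal(d)  # near-parallel pair
sem, ortho = disentangle_losses(h_a, h_b)
print(sem, ortho)
```

Minimizing the first term transfers only the shared (semantic) signal across languages, while the second keeps syntax out of that shared subspace, which is the intuition the abstract attributes to SSDM.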
Related papers
- Soft Language Clustering for Multilingual Model Pre-training [57.18058739931463]
We propose XLM-P, which contextually retrieves prompts as flexible guidance for encoding instances conditionally.
Our XLM-P enables (1) lightweight modeling of language-invariant and language-specific knowledge across languages, and (2) easy integration with other multilingual pre-training methods.
arXiv Detail & Related papers (2023-06-13T08:08:08Z)
- Parameter-Efficient Cross-lingual Transfer of Vision and Language Models via Translation-based Alignment [31.885608173448368]
Pre-trained vision and language models such as CLIP have witnessed remarkable success in connecting images and texts with a primary focus on English texts.
However, disparities in performance among languages have been observed due to uneven resource availability.
We propose a new parameter-efficient cross-lingual transfer learning framework that utilizes a translation-based alignment method to mitigate multilingual disparities.
arXiv Detail & Related papers (2023-05-02T14:09:02Z)
- A Simple and Effective Method to Improve Zero-Shot Cross-Lingual Transfer Learning [6.329304732560936]
Existing zero-shot cross-lingual transfer methods rely on parallel corpora or bilingual dictionaries.
We propose Embedding-Push, Attention-Pull, and Robust targets to transfer English embeddings to virtual multilingual embeddings without semantic loss.
arXiv Detail & Related papers (2022-10-18T15:36:53Z)
- Cross-Lingual Text Classification with Multilingual Distillation and Zero-Shot-Aware Training [21.934439663979663]
A multi-branch multilingual language model (MBLM) is built on multilingual pre-trained language models (MPLMs).
The method transfers knowledge from high-performance monolingual models via a teacher-student framework.
Results on two cross-lingual classification tasks show that, with only the task's supervised data used, our method improves both the supervised and zero-shot performance of MPLMs.
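The teacher-student transfer mentioned above is standardly realized as soft-label distillation: the student is trained to match the teacher's temperature-softened class distribution. This is a generic sketch of that objective, not the MBLM paper's exact loss; the temperature `T=2.0` and the KL direction are assumed defaults.

```python
import numpy as np

def softmax(z, T=1.0):
    """Numerically stable softmax with temperature T."""
    z = np.asarray(z, dtype=float) / T
    z -= z.max()
    e = np.exp(z)
    return e / e.sum()

def distill_loss(teacher_logits, student_logits, T=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    the usual objective in teacher-student distillation (illustrative)."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float(np.sum(p * (np.log(p) - np.log(q))))

t = [2.0, 0.5, -1.0]
print(distill_loss(t, t))                      # identical logits -> 0 loss
print(distill_loss(t, [0.0, 0.0, 0.0]) > 0)    # mismatch -> positive loss
```

A higher temperature flattens the teacher distribution, exposing the relative probabilities of non-target classes ("dark knowledge") to the student.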
arXiv Detail & Related papers (2022-02-28T09:51:32Z)
- Cross-lingual Machine Reading Comprehension with Language Branch Knowledge Distillation [105.41167108465085]
Cross-lingual Machine Reading Comprehension (CLMRC) remains a challenging problem due to the lack of large-scale datasets in low-resource languages.
We propose a novel augmentation approach named Language Branch Machine Reading (LBMRC)
LBMRC trains multiple machine reading comprehension (MRC) models, each proficient in an individual language.
We devise a multilingual distillation approach to amalgamate knowledge from multiple language branch models to a single model for all target languages.
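One simple way to amalgamate several language-branch teachers into a single student target is to average their predictive distributions; the sketch below illustrates that scheme. Uniform averaging is an assumption here, since the abstract does not say how LBMRC weights its branches.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax."""
    z = np.asarray(z, dtype=float)
    z -= z.max()
    e = np.exp(z)
    return e / e.sum()

def multi_teacher_target(branch_logits):
    """Build one soft target from several language-branch teachers by
    averaging their distributions (uniform weights, illustrative)."""
    probs = np.stack([softmax(l) for l in branch_logits])
    return probs.mean(axis=0)

# Three hypothetical branch models, all favoring answer index 0.
branches = [[3.0, 0.1, -2.0], [2.5, 0.4, -1.0], [3.2, -0.2, -2.5]]
target = multi_teacher_target(branches)
print(target)
```

The single student distilled against this target then serves all target languages, which is the practical payoff the abstract describes.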
arXiv Detail & Related papers (2020-10-27T13:12:17Z)
- How Phonotactics Affect Multilingual and Zero-shot ASR Performance [74.70048598292583]
A Transformer encoder-decoder model has been shown to leverage multilingual data well when trained on IPA transcriptions of the training languages.
We replace the encoder-decoder with a hybrid ASR system consisting of a separate acoustic model (AM) and language model (LM).
We show that the gain from modeling crosslingual phonotactics is limited, and that imposing too strong a phonotactic model can hurt zero-shot transfer.
arXiv Detail & Related papers (2020-10-22T23:07:24Z) - XCOPA: A Multilingual Dataset for Causal Commonsense Reasoning [68.57658225995966]
Cross-lingual Choice of Plausible Alternatives (XCOPA) is a typologically diverse multilingual dataset for causal commonsense reasoning in 11 languages.
We evaluate a range of state-of-the-art models on this novel dataset, revealing that the performance of current methods falls short compared to translation-based transfer.
arXiv Detail & Related papers (2020-05-01T12:22:33Z) - Improving Massively Multilingual Neural Machine Translation and
Zero-Shot Translation [81.7786241489002]
Massively multilingual models for neural machine translation (NMT) are theoretically attractive, but often underperform bilingual models and deliver poor zero-shot translations.
We argue that multilingual NMT requires stronger modeling capacity to support language pairs with varying typological characteristics.
We propose random online backtranslation to enforce the translation of unseen training language pairs.
arXiv Detail & Related papers (2020-04-24T17:21:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.