Towards Zero-Shot Multilingual Synthetic Question and Answer Generation
for Cross-Lingual Reading Comprehension
- URL: http://arxiv.org/abs/2010.12008v3
- Date: Fri, 28 May 2021 21:07:33 GMT
- Title: Towards Zero-Shot Multilingual Synthetic Question and Answer Generation
for Cross-Lingual Reading Comprehension
- Authors: Siamak Shakeri, Noah Constant, Mihir Sanjay Kale, Linting Xue
- Abstract summary: We propose a simple method to generate multilingual question and answer pairs on a large scale.
These synthetic samples can be used to improve the zero-shot performance of multilingual QA models on target languages.
- Score: 20.570539023748424
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a simple method to generate multilingual question and answer pairs
on a large scale through the use of a single generative model. These synthetic
samples can be used to improve the zero-shot performance of multilingual QA
models on target languages. Our proposed multi-task training of the generative
model only requires the labeled training samples in English, thus removing the
need for such samples in the target languages, making it applicable to far more
languages than those with labeled data. Human evaluations indicate the majority
of such samples are grammatically correct and sensible. Experimental results
show our proposed approach can achieve large gains on the XQuAD dataset,
reducing the gap between zero-shot and supervised performance of smaller QA
models on various languages.
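The abstract describes a single multilingual generator, multi-task trained only on English labeled data, that produces question-answer pairs from passages in target languages; those synthetic pairs then fine-tune smaller QA models. Below is a minimal sketch of the generation step, assuming an mT5-style seq2seq generator and an "answer: ... question: ..." output format; the checkpoint name, prompt, and output convention are illustrative assumptions, not the authors' released artifacts.

```python
# Sketch: sample synthetic QA pairs from unlabeled target-language passages
# with a single multilingual seq2seq generator (illustrative only; the prompt
# and output format are assumptions, not the paper's exact recipe).
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL_NAME = "google/mt5-small"  # placeholder checkpoint, assumed already fine-tuned for QA generation
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

def generate_qa_pairs(passages, num_samples=4, max_new_tokens=64):
    """Sample several candidate question-answer pairs per passage."""
    synthetic = []
    for passage in passages:
        inputs = tokenizer("generate question and answer: " + passage,
                           return_tensors="pt", truncation=True, max_length=512)
        outputs = model.generate(**inputs,
                                 do_sample=True, top_p=0.95,
                                 num_return_sequences=num_samples,
                                 max_new_tokens=max_new_tokens)
        for seq in outputs:
            text = tokenizer.decode(seq, skip_special_tokens=True)
            # Assumed decoding format: "answer: <span> question: <question text>"
            if "answer:" in text and "question:" in text:
                answer = text.split("question:")[0].replace("answer:", "").strip()
                question = text.split("question:")[1].strip()
                synthetic.append({"context": passage,
                                  "question": question,
                                  "answer": answer})
    return synthetic
```

The resulting synthetic examples would then serve as training data for a smaller extractive QA model in each target language, which is where the abstract reports the zero-shot gains.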
Related papers
- Scaling Laws for Multilingual Language Models [41.6318470003173]
A primary challenge in studying multilingual scaling is the difficulty of analyzing individual language performance due to cross-lingual transfer.
We introduce and validate a hypothesis that the test cross-entropy loss for each language family is determined solely by its own sampling ratio.
We derive a power-law relationship that links performance with dataset size, model size and sampling ratios.
arXiv Detail & Related papers (2024-10-15T20:29:38Z)
- Synergistic Approach for Simultaneous Optimization of Monolingual, Cross-lingual, and Multilingual Information Retrieval [5.446052898856584]
This paper proposes a novel hybrid batch training strategy to improve zero-shot retrieval performance across monolingual, cross-lingual, and multilingual settings.
The approach fine-tunes multilingual language models using a mix of monolingual and cross-lingual question-answer pair batches sampled based on dataset size.
arXiv Detail & Related papers (2024-08-20T04:30:26Z)
- Reuse Your Rewards: Reward Model Transfer for Zero-Shot Cross-Lingual Alignment [39.94156255629528]
We evaluate a simple approach for zero-shot cross-lingual alignment.
Cross-lingually aligned models are preferred by humans over unaligned models.
A different-language reward model sometimes yields better aligned models than a same-language reward model.
arXiv Detail & Related papers (2024-04-18T16:52:36Z)
- Multilingual Few-Shot Learning via Language Model Retrieval [18.465566186549072]
Transformer-based language models have achieved remarkable success in few-shot in-context learning.
We conduct a study of retrieving semantically similar few-shot samples and using them as the context.
We evaluate the proposed method on five natural language understanding datasets related to intent detection, question classification, sentiment analysis, and topic classification.
arXiv Detail & Related papers (2023-06-19T14:27:21Z)
- UniMax: Fairer and more Effective Language Sampling for Large-Scale Multilingual Pretraining [92.3702056505905]
We propose a new sampling method, UniMax, that delivers more uniform coverage of head languages while mitigating overfitting on tail languages.
We find that UniMax outperforms standard temperature-based sampling, and the benefits persist as scale increases.
arXiv Detail & Related papers (2023-04-18T17:45:50Z)
- Multilingual Generative Language Models for Zero-Shot Cross-Lingual Event Argument Extraction [80.61458287741131]
We present a study on leveraging multilingual pre-trained generative language models for zero-shot cross-lingual event argument extraction (EAE).
By formulating EAE as a language generation task, our method effectively encodes event structures and captures the dependencies between arguments.
Our proposed model finetunes multilingual pre-trained generative language models to generate sentences that fill in the language-agnostic template with arguments extracted from the input passage.
arXiv Detail & Related papers (2022-03-15T23:00:32Z)
- Language Models are Few-shot Multilingual Learners [66.11011385895195]
We evaluate the multilingual skills of the GPT and T5 models in conducting multi-class classification on non-English languages.
We show that, given a few English examples as context, pre-trained language models can predict not only English test samples but also non-English ones.
arXiv Detail & Related papers (2021-09-16T03:08:22Z)
- Nearest Neighbour Few-Shot Learning for Cross-lingual Classification [2.578242050187029]
We propose cross-lingual adaptation using a simple nearest-neighbour few-shot (15 samples) inference technique for classification tasks.
Our approach consistently improves traditional fine-tuning using only a handful of labeled samples in target locales.
arXiv Detail & Related papers (2021-09-06T03:18:23Z)
- Comparison of Interactive Knowledge Base Spelling Correction Models for Low-Resource Languages [81.90356787324481]
Spelling normalization for low resource languages is a challenging task because the patterns are hard to predict.
This work presents a comparison of a neural model and character language models trained with varying amounts of target-language data.
Our usage scenario is interactive correction with nearly zero amounts of training examples, improving models as more data is collected.
arXiv Detail & Related papers (2020-10-20T17:31:07Z)
- Mixed-Lingual Pre-training for Cross-lingual Summarization [54.4823498438831]
Cross-lingual Summarization aims at producing a summary in the target language for an article in the source language.
We propose a solution based on mixed-lingual pre-training that leverages both cross-lingual tasks like translation and monolingual tasks like masked language models.
Our model achieves an improvement of 2.82 (English to Chinese) and 1.15 (Chinese to English) ROUGE-1 scores over state-of-the-art results.
arXiv Detail & Related papers (2020-10-18T00:21:53Z)
- XCOPA: A Multilingual Dataset for Causal Commonsense Reasoning [68.57658225995966]
Cross-lingual Choice of Plausible Alternatives (XCOPA) is a typologically diverse multilingual dataset for causal commonsense reasoning in 11 languages.
We evaluate a range of state-of-the-art models on this novel dataset, revealing that the performance of current methods falls short compared to translation-based transfer.
arXiv Detail & Related papers (2020-05-01T12:22:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.